
[PySpark] Splitting a list held in one row into multiple rows with explode

Popularity: 58   Published: 2024-01-26 09:40:33.0

 

Collected examples: Python pyspark.sql.functions.explode() Examples

https://www.programcreek.com/python/example/98237/pyspark.sql.functions.explode

 

To split the contents of a field and generate one row per resulting piece, use the explode function.
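To make the row-multiplying behaviour concrete, here is a plain-Python model of "split, then explode". It is only a sketch of the semantics and not Spark code: the helper name explode_rows and the sample rows are hypothetical, and rows are modelled as dictionaries.

```python
def explode_rows(rows, column, new_column, sep=" "):
    """Plain-Python model of split + explode: one output row per piece."""
    out = []
    for row in rows:
        value = row.get(column)
        if value is None:
            # Like explode in Spark, a null input produces no output rows.
            continue
        for piece in value.split(sep):
            new_row = dict(row)          # copy the other fields unchanged
            new_row[new_column] = piece  # attach the single piece
            out.append(new_row)
    return out


# Hypothetical sample data, for illustration only.
rows = [{"id": 1, "c3": "a b c"}, {"id": 2, "c3": "d"}, {"id": 3, "c3": None}]
exploded = explode_rows(rows, "c3", "c3_")
for r in exploded:
    print(r)
```

Each input row is repeated once per piece, and the row whose field is null disappears from the result.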

Eg:

from pyspark.sql.functions import explode, split
df.withColumn("c3_", explode(split(df["c3"], " "))).show(truncate=False)

https://blog.csdn.net/anshuai_aw1/article/details/87881079#4.4%C2%A0%E5%88%86%E5%89%B2%EF%BC%9A%E8%A1%8C%E8%BD%AC%E5%88%97

 

Eg:

from pyspark.sql import Row
from pyspark.sql.functions import explode
eDF = sqlContext.createDataFrame([Row(a=1, intlist=[1,2,3], mapfield={"a": "b"})])
eDF.select(explode(eDF.intlist).alias("anInt")).collect()
Out: [Row(anInt=1), Row(anInt=2), Row(anInt=3)]

 

Source: <http://spark.apache.org/docs/1.6.0/api/python/pyspark.sql.html?highlight=except#pyspark.sql.functions.exp>

 

Eg:

from pyspark.sql import Row
from pyspark.sql.functions import explode
eDF = spark.createDataFrame([Row(a=1, intlist=[1,2,3], mapfield={"a": "b"})])
eDF.show()
eDF.select(explode(eDF.intlist).alias("anInt")).show()

 

Eg:

df2 = df1.select(explode(df1.line).alias("line_new_name"))

 
