作者热门文章
- html - 出于某种原因,IE8 对我的 Sass 文件中继承的 html5 CSS 不友好?
- JMeter 在响应断言中使用 span 标签的问题
- html - 在 :hover and :active? 上具有不同效果的 CSS 动画
- html - 相对于居中的 html 内容固定的 CSS 重复背景?
我有一个包含 N 个字段的数据框,如下所述。列数和值的长度会有所不同。
输入表:
+--------------+-----------+-----------------------+
|Date |Amount |Status |
+--------------+-----------+-----------------------+
|2019,2018,2017|100,200,300|IN,PRE,POST |
|2018 |73 |IN |
|2018,2017 |56,89 |IN,PRE |
+--------------+-----------+-----------------------+
+-------------+------+---------+
|Date |Amount|Status| Sequence|
+------+------+------+---------+
|2019 |100 |IN | 1 |
|2018 |200 |PRE | 2 |
|2017 |300 |POST | 3 |
|2018 |73 |IN | 1 |
|2018 |56 |IN | 1 |
|2017 |89 |PRE | 2 |
+-------------+------+---------+
var df = dataRefined.withColumn("TOT_OVRDUE_TYPE", explode(split($"TOT_OVRDUE_TYPE", "\\"))).toDF
var df1 = df.withColumn("TOT_OD_TYPE_AMT", explode(split($"TOT_OD_TYPE_AMT", "\\"))).show
最佳答案
您可以通过使用 来实现这一点。数据框内置函数 arrays_zip
, split
, posexplode
Explanation:
scala>val df=Seq((("2019,2018,2017"),("100,200,300"),("IN,PRE,POST")),(("2018"),("73"),("IN")),(("2018,2017"),("56,89"),("IN,PRE"))).toDF("date","amount","status")
scala>:paste
df.selectExpr("""posexplode(
arrays_zip(
split(date,","), //split date string with ',' to create array
split(amount,","),
split(status,","))) //zip arrays
as (p,colum) //pos explode on zip arrays will give position and column value
""")
.selectExpr("colum.`0` as Date", //get 0 column as date
"colum.`1` as Amount",
"colum.`2` as Status",
"p+1 as Sequence") //add 1 to the position value
.show()
Result:
+----+------+------+--------+
|Date|Amount|Status|Sequence|
+----+------+------+--------+
|2019| 100| IN| 1|
|2018| 200| PRE| 2|
|2017| 300| POST| 3|
|2018| 73| IN| 1|
|2018| 56| IN| 1|
|2017| 89| PRE| 2|
+----+------+------+--------+
关于scala - 如何将逗号分隔的多列拆分为多行?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/56806926/
我是一名优秀的程序员,十分优秀!