
nested - Spark SQL - accessing a nested structure Row(field1, field2=Row(..))

Reposted. Author: 行者123. Updated: 2023-12-04 20:38:33

I need help working with a nested structure in Spark SQL through the sql method. I created a DataFrame on top of an existing RDD (dataRDD), with the following structure:

from pyspark.sql.types import (StructType, StructField, LongType,
                               StringType, IntegerType)

schema = StructType([
    StructField("m", LongType()),
    StructField("field2", StructType([
        StructField("st", StringType()),
        StructField("end", StringType()),
        StructField("dr", IntegerType()),
    ])),
])

printSchema() returns this:
root
 |-- m: long (nullable = true)
 |-- field2: struct (nullable = true)
 |    |-- st: string (nullable = true)
 |    |-- end: string (nullable = true)
 |    |-- dr: integer (nullable = true)

Creating the DataFrame from the data RDD and applying the schema works fine.
df = sqlContext.createDataFrame(dataRDD, schema)
df.registerTempTable("logs")

But retrieving the data does not work:
res = sqlContext.sql("SELECT m, field2.st FROM logs") # <- This fails 

...org.apache.spark.sql.AnalysisException: cannot resolve 'field.st' given input columns msisdn, field2;

res = sqlContext.sql("SELECT m, field2[0] FROM logs") # <- Also fails
...org.apache.spark.sql.AnalysisException: unresolved operator 'Project [field2#1[0] AS c0#2];

res = sqlContext.sql("SELECT m, st FROM logs") # <- Also not working
...cannot resolve 'st' given input columns m, field2;

So how do I access the nested structure in SQL syntax?
Thanks

Best answer

Something else is going on in your test, because field2.st is the correct syntax:

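import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Case class used for the nested struct values.
case class field2(st: String, end: String, dr: Int)

val schema = StructType(
  Array(
    StructField("m", LongType),
    StructField("field2", StructType(Array(
      StructField("st", StringType),
      StructField("end", StringType),
      StructField("dr", IntegerType)
    )))
  )
)

val df2 = sqlContext.createDataFrame(
  // Long literals (1L, 2L) match the LongType declared for column m.
  sc.parallelize(Array(Row(1L, field2("this", "is", 1234)), Row(2L, field2("a", "test", 5678)))),
  schema
)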

/* df2.printSchema
root
 |-- m: long (nullable = true)
 |-- field2: struct (nullable = true)
 |    |-- st: string (nullable = true)
 |    |-- end: string (nullable = true)
 |    |-- dr: integer (nullable = true)
*/

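// Register the DataFrame so the SQL query can refer to it by the name "df2".
df2.registerTempTable("df2")

val results = sqlContext.sql("select m, field2.st from df2")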

/* results.show
m st
1 this
2 a
*/
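
The question uses PySpark while the answer is written in Scala, so the same check might look roughly like this in Python. This is a minimal sketch under assumptions that are not in the original: it expects a Spark 2.x or later SparkSession passed in as `session`, it replaces the RDD with two literal rows copied from the answer, and it writes the schema as a DDL string rather than as StructType objects. The names `ROWS`, `SCHEMA_DDL`, `QUERY`, and `query_nested` are hypothetical.

```python
# Rows copied from the answer; each nested struct is given as a plain tuple.
ROWS = [
    (1, ("this", "is", 1234)),
    (2, ("a", "test", 5678)),
]

# Same layout as the question's schema, written as a DDL string (assumption).
# `end` is backtick-quoted because END is also a SQL keyword.
SCHEMA_DDL = "m long, field2 struct<st:string,`end`:string,dr:int>"

# Dot notation reaches into the struct column.
QUERY = "SELECT m, field2.st FROM logs"


def query_nested(session):
    """Build the DataFrame, expose it as the view "logs", and run QUERY."""
    df = session.createDataFrame(ROWS, schema=SCHEMA_DDL)
    df.createOrReplaceTempView("logs")
    return session.sql(QUERY)
```

With a live session, `query_nested(spark).show()` should list the same m and st pairs as the result shown in the answer.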

Look back at your error message: cannot resolve 'field.st' given input columns msisdn, field2. Note field versus field2. Check your code again: the names do not line up.
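
To see why field2.st resolves while field.st and a bare st do not, a toy resolver over the schema from the question can help. This is an illustration only and not Spark's analyzer: the nested-dict `SCHEMA` and the `resolve` helper are hypothetical, and the error text is merely modeled on the message quoted above.

```python
# Toy stand-in for the question's schema: struct columns are nested dicts.
SCHEMA = {
    "m": "long",
    "field2": {"st": "string", "end": "string", "dr": "integer"},
}


def resolve(schema, path):
    """Walk a dotted column path through a nested-dict schema.

    Returns the type at the end of the path, or raises KeyError naming the
    first segment that is missing and the names available at that level.
    """
    node = schema
    for segment in path.split("."):
        if not isinstance(node, dict) or segment not in node:
            available = sorted(node) if isinstance(node, dict) else []
            raise KeyError(
                "cannot resolve '%s': no '%s' among %s" % (path, segment, available)
            )
        node = node[segment]
    return node
```

Here `resolve(SCHEMA, "field2.st")` returns the string type, while `resolve(SCHEMA, "field.st")` and `resolve(SCHEMA, "st")` both raise, because neither field nor st exists among the top-level columns m and field2.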

Regarding nested - Spark SQL - accessing a nested structure Row(field1, field2=Row(..)), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30750616/
