scala - Specified partition columns do not match the partition columns of the table. Please use () as the partition columns

Reposted. Author: 行者123. Updated: 2023-12-02 19:24:22

Here I am trying to save a dataframe into a partitioned Hive table, and I get this silly exception. I have looked over it many times but cannot find the fault.

org.apache.spark.sql.AnalysisException: Specified partition columns (timestamp value) do not match the partition columns of the table. Please use () as the partition columns.;



This is the script used to create the external table:
CREATE EXTERNAL TABLE IF NOT EXISTS events2 (
action string
,device_os_ver string
,device_type string
,event_name string
,item_name string
,lat DOUBLE
,lon DOUBLE
,memberid BIGINT
,productupccd BIGINT
,tenantid BIGINT
) partitioned BY (timestamp_val DATE)
row format serde 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
stored AS inputformat 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
outputformat 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
location 'maprfs:///location/of/events2'
tblproperties ('serialization.null.format' = '');

This is the result of describing the format of the table "events2":
hive> describe  formatted events2;
OK
# col_name data_type comment

action string
device_os_ver string
device_type string
event_name string
item_name string
lat double
lon double
memberid bigint
productupccd bigint
tenantid bigint

# Partition Information
# col_name data_type comment

timestamp_val date

# Detailed Table Information
Database: default
CreateTime: Wed Jan 11 16:58:55 IST 2017
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: maprfs:/location/of/events2
Table Type: EXTERNAL_TABLE
Table Parameters:
EXTERNAL TRUE
serialization.null.format
transient_lastDdlTime 1484134135

# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
serialization.format 1
Time taken: 0.078 seconds, Fetched: 42 row(s)

And this is the line of code that partitions the data and stores it into the table:
val tablepath = Map("path" -> "maprfs:///location/of/events2")

AppendDF.write.format("parquet")
  .partitionBy("Timestamp_val")
  .options(tablepath)
  .mode(org.apache.spark.sql.SaveMode.Append)
  .saveAsTable("events2")

While running the application, I get the following:

Specified partition columns (timestamp_val) do not match the partition columns of the table. Please use () as the partition columns.



I might be making an obvious mistake here; any help is much appreciated, with an upvote :)

Best Answer

Please print the schema of the df:

AppendDF.printSchema()

Make sure there isn't a type mismatch.
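Beyond the type check, note that the Hive metastore stores column names in lower case, while the write call above passes "Timestamp_val" with a capital T to partitionBy. A minimal sketch of how the name and type could be aligned with the table's timestamp_val DATE partition column before appending (AppendDF and an active Spark session are assumed to exist; this is one common fix, not necessarily the asker's exact issue):

```scala
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DateType

// Inspect the actual column names and types first.
AppendDF.printSchema()

// Rename the partition column to the table's lower-case name and
// cast it to DATE so it matches the Hive definition exactly.
val fixedDF = AppendDF
  .withColumnRenamed("Timestamp_val", "timestamp_val")
  .withColumn("timestamp_val", col("timestamp_val").cast(DateType))

fixedDF.write.format("parquet")
  .partitionBy("timestamp_val") // must match the table's partition column
  .options(Map("path" -> "maprfs:///location/of/events2"))
  .mode(SaveMode.Append)
  .saveAsTable("events2")
```

If the names and types already line up, the "Please use () as the partition columns" wording can also mean Spark resolved the target as an unpartitioned table, which is worth checking with describe formatted as shown above.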

Regarding "scala - Specified partition columns do not match the partition columns of the table. Please use () as the partition columns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41605897/
