
azure - Databricks: Error when using Auto Loader to read JSON files from Azure ADLS Gen2

Reposted. Author: 行者123. Updated: 2023-12-03 03:31:29

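I ran into a problem when trying to use Auto Loader to read JSON files from Azure ADLS Gen2. The problem occurs only for specific files. I have checked those files: they are intact and not corrupted.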

The error is as follows:

Caused by: java.lang.IllegalArgumentException: ***requirement failed: Literal must have a corresponding value to string, but class Integer found.***
at scala.Predef$.require(Predef.scala:281)
at at ***com.databricks.sql.io.FileReadException: Error while reading file /mnt/Source/kafka/customer_raw/filtered_data/year=2022/month=11/day=9/hour=15/part-00000-31413bcf-0a8f-480f-8d45-6970f4c4c9f7.c000.json.***
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.logFileNameAndThrow(FileScanRDD.scala:598)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:422)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(null:-1)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
java.lang.IllegalArgumentException: requirement failed: Literal must have a corresponding value to string, but class Integer found.
at scala.Predef$.require(Predef.scala:281)
at org.apache.spark.sql.catalyst.expressions.Literal$.validateLiteralValue(literals.scala:274)
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.sat java.lang.Thread.run(Thread.java:750)

I am using a Delta Live Tables pipeline. Here is the code:

import dlt

# tablename, schemapath, schemalocation and sourcelocation are assumed to be defined elsewhere.
@dlt.table(
    name=tablename,
    comment="Create Bronze Table",
    table_properties={"quality": "bronze"},
)
def Bronze_Table_Create():
    # Parentheses let the chained reader calls span several lines after `return`.
    return (
        spark.readStream
        .schema(schemapath)
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", schemalocation)
        .option("cloudFiles.inferColumnTypes", "false")
        .option("cloudFiles.schemaEvolutionMode", "rescue")
        .load(sourcelocation)
    )

Best answer

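I have resolved this issue. The problem was that our schema file mistakenly contained duplicate columns, and that is what triggered the error. The error message itself is completely misleading, though, which is why the cause was so hard to track down.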
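Since the root cause was a duplicated column in the schema file, a quick check for repeated field names before starting the stream may save some debugging time. The following is only a minimal sketch, not the answerer's actual method. It assumes the schema is available as a dictionary in Spark's JSON schema layout (a "struct" with a "fields" list); the answer does not say which format the schema file uses. The function name `find_duplicate_fields` and the sample schema are hypothetical. Names are compared case-insensitively because Spark resolves column names case-insensitively by default.

```python
from collections import Counter


def find_duplicate_fields(schema, path=""):
    """Return dotted paths of field names that occur more than once in a struct.

    `schema` is assumed to be a dict in Spark's JSON schema layout:
    {"type": "struct", "fields": [{"name": ..., "type": ...}, ...]}.
    """
    duplicates = []
    fields = schema.get("fields", [])

    # Compare names case-insensitively, matching Spark's default behaviour.
    counts = Counter(field["name"].lower() for field in fields)
    for name, occurrences in counts.items():
        if occurrences > 1:
            duplicates.append(f"{path}{name}")

    # Recurse into nested structs, including structs wrapped in arrays.
    for field in fields:
        field_type = field["type"]
        while isinstance(field_type, dict) and field_type.get("type") == "array":
            field_type = field_type["elementType"]
        if isinstance(field_type, dict) and field_type.get("type") == "struct":
            nested_path = f"{path}{field['name']}."
            duplicates.extend(find_duplicate_fields(field_type, nested_path))

    return duplicates


# Hypothetical schema with the same column declared twice under different casing.
example_schema = {
    "type": "struct",
    "fields": [
        {"name": "customer_id", "type": "string", "nullable": True, "metadata": {}},
        {"name": "amount", "type": "integer", "nullable": True, "metadata": {}},
        {"name": "Customer_ID", "type": "integer", "nullable": True, "metadata": {}},
    ],
}

duplicates = find_duplicate_fields(example_schema)
print(duplicates)
```

If the returned list is non-empty, the schema file probably needs to be fixed before it is passed to `.schema(...)`.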

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/74650227/
