
azure-databricks - Databricks version 7.0 is not behaving like version 6.3: class java.lang.Long cannot be cast to class java.lang.Integer


I have a working notebook on Azure Databricks version 6.3 (Spark 2.4.4).

This notebook ingests data into Azure Synapse Analytics using its connector.

When I upgraded the notebook to version 7.0 (Spark 3.0.0), the process started failing with the following error:

com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: ClassCastException: class java.lang.Long cannot be cast to class java.lang.Integer (java.lang.Long and java.lang.Integer are in module java.base of loader 'bootstrap') [ErrorCode = 106000] [SQLState = S0001]

Here is the table schema in Synapse Analytics:

CREATE TABLE [dbo].[IncrementalDestination]
(
[Id] [int] NOT NULL,
[VarChar] [varchar](1000) NULL,
[Char] [char](1000) NULL,
[Text] [varchar](1000) NULL,
[NVarChar] [nvarchar](1000) NULL,
[NChar] [nchar](1000) NULL,
[NText] [nvarchar](1000) NULL,
[Date] [date] NULL,
[Datetime] [datetime] NULL,
[Datetime2] [datetime2](7) NULL,
[Smalldatetime] [smalldatetime] NULL,
[Bigint] [bigint] NULL,
[Bit] [bit] NULL,
[Decimal] [decimal](18, 0) NULL,
[Int] [int] NULL,
[Money] [money] NULL,
[Numeric] [numeric](18, 0) NULL,
[Smallint] [smallint] NULL,
[Smallmoney] [smallmoney] NULL,
[Tinyint] [tinyint] NULL,
[Float] [float] NULL,
[Real] [real] NULL,
[Column With Space] [varchar](1000) NULL,
[Column_ç_$pecial_char] [varchar](1000) NULL,
[InsertionDateUTC] [datetime] NOT NULL,
[De_LastUpdated] [datetime2](3) NOT NULL
)
WITH
(
DISTRIBUTION = ROUND_ROBIN,
CLUSTERED COLUMNSTORE INDEX
)
GO

Here is the schema Databricks generated after reading a bunch of Parquet files from Azure Blob Storage:

root
|-- Id: long (nullable = true)
|-- VarChar: string (nullable = true)
|-- Char: string (nullable = true)
|-- Text: string (nullable = true)
|-- NVarChar: string (nullable = true)
|-- NChar: string (nullable = true)
|-- NText: string (nullable = true)
|-- Date: timestamp (nullable = true)
|-- Datetime: timestamp (nullable = true)
|-- Datetime2: timestamp (nullable = true)
|-- Smalldatetime: timestamp (nullable = true)
|-- Bigint: long (nullable = true)
|-- Bit: boolean (nullable = true)
|-- Decimal: long (nullable = true)
|-- Int: long (nullable = true)
|-- Money: double (nullable = true)
|-- Numeric: long (nullable = true)
|-- Smallint: long (nullable = true)
|-- Smallmoney: double (nullable = true)
|-- Tinyint: long (nullable = true)
|-- Float: double (nullable = true)
|-- Real: double (nullable = true)
|-- Column_With_Space: string (nullable = true)
|-- Column_ç_$pecial_char: string (nullable = true)
|-- InsertionDateUTC: timestamp (nullable = true)
|-- De_LastUpdated: timestamp (nullable = false)

I can see

Int: long (nullable = true)

But what can I do?

Shouldn't this conversion be natural and easy to perform?

I guess something is wrong with these new features =]

Best Answer

I believe this is caused by the following change, as described in the migration guide:

In Spark 3.0, when inserting a value into a table column with a different data type, the type coercion is performed as per ANSI SQL standard. Certain unreasonable type conversions such as converting string to int and double to boolean are disallowed. A runtime exception is thrown if the value is out-of-range for the data type of the column. In Spark version 2.4 and below, type conversions during table insertion are allowed as long as they are valid Cast. When inserting an out-of-range value to an integral field, the low-order bits of the value is inserted (the same as Java/Scala numeric type casting). For example, if 257 is inserted to a field of byte type, the result is 1. The behavior is controlled by the option spark.sql.storeAssignmentPolicy, with a default value as “ANSI”. Setting the option as “Legacy” restores the previous behavior.
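To see the legacy "low-order bits" behavior the guide describes, here is a minimal PySpark sketch (it assumes the spark session that a Databricks notebook predefines):

# Assumes `spark` (a SparkSession), as predefined in a Databricks notebook.
# Under Spark's default (non-ANSI) CAST semantics, an out-of-range value
# keeps only its low-order bits, like a Java/Scala numeric cast: 257 -> 1.
spark.sql("SELECT CAST(257 AS TINYINT) AS wrapped").show()
# +-------+
# |wrapped|
# +-------+
# |      1|
# +-------+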

So you can try setting spark.sql.storeAssignmentPolicy to Legacy and rerunning the code.
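For example, in a notebook cell (a minimal sketch; df is a hypothetical stand-in for the DataFrame being written to Synapse):

from pyspark.sql.functions import col

# Option 1: restore the Spark 2.4 insertion behavior (assumes the `spark`
# session predefined in a Databricks notebook).
spark.conf.set("spark.sql.storeAssignmentPolicy", "LEGACY")

# Option 2 (df is hypothetical): cast the long columns down to the types
# the Synapse table declares before writing.
df_casted = (df
    .withColumn("Id", col("Id").cast("int"))
    .withColumn("Int", col("Int").cast("int"))
    .withColumn("Smallint", col("Smallint").cast("smallint"))
    .withColumn("Tinyint", col("Tinyint").cast("tinyint")))

Casting explicitly keeps the ANSI out-of-range checks enabled while still matching the Synapse column types, so it may be the safer long-term fix.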

Regarding azure-databricks - Databricks version 7.0 not behaving like version 6.3: class java.lang.Long cannot be cast to class java.lang.Integer, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62492265/
