
azure - Databricks - Error writing to Azure Synapse


I am trying to write data to an Azure Synapse table that has an identity column, using the code below.

Code on Databricks

def get_jdbc_connection(host, sqlDatabase, user, password):
    jdbcHostname = "{}.database.windows.net".format(host)
    jdbc_url = "jdbc:sqlserver://{}:1433;database={};user={}@{};password={};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;".format(jdbcHostname, sqlDatabase, user, host, password)
    url = "jdbc:sqlserver://{}:1433;database={};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;".format(jdbcHostname, sqlDatabase)
    return (jdbc_url, url)


def write_adw(spark, df_target_adw, jdbc_url, table, tempDir, option_mode, pre_Actions):
    df_target_adw.write.format("com.databricks.spark.sqldw") \
        .option("url", jdbc_url) \
        .option("useAzureMSI", "true") \
        .option("preActions", pre_Actions) \
        .option("dbTable", table) \
        .option("tempDir", tempDir) \
        .mode(option_mode) \
        .save()

dftraffic = spark.sql('SELECT distinct SourceName\
,1 AS IsActiveRow \
,"Pipe-123" as pipelineId \
,current_timestamp as ADFCreatedDateTime \
,current_timestamp as ADFModifiedDateTime \
from deltaTable')

#write to ADW
(jdbc_url, url_adw) = get_jdbc_connection(host, sqlDatawarehouse,user, password)
target_table = 'TargetTable_name'
option_mode= "append"
pre_Actions= " SELECT GETDATE()"
write_adw(spark, dftraffic, jdbc_url, target_table, tempDir, option_mode, pre_Actions )

Schema of the target table in ADW

Column name           Data type
SourceSID             INT IDENTITY(1,1) NOT NULL
SourceName            VARCHAR(20) NOT NULL
IsRowActive           BIT NOT NULL
PipelineId            VARCHAR(20) NOT NULL
ADFCreatedDateTime    DATETIME NOT NULL
ADFModifiedDateTime   DATETIME NOT NULL

Configuration details on Databricks

Databricks runtime 7.4 (includes Apache Spark 3.0.1, Scala 2.12)

Error message

Py4JJavaError: An error occurred while calling o457.save.
: com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector.
Underlying SQLException(s):
- com.microsoft.sqlserver.jdbc.SQLServerException: An explicit value for the identity column in table can only be specified when a column list is used and IDENTITY_INSERT is ON

The code ran fine on Databricks runtime 6.4 (Spark 2.4.5); the error only started appearing when I upgraded the Databricks runtime. How can I get this to work?
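One way to narrow this down is to compare the schema of the DataFrame with the columns of the target table. Below is a minimal sketch, reusing jdbc_url and the target table name from the code above; the plain Spark JDBC reader is used here only to pull the table's column metadata and is not part of the original question.

# Schema of the DataFrame being written
dftraffic.printSchema()

# Columns of the target table, fetched without reading any rows
# (assumes the SQL Server JDBC driver is available on the cluster, as it is on Databricks)
target_cols = (spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("query", "SELECT TOP 0 * FROM TargetTable_name")
    .load())
target_cols.printSchema()

# The table has an IDENTITY column (SourceSID) and a column named IsRowActive,
# while the DataFrame produces a column named IsActiveRow.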

Best answer

Don't you have an extra column, "1 AS IsActiveRow"? I don't see that column in the schema.

dftraffic = spark.sql('SELECT distinct SourceName\
,1 AS IsActiveRow \
,"Pipe-123" as pipelineId \
,current_timestamp as ADFCreatedDateTime \
,current_timestamp as ADFModifiedDateTime \
from deltaTable')
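If that mismatch is the cause, one possible workaround (a sketch based on the schema shown above, not part of the original answer) is to rename and reorder the DataFrame columns so they match the non-identity columns of the target table exactly, leaving SourceSID for Synapse to generate:

# Sketch: align the DataFrame with the target table's non-identity columns.
# Column names are taken from the schema shown above; the identity column
# SourceSID is deliberately left out so Synapse can populate it.
df_aligned = (dftraffic
    .withColumnRenamed("IsActiveRow", "IsRowActive")
    .select("SourceName", "IsRowActive", "pipelineId",
            "ADFCreatedDateTime", "ADFModifiedDateTime"))

write_adw(spark, df_aligned, jdbc_url, target_table, tempDir, option_mode, pre_Actions)

If the column types do not already line up (for example, IsRowActive mapping to BIT), an explicit cast with F.col(...).cast(...) before the write may also be needed.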

Regarding azure - Databricks - Error writing to Azure Synapse, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67205004/
