gpt4 book ai didi

azure - ADF V2 - 基于表列参数化数据复制管道

转载 作者:行者123 更新时间:2023-12-03 04:16:43 25 4
gpt4 key购买 nike

通过门户使用 Azure 数据工厂 V2

https://adf.azure.com

我创建了一个管道,用于将多个表中的数据从一个 Azure SQL 数据库增量复制到另一个 Azure SQL 数据库。

为了创建它,我根据我的需要调整了以下示例: Incrementally load data from multiple tables

以下是与创建的管道相关的json文件:

{
"name": "IncrementalCopyPipeline",
"properties": {
"activities": [
{
"name": "IterateSQLTables",
"type": "ForEach",
"typeProperties": {
"items": {
"value": "@pipeline().parameters.tableList",
"type": "Expression"
},
"activities": [
{
"name": "LookupOldWaterMarkActivity",
"type": "Lookup",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": {
"value": "select * \nfrom watermarktable \nwhere TableName = '@{item().TABLE_NAME}'",
"type": "Expression"
}
},
"dataset": {
"referenceName": "WatermarkDataset",
"type": "DatasetReference"
}
}
},
{
"name": "LookupNewWaterMarkActivity",
"type": "Lookup",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": {
"value": "select MAX(@{item().WaterMark_Column}) as NewWatermarkvalue \nfrom @{item().TABLE_NAME}",
"type": "Expression"
}
},
"dataset": {
"referenceName": "SourceDataset",
"type": "DatasetReference"
}
}
},
{
"name": "IncrementalCopyActivity",
"type": "Copy",
"dependsOn": [
{
"activity": "LookupNewWaterMarkActivity",
"dependencyConditions": [
"Succeeded"
]
},
{
"activity": "LookupOldWaterMarkActivity",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": {
"value": "select * from @{item().TABLE_NAME} \nwhere @{item().WaterMark_Column} > '@{activity('LookupOldWaterMarkActivity').output.firstRow.WatermarkValue}' and @{item().WaterMark_Column} <= '@{activity('LookupNewWaterMarkActivity').output.firstRow.NewWatermarkvalue}'",
"type": "Expression"
}
},
"sink": {
"type": "SqlSink",
"writeBatchSize": 10000,
"sqlWriterStoredProcedureName": {
"value": "@{item().StoredProcedureNameForMergeOperation}",
"type": "Expression"
},
"sqlWriterTableType": {
"value": "@{item().TableType}",
"type": "Expression"
}
},
"enableStaging": false,
"dataIntegrationUnits": 0
},
"inputs": [
{
"referenceName": "SourceDataset",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "SinkDataset",
"type": "DatasetReference",
"parameters": {
"SinkTableName": "@{item().TABLE_NAME}"
}
}
]
},
{
"name": "StoredProceduretoWriteWatermarkActivity",
"type": "SqlServerStoredProcedure",
"dependsOn": [
{
"activity": "IncrementalCopyActivity",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"storedProcedureName": "[dbo].[sp_write_watermark]",
"storedProcedureParameters": {
"LastModifiedtime": {
"value": {
"value": "@{activity('LookupNewWaterMarkActivity').output.firstRow.NewWatermarkvalue}",
"type": "Expression"
},
"type": "DateTime"
},
"TableName": {
"value": {
"value": "@{activity('LookupOldWaterMarkActivity').output.firstRow.TableName}",
"type": "Expression"
},
"type": "String"
}
}
},
"linkedServiceName": {
"referenceName": "SqlServerLinkedService_dest",
"type": "LinkedServiceReference"
}
}
]
}
}
],
"parameters": {
"tableList": {
"type": "Object",
"defaultValue": [
{
"TABLE_NAME": "customer_table",
"WaterMark_Column": "LastModifytime",
"TableType": "DataTypeforCustomerTable",
"StoredProcedureNameForMergeOperation": "sp_upsert_customer_table"
},
{
"TABLE_NAME": "project_table",
"WaterMark_Column": "Creationtime",
"TableType": "DataTypeforProjectTable",
"StoredProcedureNameForMergeOperation": "sp_upsert_project_table"
}
]
}
}
}
}

在我的表中,我有一列区分不同的公司,因此我想向此管道添加另一个参数。我有一个这样的表:

NAME    LASTMODIFY                 COMPANY
John 2015-01-01 00:00:00.000 1
Mike 2016-02-02 01:23:00.000 2
Andy 2017-03-04 05:16:00.000 3
Annie 2018-09-08 00:00:00.000 1

有人知道如何将参数插入管道中以指定复制哪家公司以及不复制哪家公司?

有什么建议吗?先谢谢大家了!

最佳答案

不太清楚你在问什么,所以如果我没有捕获重点,我很抱歉,但是:

复制允许您使用存储过程来潜在地解决您的问题。看看这个例子:https://learn.microsoft.com/en-us/azure/data-factory/connector-sql-server#invoking-stored-procedure-for-sql-sink

它使用存储过程来合并,根据 JOIN 匹配执行 UPDATE 或 INSERT。它还允许传递参数。

因此,如果您尝试根据参数仅复制某些情况,则 MERGE 连接可能会有所帮助。

关于azure - ADF V2 - 基于表列参数化数据复制管道,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/52382706/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com