azure - Error ingesting data from an Azure append blob into a Kusto database using Azure Data Factory

Reposted. Author: 行者123. Updated: 2023-12-03 06:17:58

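I have an Azure append blob (sharing.json) whose content type is application/json. I am trying to ingest it into a Kusto database using Azure Data Factory (ADF), but the ingestion always fails. I get the following error in the ADF output: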

"errors": [
{
"Code": 23302,
"Message": "ErrorCode=KustoWriteFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Write to Kusto failed with following error: 'An error occurred for source: 'DataReader'. Error: '''.,Source=Microsoft.DataTransfer.Runtime.KustoConnector,''Type=Kusto.Ingest.Exceptions.IngestClientException,Message=An error occurred for source: 'DataReader'. Error: '',Source=Kusto.Ingest,'",
"EventType": 0,
"Category": 5,
"Data": {},
"MsgId": null,
"ExceptionType": null,
"Source": null,
"StackTrace": null,
"InnerEventInfos": []
}
]
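
The nested Message string is hard to read because the outer and inner exceptions are packed into one line. As a reading aid, here is a minimal Python sketch that pulls out the error code and the exception chain with regular expressions. The function name `summarize_adf_error` and the returned keys are illustrative, not part of any ADF or Kusto API.

```python
import re

# The "Message" value from the ADF output above, split across lines for readability.
MESSAGE = (
    "ErrorCode=KustoWriteFailed,'Type=Microsoft.DataTransfer.Common.Shared."
    "HybridDeliveryException,Message=Write to Kusto failed with following error: "
    "'An error occurred for source: 'DataReader'. Error: '''.,"
    "Source=Microsoft.DataTransfer.Runtime.KustoConnector,"
    "''Type=Kusto.Ingest.Exceptions.IngestClientException,"
    "Message=An error occurred for source: 'DataReader'. Error: '',"
    "Source=Kusto.Ingest,'"
)


def summarize_adf_error(message: str) -> dict:
    """Extract the error code and the exception chain from an ADF error message.

    Hypothetical helper for illustration only.
    """
    code = re.search(r"ErrorCode=(\w+)", message)
    return {
        "error_code": code.group(1) if code else None,
        # Outer exception first, inner exception last.
        "exception_types": re.findall(r"Type=([\w.]+)", message),
        "sources": re.findall(r"Source=([\w.]+)", message),
    }


summary = summarize_adf_error(MESSAGE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The summary shows that the failure is raised by Kusto.Ingest and wrapped by the ADF Kusto connector, and that the inner error text is empty, so the message alone does not identify the cause.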

I tried getting help from ChatGPT and other online resources, but have had no success so far.

Here is my ADF activity configuration:

{
"name": "CopyPipeline_k0h",
"properties": {
"activities": [
{
"name": "Copy_k0h",
"type": "Copy",
"dependsOn": [],
"policy": {
"timeout": "0.12:00:00",
"retry": 3,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [
{
"name": "Source",
"value": "sil-xms-load-max-data//sharing.json"
},
{
"name": "Destination",
"value": "AggregatedSharingTest_v1"
}
],
"typeProperties": {
"source": {
"type": "JsonSource",
"storeSettings": {
"type": "AzureBlobStorageReadSettings",
"recursive": true,
"enablePartitionDiscovery": false
},
"formatSettings": {
"type": "JsonReadSettings"
}
},
"sink": {
"type": "AzureDataExplorerSink",
"ingestionMappingName": "",
"additionalProperties": {
"tags": "drop-by:loadtest",
"format": "multijson"
}
},
"enableStaging": false,
"validateDataConsistency": false,
"logSettings": {
"enableCopyActivityLog": true,
"copyActivityLogSettings": {
"logLevel": "Info",
"enableReliableLogging": true
},
"logLocationSettings": {
"linkedServiceName": {
"referenceName": "LoadTestBlob",
"type": "LinkedServiceReference"
},
"path": "debug-logs"
}
},
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"path": "$['deviceId']"
},
"sink": {
"name": "deviceId",
"type": "String"
}
},
{
"source": {
"path": "$['tenant']"
},
"sink": {
"name": "tenant",
"type": "String"
}
},
{
"source": {
"path": "$['tagsSerialNo']"
},
"sink": {
"name": "tagsSerialNo",
"type": "String"
}
},
{
"source": {
"path": "$['metricSum']"
},
"sink": {
"name": "metricSum",
"type": "Int64"
}
},
{
"source": {
"path": "$['metricCount']"
},
"sink": {
"name": "metricCount",
"type": "Int64"
}
},
{
"source": {
"path": "$['notMetricCount']"
},
"sink": {
"name": "notMetricCount",
"type": "Int64"
}
},
{
"source": {
"path": "$['timestamp']"
},
"sink": {
"name": "timestamp",
"type": "DateTime"
}
}
],
"collectionReference": ""
}
},
"inputs": [
{
"referenceName": "SourceDataset_k0h",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "DestinationDataset_k0h",
"type": "DatasetReference"
}
]
}
],
"annotations": [],
"lastPublishTime": "2023-04-18T11:30:35Z"
},
"type": "Microsoft.DataFactory/factories/pipelines"
}

Here is the destination dataset configuration in ADF:

{
"name": "DestinationDataset_k0h",
"properties": {
"linkedServiceName": {
"referenceName": "LoadTestDump",
"type": "LinkedServiceReference"
},
"annotations": [],
"type": "AzureDataExplorerTable",
"schema": [
{
"name": "deviceId",
"type": "string"
},
{
"name": "tenant",
"type": "string"
},
{
"name": "tagsSerialNo",
"type": "string"
},
{
"name": "metricSum",
"type": "long"
},
{
"name": "metricCount",
"type": "long"
},
{
"name": "notMetricCount",
"type": "long"
},
{
"name": "timestamp",
"type": "datetime"
}
],
"typeProperties": {
"table": "AggregatedSharingTest_v1"
}
},
"type": "Microsoft.DataFactory/factories/datasets"
}

Here is the Azure Blob Storage source dataset configuration in ADF:

{
"name": "SourceDataset_k0h",
"properties": {
"linkedServiceName": {
"referenceName": "LoadTestBlob",
"type": "LinkedServiceReference"
},
"annotations": [],
"type": "Json",
"typeProperties": {
"location": {
"type": "AzureBlobStorageLocation",
"fileName": "sharing.json",
"container": "sil-xms-load-max-data"
}
},
"schema": {
"type": "object",
"properties": {
"deviceId": {
"type": "string"
},
"tenant": {
"type": "string"
},
"tagsSerialNo": {
"type": "string"
},
"metricSum": {
"type": "integer"
},
"metricCount": {
"type": "integer"
},
"notMetricCount": {
"type": "integer"
},
"timestamp": {
"type": "string"
}
}
}
},
"type": "Microsoft.DataFactory/factories/datasets"
}

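I have tested both the source and destination connections in the Azure portal, and they look fine. I am not sure what is going wrong: the pipeline runs, and the run details show data read and data written, but the data never becomes available for query in the Kusto table, and the run eventually fails with the error above.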
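
One thing that can be ruled out offline is a mismatch between the translator mappings and the destination dataset schema. The sketch below is a hypothetical check, not an ADF feature: `check_mappings` and `TYPE_MAP` are names invented here, and the type table only covers the types that appear in this post. The sample data is a trimmed copy of the configuration above.

```python
# Assumed correspondence between ADF interim types and Kusto column types.
# Only the types used in this post are listed; extend as needed.
TYPE_MAP = {
    "String": "string",
    "Int64": "long",
    "DateTime": "datetime",
    "Guid": "guid",
}


def check_mappings(translator: dict, sink_schema: list) -> list:
    """Return a list of human-readable problems; an empty list means no mismatch."""
    columns = {col["name"]: col["type"] for col in sink_schema}
    problems = []
    for mapping in translator.get("mappings", []):
        name = mapping["sink"]["name"]
        adf_type = mapping["sink"].get("type")
        if name not in columns:
            problems.append(f"{name}: no such column in the sink dataset")
        elif TYPE_MAP.get(adf_type) != columns[name]:
            problems.append(
                f"{name}: mapping type {adf_type} vs column type {columns[name]}"
            )
    return problems


# Trimmed copies of the translator and destination schema from the question.
translator = {
    "type": "TabularTranslator",
    "mappings": [
        {"source": {"path": "$['deviceId']"}, "sink": {"name": "deviceId", "type": "String"}},
        {"source": {"path": "$['metricSum']"}, "sink": {"name": "metricSum", "type": "Int64"}},
        {"source": {"path": "$['timestamp']"}, "sink": {"name": "timestamp", "type": "DateTime"}},
    ],
}
sink_schema = [
    {"name": "deviceId", "type": "string"},
    {"name": "metricSum", "type": "long"},
    {"name": "timestamp", "type": "datetime"},
]

problems = check_mappings(translator, sink_schema)
print(problems or "mappings and sink schema agree")
```

For the configuration shown here, every mapped column exists in the destination schema with a matching type, so the mappings are unlikely to be the source of the failure.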

Best answer

I tried with your input JSON in a storage account and with your pipeline JSON, and ended up with the same error.

In your case, the cause of this error is the additionalProperties in the sink of the copy activity.

After I removed additionalProperties, I was able to copy the data successfully.
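
If the pipeline JSON is kept in source control, the same change can be applied with a script. The following is a minimal Python sketch, assuming the pipeline definition has been loaded as a dictionary; `strip_sink_additional_properties` is a hypothetical helper name, and the sample pipeline is a trimmed copy of the one in the question. The answer does not say which of the two properties (tags or format) triggers the failure, so the sketch removes the whole additionalProperties object, as the answer did.

```python
import copy
import json


def strip_sink_additional_properties(pipeline: dict) -> dict:
    """Return a copy of the pipeline with additionalProperties removed
    from every Copy activity whose sink is an AzureDataExplorerSink."""
    cleaned = copy.deepcopy(pipeline)  # leave the caller's dictionary untouched
    for activity in cleaned.get("properties", {}).get("activities", []):
        if activity.get("type") != "Copy":
            continue
        sink = activity.get("typeProperties", {}).get("sink", {})
        if sink.get("type") == "AzureDataExplorerSink":
            sink.pop("additionalProperties", None)
    return cleaned


# Trimmed copy of the pipeline from the question.
pipeline = {
    "name": "CopyPipeline_k0h",
    "properties": {
        "activities": [
            {
                "name": "Copy_k0h",
                "type": "Copy",
                "typeProperties": {
                    "source": {"type": "JsonSource"},
                    "sink": {
                        "type": "AzureDataExplorerSink",
                        "ingestionMappingName": "",
                        "additionalProperties": {
                            "tags": "drop-by:loadtest",
                            "format": "multijson",
                        },
                    },
                },
            }
        ]
    },
}

cleaned = strip_sink_additional_properties(pipeline)
cleaned_sink = cleaned["properties"]["activities"][0]["typeProperties"]["sink"]
print(json.dumps(cleaned_sink, indent=2))
```

The resulting sink keeps only type and ingestionMappingName, which matches the sink in the working pipeline posted in this answer.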

My Kusto table has 4 rows of data; after I removed additionalProperties, the copy activity inserted two rows from the source.

[Screenshot: data in the destination table]

Here is my pipeline JSON for your reference:

{
"name": "pipeline2",
"properties": {
"activities": [
{
"name": "Copy data1",
"type": "Copy",
"dependsOn": [],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [
{
"name": "Source",
"value": "data//myjson.json"
},
{
"name": "Destination",
"value": "table1"
}
],
"typeProperties": {
"source": {
"type": "JsonSource",
"storeSettings": {
"type": "AzureBlobFSReadSettings",
"recursive": true,
"enablePartitionDiscovery": false
},
"formatSettings": {
"type": "JsonReadSettings"
}
},
"sink": {
"type": "AzureDataExplorerSink",
"ingestionMappingName": ""
},
"enableStaging": false,
"logSettings": {
"enableCopyActivityLog": true,
"copyActivityLogSettings": {
"logLevel": "Info",
"enableReliableLogging": true
},
"logLocationSettings": {
"linkedServiceName": {
"referenceName": "AzureDataLakeStorage2",
"type": "LinkedServiceReference"
},
"path": "data/debug-logs"
}
},
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"path": "$['deviceId']"
},
"sink": {
"name": "deviceId",
"type": "String"
}
},
{
"source": {
"path": "$['tenant']"
},
"sink": {
"name": "tenant",
"type": "Guid"
}
},
{
"source": {
"path": "$['tagsSerialNo']"
},
"sink": {
"name": "tagsSerialNo",
"type": "String"
}
},
{
"source": {
"path": "$['metricSum']"
},
"sink": {
"name": "metricSum",
"type": "Int64"
}
},
{
"source": {
"path": "$['metricCount']"
},
"sink": {
"name": "metricCount",
"type": "Int64"
}
},
{
"source": {
"path": "$['notMetricCount']"
},
"sink": {
"name": "notMetricCount",
"type": "Int64"
}
},
{
"source": {
"path": "$['timestamp']"
},
"sink": {
"name": "timestamp",
"type": "DateTime"
}
}
],
"collectionReference": ""
}
},
"inputs": [
{
"referenceName": "Json1",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "AzureDataExplorerTable1",
"type": "DatasetReference"
}
]
}
],
"annotations": []
}
}

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/76063959/
