
json - Extract nested information from the runOutput parameter of a Databricks activity

Reposted. Author: 行者123. Updated: 2023-12-03 02:00:55

I have an ADF pipeline that contains a Databricks activity. The run output of that activity looks like this:

"runOutput": {
"index": {
"0": 0,
"1": 0,
"2": 1
},
"t0": {
"0": "2030-04-26 11:50:30.594",
"1": "2030-04-26 11:50:30.594",
"2": "2030-04-26 11:50:30.594"
},
"t1": {
"0": "2030-04-26 11:55:30.594",
"1": "2030-04-26 11:55:30.594",
"2": "2030-04-26 11:55:30.594"
},
"name": {
"0": "all",
"1": "Alt",
"2": "Ass"
},
"CA": {
"0": 6710065.65,
"1": 257580.69,
"2": 171109.65
},
"nb_ligne": {
"0": 170506,
"1": 6500403,
"2": 4142539
},
"nb_tickets": {
"0": 444766,
"1": 164764,
"2": 111471
},
"CAparPanier": {
"0": 145.04,
"1": 1745.48,
"2": 145.41
},
"CAparLigne": {
"0": 23.94,
"1": 35.96,
"2": 74.14
},
"ligneParPanier": {
"0": 23.82,
"1": 34.91,
"2": 32.73
}
}

My goal is to extract all the values of each key (index, t0, t1, name, CA, etc.) and append each set of values to an array (for example: CA=[6710065.65,257580.69,171109.65]).
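To make the target shape concrete, here is a minimal Python sketch of the transformation (an illustration only, not ADF code). The function name `columns_to_arrays` is hypothetical, and the sample data is a two-key subset of the run output above.

```python
# Illustrative subset of the runOutput shown above (two of the ten keys).
run_output = {
    "name": {"0": "all", "1": "Alt", "2": "Ass"},
    "CA": {"0": 6710065.65, "1": 257580.69, "2": 171109.65},
}


def columns_to_arrays(output):
    """Turn {"key": {"0": v0, "1": v1, ...}} into {"key": [v0, v1, ...]}."""
    result = {}
    for key, indexed_values in output.items():
        # Sort the row labels numerically so that "10" would come after "2".
        ordered_labels = sorted(indexed_values, key=int)
        result[key] = [indexed_values[label] for label in ordered_labels]
    return result


arrays = columns_to_arrays(run_output)
print(arrays["CA"])
```

Sorting the labels with `key=int` is a design choice for the sketch: it keeps the row order stable even if the labels are not stored in order.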

More details: I can access each value of this output statically. For example, this expression returns the first value of CA:

@activity('landing to raw mailing').output.runOutput.CA['0']

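But I need a dynamic solution.

To achieve this, I am considering two scenarios:

  1. The first scenario uses a ForEach activity to loop over each key in the runOutput object. Inside it, I would call another pipeline whose ForEach activity extracts all the values of that key and appends each value to an array with an Append Variable activity.
  2. The second scenario stores the runOutput of the Databricks activity in a JSON file. I would then parse the JSON file to extract the required data.

However, I am not sure how to implement either scenario. For the first one, my ForEach expression returns 1 item (I should get 10). So how can I loop over each key? This is the ForEach expression: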

@array(activity('landing to raw mailing').output['runOutput'])
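For intuition about why this yields a single item, here is a minimal Python analogy (not ADF code). ADF's `array()` wraps its single input in an array, so the ForEach receives one item, the whole object. Looping over keys needs a list of key names instead. The sample data is a three-key subset of the run output.

```python
# Illustrative subset of the runOutput object.
run_output = {
    "index": {"0": 0, "1": 0, "2": 1},
    "name": {"0": "all", "1": "Alt", "2": "Ass"},
    "CA": {"0": 6710065.65, "1": 257580.69, "2": 171109.65},
}

# Analogous to array(runOutput): one element, which is the whole object.
wrapped = [run_output]

# What the loop actually needs: one element per top-level key.
key_names = list(run_output)

print(len(wrapped), len(key_names))
```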

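Best answer

I was able to achieve your requirement with your first approach, as shown below. Note that this method only works for this particular case.

I took the JSON above from a Lookup activity. In your case, it would be the Databricks notebook runOutput JSON.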

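  • First, I converted the JSON to a string and took a substring that skips the leading {.

  • Then I split that string on "},".

  • Pass this array to a ForEach, and inside the ForEach extract the key and append it to an array.

  • After that, call another pipeline with an Execute Pipeline activity, using the key to pick its value out of the JSON and passing that value to the child pipeline as a parameter.

  • In the child pipeline, create an array variable with the value ["0","1","2"] and pass it to a ForEach.

  • Inside that ForEach, use each of these labels to look up a value in the parameter and append it to an array variable.

  • After the ForEach, return the array from the child pipeline to the parent pipeline with a Set Variable activity (pipeline return value).

  • Back in the parent pipeline's ForEach, concatenate the key with the returned array to build a key-value pair, where the key is our key and the value is its array.

The concatenation produces a string. Outside the ForEach, drop its trailing comma and wrap it in {}.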
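The steps above can be sketched in Python to show the string handling end to end. This is a rough simulation under stated assumptions, not ADF code: the variable names mirror the pipeline variables, the data is a three-key subset of the run output, and the JSON is serialized without spaces.

```python
import json

# Illustrative subset of the runOutput from the question.
lookup_row = {
    "index": {"0": 0, "1": 0, "2": 1},
    "name": {"0": "all", "1": "Alt", "2": "Ass"},
    "CA": {"0": 6710065.65, "1": 257580.69, "2": 171109.65},
}

# Serialize without spaces and drop the leading "{".
json_as_string = json.dumps(lookup_row, separators=(",", ":"))[1:]

# Split on "}," so that each piece starts with one top-level key.
split_array = json_as_string.split("},")

keys = []
final_string = ""
for item in split_array:
    # The key is the text before the first ":", with its quotes removed.
    key = item.split(":")[0].replace('"', "")
    keys.append(key)

    # What the child pipeline does, using the hard-coded row labels.
    values = [lookup_row[key][number] for number in ["0", "1", "2"]]

    # Append '"key":[...],' to the running string.
    final_string += '"' + key + '":' + json.dumps(values) + ","

# Drop the trailing comma and wrap the string in braces.
final_json = "{" + final_string[:-1] + "}"
result = json.loads(final_json)
print(result["CA"])
```

The sketch also shows why the approach is tied to this particular shape: a string value that itself contained "}," would be split in the wrong place, and the row labels are fixed to "0", "1" and "2".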

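Result: a string of key-value pairs in which every key maps to its array of values.

Variables in ADF do not currently support an object type, so I show the output here as a string. When you use it, you can convert it to an object with the json() function.

You can access each array through the list of keys collected by the Append Variable activity inside the first ForEach (give that list to a ForEach).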

My parent pipeline JSON:

{
"name": "pipeline1",
"properties": {
"activities": [
{
"name": "Lookup1",
"type": "Lookup",
"dependsOn": [],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"source": {
"type": "JsonSource",
"storeSettings": {
"type": "AzureBlobFSReadSettings",
"recursive": true,
"enablePartitionDiscovery": false
},
"formatSettings": {
"type": "JsonReadSettings"
}
},
"dataset": {
"referenceName": "Jsongen2",
"type": "DatasetReference"
},
"firstRowOnly": false
}
},
{
"name": "convert json to string",
"type": "SetVariable",
"dependsOn": [
{
"activity": "Lookup1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "jsonasstring",
"value": {
"value": "@substring(string(activity('Lookup1').output.value[0]),1,sub(length(string(activity('Lookup1').output.value[0])),1))",
"type": "Expression"
}
}
},
{
"name": "split string to array",
"type": "SetVariable",
"dependsOn": [
{
"activity": "convert json to string",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "splitarray",
"value": {
"value": "@split(variables('jsonasstring'),'},')",
"type": "Expression"
}
}
},
{
"name": "ForEach1",
"type": "ForEach",
"dependsOn": [
{
"activity": "split string to array",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"items": {
"value": "@variables('splitarray')",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "Append variable1",
"type": "AppendVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "keys",
"value": {
"value": "@replace(split(item(),':')[0],'\"','')",
"type": "Expression"
}
}
},
{
"name": "Execute Pipeline1",
"type": "ExecutePipeline",
"dependsOn": [
{
"activity": "Append variable1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"pipeline": {
"referenceName": "child",
"type": "PipelineReference"
},
"waitOnCompletion": true,
"parameters": {
"json": {
"value": "@activity('Lookup1').output.value[0][replace(split(item(),':')[0],'\"','')]",
"type": "Expression"
}
}
}
},
{
"name": "concat to temp",
"type": "SetVariable",
"dependsOn": [
{
"activity": "Execute Pipeline1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "temp",
"value": {
"value": "@concat(variables('final_string'),'\"',replace(split(item(),':')[0],'\"',''),'\":',activity('Execute Pipeline1').output.pipelineReturnValue.myarr,',')",
"type": "Expression"
}
}
},
{
"name": "assign temp to final_str",
"type": "SetVariable",
"dependsOn": [
{
"activity": "concat to temp",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "final_string",
"value": {
"value": "@variables('temp')",
"type": "Expression"
}
}
}
]
}
},
{
"name": "Set variable1",
"type": "SetVariable",
"dependsOn": [
{
"activity": "ForEach1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "final_json",
"value": {
"value": "@concat('{',substring(variables('final_string'),0,sub(length(variables('final_string')),1)),'}')",
"type": "Expression"
}
}
}
],
"variables": {
"jsonasstring": {
"type": "String"
},
"splitarray": {
"type": "Array"
},
"keys": {
"type": "Array"
},
"final_string": {
"type": "String"
},
"temp": {
"type": "String"
},
"final_json": {
"type": "String"
}
},
"annotations": []
}
}

Child pipeline JSON:

{
"name": "child",
"properties": {
"activities": [
{
"name": "ForEach1",
"type": "ForEach",
"dependsOn": [
{
"activity": "Set variable1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"items": {
"value": "@variables('numbers')",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "Append variable1",
"type": "AppendVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "arr",
"value": {
"value": "@pipeline().parameters.json[item()]",
"type": "Expression"
}
}
}
]
}
},
{
"name": "Set variable1",
"type": "SetVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "numbers",
"value": [
"0",
"1",
"2"
]
}
},
{
"name": "return",
"type": "SetVariable",
"dependsOn": [
{
"activity": "ForEach1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "pipelineReturnValue",
"value": [
{
"key": "myarr",
"value": {
"type": "Expression",
"content": "@variables('arr')"
}
}
],
"setSystemVariable": true
}
}
],
"parameters": {
"json": {
"type": "object"
}
},
"variables": {
"numbers": {
"type": "Array"
},
"arr": {
"type": "Array"
},
"temp": {
"type": "String"
}
},
"annotations": []
}
}
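As a rough illustration of the child pipeline's logic and of its main limitation, here is a Python sketch (not ADF code). The helper `generated_numbers` and the fourth row labelled "3" are hypothetical. They only show what happens when the run output has more rows than the hard-coded ["0","1","2"].

```python
def child_pipeline(json_param, numbers):
    """Mimic the child pipeline: look up each row label and collect the values."""
    arr = []
    for item in numbers:
        arr.append(json_param[item])
    # Mirrors the pipeline return value with the key "myarr".
    return {"myarr": arr}


def generated_numbers(row_count):
    """Hypothetical helper: build ["0", "1", ...] instead of hard-coding it."""
    return [str(i) for i in range(row_count)]


# One column of the run output, plus a made-up fourth row labelled "3".
column = {"0": 23.82, "1": 34.91, "2": 32.73, "3": 30.0}

hard_coded = child_pipeline(column, ["0", "1", "2"])
generated = child_pipeline(column, generated_numbers(len(column)))

print(hard_coded["myarr"])
print(generated["myarr"])
```

With the hard-coded labels, the fourth value is silently skipped. Any real pipeline that expects a varying number of rows would need some way to supply the row count, which the answer above does not cover.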

Regarding "json - Extract nested information from the runOutput parameter of a Databricks activity", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/76109943/
