
azure - Converting columns into a JSON object in another column in Azure Data Flow

Reposted. Author: 行者123. Updated: 2023-12-03 06:16:17
I have data in the following format, and I am using a data flow to format each record as JSON and store it in another column of the data.

[Image: Input]

I want to use a data flow to convert it into the following format:

[Image: Output Format Required]

I have not found any way to do this conversion with a data flow.

Best answer

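  • You can do this with derived column transformations. My source has the columns TRANS_ID, CUST_ID_A, SCORE A, CUST_ID_B and SCORE B, all typed as string.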

  • Now, in a derived column transformation, create two new columns, each holding a key-value pair built with the associate function:
A: associate(CUST_ID_A,{SCORE A})
B: associate(CUST_ID_B,{SCORE B})

  • Next, in a second derived column transformation, build an array from the two newly created columns, e.g. cust_conf = array(A,B).

  • In the sink, I selected a JSON sink dataset and mapped only the required columns, TRANS_ID and cust_conf.

  • This yields the final data preview, which matches the required output.

  • The complete data flow JSON is given below:
{
    "name": "dataflow1",
    "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
            "sources": [
                {
                    "dataset": {
                        "referenceName": "DelimitedText1",
                        "type": "DatasetReference"
                    },
                    "name": "source1"
                }
            ],
            "sinks": [
                {
                    "dataset": {
                        "referenceName": "Json1",
                        "type": "DatasetReference"
                    },
                    "name": "sink1"
                }
            ],
            "transformations": [
                {
                    "name": "derivedColumn1"
                },
                {
                    "name": "derivedColumn2"
                }
            ],
            "scriptLines": [
                "source(output(",
                " TRANS_ID as string,",
                " CUST_ID_A as string,",
                " {SCORE A} as string,",
                " CUST_ID_B as string,",
                " {SCORE B} as string",
                " ),",
                " allowSchemaDrift: true,",
                " validateSchema: false,",
                " ignoreNoFilesFound: false) ~> source1",
                "source1 derive(A = associate(CUST_ID_A,{SCORE A}),",
                " B = associate(CUST_ID_B,{SCORE B})) ~> derivedColumn1",
                "derivedColumn1 derive(cust_conf = array(A,B)) ~> derivedColumn2",
                "derivedColumn2 sink(allowSchemaDrift: true,",
                " validateSchema: false,",
                " partitionFileNames:['op.json'],",
                " umask: 0022,",
                " preCommands: [],",
                " postCommands: [],",
                " skipDuplicateMapInputs: true,",
                " skipDuplicateMapOutputs: true,",
                " mapColumn(",
                " TRANS_ID,",
                " cust_conf",
                " ),",
                " partitionBy('hash', 1)) ~> sink1"
            ]
        }
    }
}
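
To make the shape of the result easier to picture outside the data flow designer, here is a rough Python sketch that mimics the three steps above: two associate calls, one array, and a sink mapping that keeps only TRANS_ID and cust_conf. It is not Azure code. The associate helper, the run function and the sample rows are all invented for illustration, and the sketch assumes the JSON sink writes a map column as a JSON object.

```python
import csv
import io
import json


def associate(key, value):
    """Hypothetical stand-in for the data flow associate() call: a one-entry map."""
    return {key: value}


def transform(row):
    # derivedColumn1: one key-value map per customer/score pair
    a = associate(row["CUST_ID_A"], row["SCORE A"])
    b = associate(row["CUST_ID_B"], row["SCORE B"])
    # derivedColumn2: collect both maps into a single array column
    cust_conf = [a, b]
    # sink mapping: keep only TRANS_ID and cust_conf
    return {"TRANS_ID": row["TRANS_ID"], "cust_conf": cust_conf}


# Invented sample input; the data from the original screenshots is not available.
SAMPLE_CSV = """TRANS_ID,CUST_ID_A,SCORE A,CUST_ID_B,SCORE B
T1,C100,0.9,C200,0.4
T2,C300,0.7,C400,0.2
"""


def run(text):
    # All columns stay strings, as in the source projection of the data flow.
    reader = csv.DictReader(io.StringIO(text))
    return [transform(row) for row in reader]


if __name__ == "__main__":
    for record in run(SAMPLE_CSV):
        print(json.dumps(record))
```

With the invented rows above, the first printed record is {"TRANS_ID": "T1", "cust_conf": [{"C100": "0.9"}, {"C200": "0.4"}]}. The scores remain strings because the source reads every column as string.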

Regarding "azure - Converting columns into a JSON object in another column in Azure Data Flow", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/76224456/
