gpt4 book ai didi

azure - 如何在 Azure 数据工厂中使用此 Rest API

转载 作者:行者123 更新时间:2023-12-03 02:49:20 25 4
gpt4 key购买 nike

我有一个 REST API,需要从 Azure 数据工厂调用并将数据插入 SQL 表中。

API返回的JSON格式如下:

{
"serviceResponse": {
"supportOffice": "EUKO",
"totalPages": 5,
"pageNo": 1,
"recordsPerPage": 1000,
"projects": [
{ "projectID":1 ...} , { "projectID":2 ...} ,...
]
}
}

URL 的格式为 http://server.com/api/Projects?pageNo=1

我已经成功设置了一个 RestService 来调用 API 并返回 JSON 和一个 SQL 接收器,该接收器将获取 JSON 并将其传递给存储过程,然后存储数据。

但是,我正在努力解决的是如何处理分页。

我已经尝试过:

  1. RestService 上的分页选项:我认为这不起作用,因为它只允许 XPATH 返回完整的下一个 URL。我看不出它允许根据totalPages 和pageNo 计算URL。 (或者我无法让它工作)

  2. 我尝试在处理之前添加对 API 的 Web 调用,然后计算页数。虽然不理想,但它确实有效,直到我达到 1mb/1 分钟的限制,因为一些响应相当大。这是行不通的。

  3. 我尝试查看 API 是否可以更改,但这是不可能的。

我想知道是否有人对如何实现此功能有任何想法,或者是否已成功使用类似的 API?

最佳答案

以下说明将逐步创建如下所示的管道。请注意,它使用存储过程事件、Web 事件和 For Each 事件。 pipeline screenshot

for each activities screenshot

首先配置 Azure SQL 数据库,设置 AAD 管理员,然后按照描述在数据库中授予 ADF MSI 权限 here 。然后创建下表和两个存储过程:

CREATE TABLE [dbo].[People](
[id] [int] NULL,
[email] [varchar](255) NULL,
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL,
[avatar] [nvarchar](1000) NULL
)

GO
/*
sample call:
exec uspInsertPeople @json = '{"page":1,"per_page":3,"total":12,"total_pages":4,"data":[{"id":1,"email":"<a href="https://stackoverflow.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="6d0a08021f0a08430f011819052d1f081c1f081e430403" rel="noreferrer noopener nofollow">[email protected]</a>","first_name":"George","last_name":"Bluth","avatar":"https://s3.amazonaws.com/uifaces/faces/twitter/calebogden/128.jpg"},{"id":2,"email":"<a href="https://stackoverflow.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="0369626d66772d746662756671437166727166702d6a6d" rel="noreferrer noopener nofollow">[email protected]</a>","first_name":"Janet","last_name":"Weaver","avatar":"https://s3.amazonaws.com/uifaces/faces/twitter/josephstein/128.jpg"},{"id":3,"email":"<a href="https://stackoverflow.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="05606868642b726a6b62457760747760762b6c6b" rel="noreferrer noopener nofollow">[email protected]</a>","first_name":"Emma","last_name":"Wong","avatar":"https://s3.amazonaws.com/uifaces/faces/twitter/olegpogodaev/128.jpg"}]}'
*/
create proc uspInsertPeople @json nvarchar(max)
as
begin
insert into People (id, email, first_name, last_name, avatar)
select d.*
from OPENJSON(@json)
WITH (
[data] nvarchar(max) '$.data' as JSON
)
CROSS APPLY OPENJSON([data], '$')
WITH (
id int '$.id',
email varchar(255) '$.email',
first_name varchar(100) '$.first_name',
last_name varchar(100) '$.last_name',
avatar nvarchar(1000) '$.avatar'
) d;
end

GO

create proc uspTruncatePeople
as
truncate table People


接下来,在 Azure 数据工厂 v2 中创建一个新管道,将其重命名为 ForEachPage,然后转到代码 View 并粘贴以下 JSON:

{
"name": "ForEachPage",
"properties": {
"activities": [
{
"name": "GetTotalPages",
"type": "WebActivity",
"dependsOn": [
{
"activity": "Truncate SQL Table",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"url": {
"value": "https://reqres.in/api/users?page=1",
"type": "Expression"
},
"method": "GET"
}
},
{
"name": "ForEachPage",
"type": "ForEach",
"dependsOn": [
{
"activity": "GetTotalPages",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"items": {
"value": "@range(1,activity('GetTotalPages').output.total_pages)",
"type": "Expression"
},
"activities": [
{
"name": "GetPage",
"type": "WebActivity",
"dependsOn": [],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"url": {
"value": "@concat('https://reqres.in/api/users?page=',item())",
"type": "Expression"
},
"method": "GET"
}
},
{
"name": "uspInsertPeople stored procedure",
"type": "SqlServerStoredProcedure",
"dependsOn": [
{
"activity": "GetPage",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"storedProcedureName": "[dbo].[uspInsertPeople]",
"storedProcedureParameters": {
"json": {
"value": {
"value": "@string(activity('GetPage').output)",
"type": "Expression"
},
"type": "String"
}
}
},
"linkedServiceName": {
"referenceName": "lsAzureDB",
"type": "LinkedServiceReference"
}
}
]
}
},
{
"name": "Truncate SQL Table",
"type": "SqlServerStoredProcedure",
"dependsOn": [],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"storedProcedureName": "[dbo].[uspTruncatePeople]"
},
"linkedServiceName": {
"referenceName": "lsAzureDB",
"type": "LinkedServiceReference"
}
}
],
"annotations": []
}
}

创建到 Azure SQL DB 的 lsAzureDB 链接服务,将其设置为使用 MSI 进行身份验证。

该管道调用 sample paged API (目前有效,但它不是我管理的 API,因此可能会在某个时候停止工作)演示如何循环以及如何获取 Web 事件的结果并通过存储过程调用和 JSON 将它们插入到 SQL 表中在存储过程中进行解析。该循环将以并行方式运行,但您当然可以更改 ForEachPage 事件的设置以使其串行运行。

关于azure - 如何在 Azure 数据工厂中使用此 Rest API,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/56949173/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com