
azure - Where does contentOffset come from?

Reposted. Author: 行者123. Updated: 2023-12-03 07:00:07

I am trying to understand skillsets in Azure Cognitive Search. I want to build an OCR-backed search and am trying to understand how it works.

For example, according to the documentation, the OCR skill produces this response:

{
  "text": "Hello World. -John",
  "layoutText":
  {
    "language" : "en",
    "text" : "Hello World. -John",
    "lines" : [
      {
        "boundingBox":
          [ {"x":10, "y":10}, {"x":50, "y":10}, {"x":50, "y":30},{"x":10, "y":30}],
        "text":"Hello World."
      },
      {
        "boundingBox": [ {"x":110, "y":10}, {"x":150, "y":10}, {"x":150, "y":30},{"x":110, "y":30}],
        "text":"-John"
      }
    ],
    "words": [
      {
        "boundingBox": [ {"x":110, "y":10}, {"x":150, "y":10}, {"x":150, "y":30},{"x":110, "y":30}],
        "text":"Hello"
      },
      {
        "boundingBox": [ {"x":110, "y":10}, {"x":150, "y":10}, {"x":150, "y":30},{"x":110, "y":30}],
        "text":"World."
      },
      {
        "boundingBox": [ {"x":110, "y":10}, {"x":150, "y":10}, {"x":150, "y":30},{"x":110, "y":30}],
        "text":"-John"
      }
    ]
  }
}

But then, in this paragraph, we see that only the text field from the OCR skill is used, and a new contentOffset field appears.

Custom skillset definition:

{
  "description": "Extract text from images and merge with content text to produce merged_text",
  "skills":
  [
    {
      "description": "Extract text (plain and structured) from image.",
      "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
      "context": "/document/normalized_images/*",
      "defaultLanguageCode": "en",
      "detectOrientation": true,
      "inputs": [
        {
          "name": "image",
          "source": "/document/normalized_images/*"
        }
      ],
      "outputs": [
        {
          "name": "text"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.MergeSkill",
      "description": "Create merged_text, which includes all the textual representation of each image inserted at the right location in the content field.",
      "context": "/document",
      "insertPreTag": " ",
      "insertPostTag": " ",
      "inputs": [
        {
          "name":"text",
          "source": "/document/content"
        },
        {
          "name": "itemsToInsert",
          "source": "/document/normalized_images/*/text"
        },
        {
          "name":"offsets",
          "source": "/document/normalized_images/*/contentOffset"
        }
      ],
      "outputs": [
        {
          "name": "mergedText",
          "targetName" : "merged_text"
        }
      ]
    }
  ]
}

The input should look like this:

{
  "values": [
    {
      "recordId": "1",
      "data":
      {
        "text": "The brown fox jumps over the dog",
        "itemsToInsert": ["quick", "lazy"],
        "offsets": [3, 28]
      }
    }
  ]
}
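To make the role of the offsets concrete, here is a minimal Python sketch of inserting items into a text at given character offsets. It is only an illustration of the idea, not the MergeSkill implementation: `merge_text` and its `pre_tag` / `post_tag` parameters are hypothetical names that loosely mirror `insertPreTag` / `insertPostTag`, and the way the real skill handles whitespace is not claimed here.

```python
from typing import List


def merge_text(text: str, items: List[str], offsets: List[int],
               pre_tag: str = " ", post_tag: str = " ") -> str:
    """Insert each item into text at its offset (hypothetical helper)."""
    if len(items) != len(offsets):
        raise ValueError("items and offsets must have the same length")
    merged = text
    # Work from the largest offset down so that an insertion never
    # shifts the position of an insertion that is still pending.
    for offset, item in sorted(zip(offsets, items), reverse=True):
        merged = merged[:offset] + pre_tag + item + post_tag + merged[offset:]
    return merged


# The sample record from the question.
record = {
    "text": "The brown fox jumps over the dog",
    "itemsToInsert": ["quick", "lazy"],
    "offsets": [3, 28],
}

# An empty post_tag is an assumption made here to avoid doubled spaces.
result = merge_text(record["text"], record["itemsToInsert"],
                    record["offsets"], pre_tag=" ", post_tag="")
print(result)
```

In this sketch each offset is a character position in the original text, so offset 3 lands right after "The" and offset 28 right after the second "the". With both tags set to a single space, as in the skillset above, this naive version would leave doubled spaces around the inserted words.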

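So where does the offsets array (contentOffset in the skill definition) come from, given that the OcrSkill response does not return that value and the Computer Vision Read method does not return it from the API either?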

Best answer

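contentOffset is a default part of extracting content from files that contain embedded images. Whenever the OCR skillset identifies an image contained in the input document, a contentOffset is produced for that image.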

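The reason contentOffset shows up as an array is that each input we upload for analysis can contain multiple images. For the JSON operations involved, see the documentation on calling the Read API through REST.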
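The array shape can be illustrated with a small Python sketch that resolves the wildcard source paths from the skillset against an enriched document. Everything here is an assumption for illustration: the `enriched_document` layout is a hypothetical stand-in in which each entry under `normalized_images` already carries a `contentOffset` next to the `text` added by the OCR skill, and `resolve` is a made-up helper, not part of any Azure SDK.

```python
from typing import Any, Dict, List

# Hypothetical enriched document: one entry per embedded image, each
# with its own contentOffset and the text the OCR skill extracted.
enriched_document: Dict[str, Any] = {
    "content": "The brown fox jumps over the dog",
    "normalized_images": [
        {"contentOffset": 3, "text": "quick"},
        {"contentOffset": 28, "text": "lazy"},
    ],
}


def resolve(document: Dict[str, Any], path: str) -> List[Any]:
    """Resolve a source path such as /document/a/*/b into a list of values."""
    parts = path.strip("/").split("/")
    if parts[0] != "document":
        raise ValueError("path must start with /document")
    nodes: List[Any] = [document]
    for part in parts[1:]:
        next_nodes: List[Any] = []
        for node in nodes:
            if part == "*":
                # The wildcard fans out over every element of a list.
                next_nodes.extend(node)
            else:
                next_nodes.append(node[part])
        nodes = next_nodes
    return nodes


offsets = resolve(enriched_document, "/document/normalized_images/*/contentOffset")
items = resolve(enriched_document, "/document/normalized_images/*/text")
print(offsets, items)
```

Because the `*` fans out over every image, the single per-image value becomes one array entry per image, which lines up position by position with the `itemsToInsert` array built from the same wildcard.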

Regarding "azure - Where does contentOffset come from?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/72532077/
