
druid - Unable to create a Druid ingestion task via the API

Reposted. Author: 行者123. Updated: 2023-12-01 23:27:05

When I send a JSON ingestion spec to the Druid Overlord API, I get the following response:

HTTP/1.1 400 Bad Request
Content-Type: application/json
Date: Wed, 25 Sep 2019 11:44:18 GMT
Server: Jetty(9.4.10.v20180503)
Transfer-Encoding: chunked

{
"error": "Instantiation of [simple type, class org.apache.druid.indexing.common.task.IndexTask] value failed: null"
}

If I change the task type from index to index_parallel, I get this instead:
{
"error": "Instantiation of [simple type, class org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask] value failed: null"
}

Submitting the same ingestion spec through Druid's web UI works fine.

Here is the ingestion spec I am using (slightly modified to hide sensitive data):
{
  "type": "index_parallel",
  "dataSchema": {
    "dataSource": "daily_xport_test",
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "MONTH",
      "queryGranularity": "NONE",
      "rollup": false
    },
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "dateday",
          "format": "auto"
        },
        "dimensionsSpec": {
          "dimensions": [
            {
              "type": "string",
              "name": "id",
              "createBitmapIndex": true
            },
            {
              "type": "long",
              "name": "clicks_count_total"
            },
            {
              "type": "long",
              "name": "ctr"
            },
            "deleted",
            "device_type",
            "target_url"
          ]
        }
      }
    }
  },
  "ioConfig": {
    "type": "index_parallel",
    "firehose": {
      "type": "static-google-blobstore",
      "blobs": [
        {
          "bucket": "data-test",
          "path": "/sample_data/daily_export_18092019/000000000000.json.gz"
        }
      ],
      "filter": "*.json.gz$"
    },
    "appendToExisting": false
  },
  "tuningConfig": {
    "type": "index_parallel",
    "maxNumSubTasks": 1,
    "maxRowsInMemory": 1000000,
    "pushTimeout": 0,
    "maxRetry": 3,
    "taskStatusCheckPeriodMs": 1000,
    "chatHandlerTimeout": "PT10S",
    "chatHandlerNumRetries": 5
  }
}


The Overlord API URI looks like this:
http://host:8081/druid/indexer/v1/task

The HTTPie command I use to send the API request:

http --print=Hhb  POST http://host:8081/druid/indexer/v1/task < test_spec.json
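
A rough Python equivalent of the HTTPie call, using only the standard library, might look like the sketch below. The function names build_task_request and submit_task are hypothetical, and the URL is the one from the question with its placeholder host. The sketch only changes how the request is sent, so an invalid spec would presumably still be rejected with the same 400 response.

```python
import json
import urllib.request

# URL from the question; "host" is a placeholder for the Overlord host.
OVERLORD_TASK_URL = "http://host:8081/druid/indexer/v1/task"


def build_task_request(spec, url=OVERLORD_TASK_URL):
    """Build (but do not send) a POST request carrying the spec as JSON."""
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(spec, url=OVERLORD_TASK_URL):
    """Send the request and return the response body as text.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a
    rejected spec surfaces as an exception rather than a return value.
    """
    with urllib.request.urlopen(build_task_request(spec, url)) as resp:
        return resp.read().decode("utf-8")
```

To mirror the command above, the spec would be loaded first with json.load(open("test_spec.json")) and then passed to submit_task.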

Also, if I try to submit the ingestion task with the DruidHook class in Airflow, I run into the same problem.

Best answer

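I found the solution. Apparently, the spec generated by the Druid UI differs slightly from the JSON format the API expects. The top-level objects in the spec ("ioConfig", "dataSchema", and "tuningConfig") should be wrapped in a "spec" object, like this: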

{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "daily_xport_test",
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "MONTH",
        "queryGranularity": "NONE",
        "rollup": false
      },
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "column": "dateday",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": [
              {
                "type": "string",
                "name": "id",
                "createBitmapIndex": true
              },
              {
                "type": "long",
                "name": "clicks_count_total"
              },
              {
                "type": "long",
                "name": "ctr"
              },
              "deleted",
              "device_type",
              "target_url"
            ]
          }
        }
      }
    },
    "ioConfig": {
      "type": "index_parallel",
      "firehose": {
        "type": "static-google-blobstore",
        "blobs": [
          {
            "bucket": "data-test",
            "path": "/sample_data/daily_export_18092019/000000000000.json.gz"
          }
        ],
        "filter": "*.json.gz$"
      },
      "appendToExisting": false
    },
    "tuningConfig": {
      "type": "index_parallel",
      "maxNumSubTasks": 1,
      "maxRowsInMemory": 1000000,
      "pushTimeout": 0,
      "maxRetry": 3,
      "taskStatusCheckPeriodMs": 1000,
      "chatHandlerTimeout": "PT10S",
      "chatHandlerNumRetries": 5
    }
  }
}
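
If the flat spec is already on disk, the wrapping described in this answer can be done programmatically instead of by hand. The following is a minimal sketch, assuming the only difference between the two formats is the "spec" wrapper around the three top-level objects; the helper name wrap_for_overlord is hypothetical and not part of Druid or Airflow.

```python
import json

# Top-level sections that, per the answer, belong under "spec".
SPEC_KEYS = ("dataSchema", "ioConfig", "tuningConfig")


def wrap_for_overlord(flat_spec):
    """Return a copy of flat_spec with the spec sections nested under "spec".

    A payload that already contains a "spec" object is returned unchanged
    (as a shallow copy), so the function is safe to apply twice.
    """
    if "spec" in flat_spec:
        return dict(flat_spec)
    # Keep everything that is not a spec section (e.g. "type") at the top level.
    wrapped = {k: v for k, v in flat_spec.items() if k not in SPEC_KEYS}
    wrapped["spec"] = {k: flat_spec[k] for k in SPEC_KEYS if k in flat_spec}
    return wrapped


if __name__ == "__main__":
    # Abbreviated stand-in for the full spec from the question.
    flat = {
        "type": "index_parallel",
        "dataSchema": {"dataSource": "daily_xport_test"},
        "ioConfig": {"type": "index_parallel"},
        "tuningConfig": {"type": "index_parallel"},
    }
    print(json.dumps(wrap_for_overlord(flat), indent=2))
```

The result can then be written back to a file and posted to the Overlord endpoint in the same way as before.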

Regarding "druid - Unable to create a Druid ingestion task via the API", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58097806/
