
elasticsearch - Bulk indexing text fields that contain new lines with cURL

Reposted. Author: 行者123. Updated: 2023-12-03 01:25:53

I am trying to bulk index a file with the following format into my Elasticsearch index:

{"index":{"_index":"articles","_type":"_doc"}}
{"title":"My Article Title","text":"My article text. \nNext paragraph here."}

Using this command:
curl -s -XPOST -H 'Content-Type: application/x-ndjson'  http://localhost:9200/_bulk --data-binary @/data.json

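The problem is that the article text in my documents may contain the newline character \n, which breaks the format for cURL bulk indexing, so I get this error: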
{"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}

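I have been able to bulk index these documents with the JavaScript API, so I hope it is possible with cURL as well, because I want to index these documents into a Docker image as part of the build.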

Best answer

I managed to do this with Elasticsearch 7.3 on Red Hat Enterprise Linux 7 (7.7).

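1) Rename the .json file to .txt, press Enter after the last line so that the file ends with a newline, then save it and upload it to the server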
[root@host tmp]$ mv data.json data.txt
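Step 1 can also be done without opening an editor. The following is a minimal sketch, not part of the original answer: `ensure_trailing_newline` is a hypothetical helper name, and the demo works on a temporary file rather than the real data.txt. It relies on the documented requirement that the last line of a bulk request body must end with a newline character.

```shell
# Hypothetical helper: append a newline to a file only if its last byte is not one.
ensure_trailing_newline() {
  file="$1"
  # tail -c 1 prints the last byte; command substitution strips a trailing newline,
  # so a non-empty result means the file does not end with one.
  if [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
}

# Demo on a temporary file that deliberately lacks the final newline.
demo_file="$(mktemp)"
printf '%s\n%s' \
  '{"index":{"_index":"articles","_type":"_doc"}}' \
  '{"title":"My Article Title","text":"My article text. \nNext paragraph here."}' \
  > "$demo_file"

ensure_trailing_newline "$demo_file"
ensure_trailing_newline "$demo_file"   # the second call changes nothing

# The escaped \n stays inside the JSON string, so the file has exactly two lines.
line_count="$(wc -l < "$demo_file" | tr -d ' ')"
echo "lines: $line_count"
```

Note that the `\n` inside the "text" value is an escape sequence within a JSON string, not a physical line break, so it does not split the document across lines. Sending the file with --data-binary matters as well, because curl's plain -d option strips newlines when it reads data from a file.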
2) Force curl to append a new line to its output

[root@host tmp]$ echo '-w "\n"' >> ~/.curlrc

3) Submit to ES:
[root@host tmp]$ curl -s -XPOST -H 'Content-Type: application/x-ndjson'  https://localhost:9200/_bulk -k -u user:pass --data-binary @data.txt
{"took":4,"errors":false,"items":[{"index":{"_index":"articles","_type":"_doc","_id":"QdsosG0B3nqkAGly3E6t","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":1,"status":201}}]}
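Because the question mentions running this as part of a Docker build, it may help to fail the build when the bulk request reports problems. A bulk request can return a successful HTTP status while individual items fail, in which case the top-level "errors" field is true. The sketch below is an assumption about how such a check could look: `bulk_ok` is a hypothetical helper, and the sample response is a shortened copy of the output above rather than a live reply.

```shell
# Shortened copy of the response shown above (top-level fields plus one trimmed item).
response='{"took":4,"errors":false,"items":[{"index":{"_index":"articles","status":201}}]}'

# Hypothetical check: succeed only when the top-level "errors" flag is false.
# It matches the compact form that the _bulk endpoint returns without ?pretty.
bulk_ok() {
  printf '%s' "$1" | grep -q '"errors":false'
}

if bulk_ok "$response"; then
  bulk_status="ok"
else
  bulk_status="failed"
fi
echo "bulk status: $bulk_status"
```

In a build script, the response would come from capturing the output of the curl command in step 3, and a "failed" status would be followed by a non-zero exit. A JSON-aware tool would be more robust than grep if one is available in the image.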

4) Result:
[root@host tmp]$ curl -XGET -H 'Content-Type: application/x-ndjson'  https://localhost:9200/articles/_search?pretty -k -u user:pass
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "articles",
        "_type" : "_doc",
        "_id" : "QdsosG0B3nqkAGly3E6t",
        "_score" : 1.0,
        "_source" : {
          "title" : "My Article Title",
          "text" : "My article text. \nNext paragraph here."
        }
      }
    ]
  }
}

Regarding elasticsearch - Bulk indexing text fields that contain new lines with cURL, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58300137/
