
python - Creating a DataFrame from Elasticsearch results in Python

Reposted. Author: 行者123. Updated: 2023-12-02 22:47:29

I have query results from Elasticsearch in the following format:

[
    {
        "_index": "product",
        "_type": "_doc",
        "_id": "23234sdf",
        "_score": 2.2295187,
        "_source": {
            "SERP_KEY": "",
            "r_variant_info": "",
            "s_asin": "",
            "pid": "394",
            "r_gtin": "00838128000547",
            "additional_attributes_remarks": "publisher:0|size:0",
            "s_gtin": "",
            "r_category": "",
            "confidence_score": "2.4545",
            "title_match": "45.45"
        }
    },
    {
        "_index": "product",
        "_type": "_doc",
        "_id": "23234sdf",
        "_score": 2.2295187,
        "_source": {
            "SERP_KEY": "",
            "r_variant_info": "",
            "s_asin": "",
            "pid": "394",
            "r_gtin": "00838128000547",
            "additional_attributes_remarks": "publisher:0|size:0",
            "s_gtin": "",
            "r_category": "",
            "confidence_score": "2.4545",
            "title_match": "45.45"
        }
    }
]

I am trying to load the _source fields, together with _id, into a DataFrame.

I tried this:
def fetch_records_from_elasticsearch_index(index, filter_json):
    search_param = prepare_es_body(filter_json_dict=filter_json)
    response = settings.ES.search(index=index, body=search_param, size=10)

    if len(response['hits']['hits']) > 0:
        import pandas as pd

        all_hits = response['hits']['hits']
        # return all_hits
        # export es hits to pandas dataframe
        df = pd.concat(map(pd.DataFrame.from_dict, all_hits), axis=1)['_source'].T

        return df
    else:
        return 0
df contains only the _source fields, but I also want to add the _id field to it.

This is the df output format:
{
    "AdminEdit": [
        "False",
        "False",
        "False",
        "False"
    ],
    "Group": [
        "Grp2",
        "Grp2",
        "Grp2",
        "Grp2"
    ]
}

How can I add _id?

Best answer

There are two ways to solve this:

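  • Direct approach
    import pandas as pd
    # json_normalize flattens nested keys, so the _source fields become
    # columns such as "_source.pid", next to "_id", "_index" and "_score"
    df = pd.json_normalize(all_hits)
  • Improving your existing code
    import pandas as pd
    df = pd.concat(map(pd.DataFrame.from_dict, all_hits), axis=1)['_source'].T
    # Add the _id of each hit as a new column, in the same order as the rows
    df["_id"] = [hit["_id"] for hit in all_hits]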

  • The JSON used:
    all_hits = [
        {
            "_index": "product",
            "_type": "_doc",
            "_id": "23234sdg",
            "_score": 2.2295187,
            "_source": {
                "SERP_KEY": "",
                "r_variant_info": "",
                "s_asin": "",
                "pid": "394",
                "r_gtin": "00838128000547",
                "additional_attributes_remarks": "publisher:0|size:0",
                "s_gtin": "",
                "r_category": "",
                "confidence_score": "2.4545",
                "title_match": "45.45"
            }
        },
        {
            "_index": "product",
            "_type": "_doc",
            "_id": "23234sdf",
            "_score": 2.2295187,
            "_source": {
                "SERP_KEY": "",
                "r_variant_info": "",
                "s_asin": "",
                "pid": "394",
                "r_gtin": "00838128000547",
                "additional_attributes_remarks": "publisher:0|size:0",
                "s_gtin": "",
                "r_category": "",
                "confidence_score": "2.4545",
                "title_match": "45.45"
            }
        },
    ]
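
As a further sketch (not part of the original answer), the two approaches above can be wrapped so that the result has plain _source column names plus an _id column. The helper name hits_to_dataframe and the shortened sample hits are assumptions made for illustration; the code only assumes each hit carries "_id" and "_source", as in the data above.

```python
import pandas as pd

# Hypothetical sample shaped like response["hits"]["hits"];
# the _source fields are shortened from the example data above.
all_hits = [
    {"_index": "product", "_id": "23234sdg", "_score": 2.2295187,
     "_source": {"pid": "394", "r_gtin": "00838128000547", "title_match": "45.45"}},
    {"_index": "product", "_id": "23234sdf", "_score": 2.2295187,
     "_source": {"pid": "394", "r_gtin": "00838128000547", "title_match": "45.45"}},
]


def hits_to_dataframe(hits):
    """Return one row per hit: an _id column followed by the _source fields."""
    if not hits:
        # Return an empty DataFrame rather than 0 so callers always get one type.
        return pd.DataFrame()
    rows = [{"_id": hit["_id"], **hit["_source"]} for hit in hits]
    return pd.DataFrame(rows)


df = hits_to_dataframe(all_hits)

# Alternative: flatten everything with json_normalize, then strip the
# "_source." prefix that it adds to the nested field names.
prefix = "_source."
flat = pd.json_normalize(all_hits)
flat = flat.rename(columns=lambda c: c[len(prefix):] if c.startswith(prefix) else c)
```

The first variant keeps only _id and the _source fields; the second also keeps _index and _score. If a document's _source itself contains a key named "_id", the first variant would overwrite the hit's _id with it, so that case would need separate handling.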

    Regarding "python - Creating a DataFrame from Elasticsearch results in Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62018576/
