
python-3.x - Elasticsearch: tried to parse as object, but found a nested value

Reposted. Author: 行者123. Updated: 2023-12-02 23:08:28

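I am currently working on a project in which data from an earlier processing step is stored in a CSV file, and I would like to try Elasticsearch + Kibana to analyze it. The problem is that one column holds JSON values along with some None values, and I send it using the nested type. To clean up the None values I replaced them with 'null', but I got the following error: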

Tried to parse field as object but found a concrete value
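I think ES does not like a field that can hold either 'NULL' or a nested type. How can I solve this while keeping the idea of a null value so that I can filter on it later? Thanks for your help :)
I am using Python and the eland module to send a pandas DataFrame to ES.
ES version: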
'version': {'number': '7.7.0',
'build_flavor': 'default',
'build_type': 'deb',
'build_hash': '81a1e9eda8e6183f5237786246f6dced26a10eaf',
'build_date': '2020-05-12T02:01:37.602180Z',
'build_snapshot': False,
'lucene_version': '8.5.1',
'minimum_wire_compatibility_version': '6.8.0',
'minimum_index_compatibility_version': '6.0.0-beta1'},
'tagline': 'You Know, for Search'}
Edit
I am using the code excerpt below (Python 3) to send the data; thanks to @Gibbs's answer, it now works:

import json
import logging

import eland
import pandas as pd
from elasticsearch import Elasticsearch

logger = logging.getLogger(__name__)

ES_HOST = 'http://localhost:9200'  # placeholder: not defined in the original excerpt
INDEX_NAME = 'my_index'
DATA_PATH = './data4analysis.csv'


def csv_jsonconverter_todict(field):
    # Empty cell: return a placeholder dict so the column only ever holds objects
    if not field:
        return {'null_value': 'NULL'}
    if "'" in field:  # cleaning if bad json column, ok for me
        field = field.replace("'", '"')
    try:
        return json.loads(field)
    except Exception:
        logger.exception('json.loads(field) failed on field= %s', field)
        raise


def loadNprepare_data(path, sep=';'):
    df = pd.read_csv(path, sep=sep, encoding='cp1252',
                     converters={'ffprobe': csv_jsonconverter_todict})

    # cleaning NaNs to avoid " json_parse_exception Non-standard token 'NaN'"
    df = df.applymap(lambda cell: 'null_value' if pd.isna(cell) or not cell else cell)
    return df


if __name__ == '__main__':
    es_client = Elasticsearch(hosts=[ES_HOST], http_compress=True)

    if es_client.indices.exists(index=INDEX_NAME):
        logger.info(f"deleting '{INDEX_NAME}' index...")
        res = es_client.indices.delete(index=INDEX_NAME)
        logger.info(f"response: '{res}'")

    # since we are running locally, use one shard and no replicas
    request_body = {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0
        }
    }
    logger.info(f"creating '{INDEX_NAME}' index...")
    res = es_client.indices.create(index=INDEX_NAME, body=request_body)
    logger.info(f" response: '{res}'")

    logger.info("Sending data to ES")

    data = loadNprepare_data(DATA_PATH)
    try:
        el_df = eland.pandas_to_eland(data, es_client,
                                      es_dest_index=INDEX_NAME,
                                      es_if_exists='replace',
                                      es_type_overrides={'ffprobe': 'nested'})
    except Exception:
        logger.error('Elasticsearch error', exc_info=True)
        raise
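
Before sending anything, it can help to confirm that a column really mixes objects with plain strings, since that mix is what the error message complains about. The sketch below uses plain Python lists as a stand-in for the 'ffprobe' column. The helper names `classify_cell` and `has_mapping_conflict` and the sample values are hypothetical and not part of eland, pandas, or Elasticsearch.

```python
def classify_cell(value):
    """Return 'null' for None, 'object' for dicts, 'concrete' for anything else."""
    if value is None:
        return "null"
    if isinstance(value, dict):
        return "object"
    return "concrete"


def has_mapping_conflict(column_values):
    """True when a column mixes dict values with concrete (scalar) values."""
    kinds = {classify_cell(v) for v in column_values}
    return "object" in kinds and "concrete" in kinds


# Hypothetical samples of the 'ffprobe' column
broken = [{"codec": "h264"}, "null"]                      # dict next to the string 'null'
with_none = [{"codec": "h264"}, None]                     # a real null instead of a string
with_placeholder = [{"codec": "h264"}, {"null_value": "NULL"}]  # placeholder dict, as in the converter above

for name, column in [("broken", broken),
                     ("with_none", with_none),
                     ("with_placeholder", with_placeholder)]:
    print(name, "conflict:", has_mapping_conflict(column))
```

Only the first sample mixes a dict with a string, which is the situation that produced the error. The other two keep the column consistent, either by using a real null or by always sending an object.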

Best answer

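The problem is that you defined a type for the column and are then trying to insert the string 'null' into it.
A field cannot hold two different types. If you follow the Elasticsearch documentation on null_value, null values will be accepted: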

A null value cannot be indexed or searched. When a field is set to null, (or an empty array or an array of null values) it is treated as though that field has no values.



The null_value parameter allows you to replace explicit null values with the specified value so that it can be indexed and searched
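
As a rough illustration of the quoted passage, the sketch below builds an index body in which a keyword field declares a `null_value`, plus a term query that filters on the placeholder. The field name `status_code`, the placeholder `"NULL"`, and the helper names are assumptions chosen for illustration. As far as I know, `null_value` is set on concrete field types such as keyword and must match the field's data type, so it would go on a leaf field rather than on the nested field itself.

```python
def build_index_body(field_name, placeholder="NULL"):
    """Index settings plus a mapping where explicit nulls are indexed as `placeholder`."""
    return {
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
        "mappings": {
            "properties": {
                field_name: {"type": "keyword", "null_value": placeholder}
            }
        },
    }


def build_null_filter(field_name, placeholder="NULL"):
    """Term query matching documents whose field was indexed from an explicit null."""
    return {"query": {"term": {field_name: placeholder}}}


# Hypothetical field name, not taken from the question's data
body = build_index_body("status_code")
query = build_null_filter("status_code")
print(body["mappings"]["properties"]["status_code"])
print(query)
```

With the client from the question, the body could be passed as `es_client.indices.create(index=INDEX_NAME, body=body)` and the query as `es_client.search(index=INDEX_NAME, body=query)`. Documents would then carry a real null (None in Python) for that field instead of the string 'null'.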

This Q&A on "python-3.x - Elasticsearch: tried to parse as object, but found a nested value" comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/62607672/
