gpt4 book ai didi

elasticsearch - Elasticsearch中的聚合切割字符串而不是全部

转载 作者:行者123 更新时间:2023-12-03 01:52:13 26 4
gpt4 key购买 nike

具有以下简单映射:

curl -XPUT localhost:9200/transaciones/ -d '{
"mappings": {
"ventas": {
"properties": {
"tipo": { "type": "string" },
"cantidad": { "type": "double" }
}
}
}
}'

添加数据:
curl -XPUT localhost:9200/transaciones/ventas/1 -d '{
"tipo": "Ingreso bancario",
"cantidad": 80
}'

curl -XPUT localhost:9200/transaciones/ventas/2 -d '{
"tipo": "Ingreso bancario",
"cantidad": 10
}'

curl -XPUT localhost:9200/transaciones/ventas/3 -d '{
"tipo": "PayPal",
"cantidad": 30
}'

curl -XPUT localhost:9200/transaciones/ventas/4 -d '{
"tipo": "Tarjeta de credito",
"cantidad": 130
}'

curl -XPUT localhost:9200/transaciones/ventas/5 -d '{
"tipo": "Tarjeta de credito",
"cantidad": 130
}'

当我尝试通过以下方式获取 aggs时:
curl -XGET localhost:9200/transaciones/ventas/_search?pretty=true -d '{
"size": 0,
"aggs": {
"tipos_de_venta": {
"terms": {
"field": "tipo"
}
}
}
}'

响应为:
  "took" : 15,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 5,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"tipos_de_venta" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ {
"key" : "bancario",
"doc_count" : 2
}, {
"key" : "credito",
"doc_count" : 2
}, {
"key" : "de",
"doc_count" : 2
}, {
"key" : "ingreso",
"doc_count" : 2
}, {
"key" : "tarjeta",
"doc_count" : 2
}, {
"key" : "paypal",
"doc_count" : 1
} ]
}
}
}

如您所见,它将字符串 Tarjeta de credito切成 Tarjetadecredit
如何在不使用 not_analyzed上的映射 tipo的情况下获取整个字符串?我想要的输出将是 Ingreso bancarioPayPalTarjeta de crédito,响应将是这样的:
 "aggregations" : {
"tipos_de_venta" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ {
"key" : "Ingreso bancario",
"doc_count" : 2
}, {
"key" : "PayPal",
"doc_count" : 1
}, {
"key" : "Tarjeta de credito",
"doc_count" : 2
} ]
}
}

PS:我正在使用ES 2.3.2

最佳答案

这是因为您的tipo字段是经过分析的字符串。正确的方法是创建一个not_analyzed字段,以实现所需的目标:

curl -XPUT localhost:9200/transaciones/_mapping/ventas -d '{
"properties": {
"tipo": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}'

然后,您需要为文档重新索引,最后,您将能够运行此文件并获得所需的结果:
curl -XGET localhost:9200/transaciones/ventas/_search?pretty=true -d '{
"size": 0,
"aggs": {
"tipos_de_venta": {
"terms": {
"field": "tipo.raw"
}
}
}
}'

更新

如果您确实不想创建 not_analyzed字段,则可以使用另一种方法来使用 script terms聚合,但它确实可以 破坏群集的性能
curl -XGET localhost:9200/transaciones/ventas/_search?pretty=true -d '{
"size": 0,
"aggs": {
"tipos_de_venta": {
"terms": {
"script": _source.tipo"
}
}
}
}'

关于elasticsearch - Elasticsearch中的聚合切割字符串而不是全部,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/39448480/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com