python - Aggregating a field in elasticsearch-dsl using Python

Reposted by: 太空狗 · Updated: 2023-10-29 17:39:20

Can someone tell me how to write Python statements that aggregate (sum and count) data about my documents?

Script

from datetime import datetime
from elasticsearch_dsl import DocType, String, Date, Integer
from elasticsearch_dsl.connections import connections

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, Q

# Define a default Elasticsearch client
client = connections.create_connection(hosts=['http://blahblahblah:9200'])

s = Search(using=client, index="attendance")
s = s.execute()

for tag in s.aggregations.per_tag.buckets:
    print (tag.key)

Output

File "/Library/Python/2.7/site-packages/elasticsearch_dsl/utils.py", line 106, in __getattr__
'%r object has no attribute %r' % (self.__class__.__name__, attr_name))
AttributeError: 'Response' object has no attribute 'aggregations'

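What is causing this? Is the "aggregations" keyword wrong? Do I need to import some other package? If documents in the "attendance" index have a field called emailAddress, how would I count which documents have a value for that field?

Best answer

First of all: I notice now that what I wrote here never actually defines an aggregation. The documentation on how to use this is not very readable to me. I will expand on what I wrote above, and I am changing the index name to give a better example.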

from datetime import datetime
from elasticsearch_dsl import DocType, String, Date, Integer
from elasticsearch_dsl.connections import connections

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, Q

# Define a default Elasticsearch client
client = connections.create_connection(hosts=['http://blahblahblah:9200'])

s = Search(using=client, index="airbnb", doc_type="sleep_overs")

# Do not call execute() yet: it returns a Response, and an aggregation
# has to be defined on the Search object before the query runs.

# invalid! You haven't defined an aggregation.
#for tag in s.aggregations.per_tag.buckets:
#    print (tag.key)

# Let's make an aggregation
# 'by_house' is a name you choose, 'terms' is a keyword for the type of aggregator
# 'field' is also a keyword, and 'house_number' is a field in our ES index
s.aggs.bucket('by_house', 'terms', field='house_number', size=0)

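Above, we create one bucket per house number, so each bucket's name is the house number. Elasticsearch (ES) always gives the count of documents that fall into each bucket. size=0 means return all results, because by default ES returns only 10 results (or whatever your developers configured it to). Note that this reflects the ES versions current when this answer was written; from ES 5.0 onward, a terms aggregation no longer accepts a size of 0 and needs an explicit size.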

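# This runs the query.
response = s.execute()

# let's see what's in our results

# Total number of matching documents
print(response.hits.total)
# The list of buckets; each bucket carries its own key and doc_count
# (the terms aggregation itself has no top-level doc_count)
print(response.aggregations.by_house.buckets)

for item in response.aggregations.by_house.buckets:
    print(item.doc_count)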
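
For reference, the loop above walks a structure shaped roughly like the dictionary below. This is only a sketch built on a hand-written sample, not real output: the house numbers and counts are invented, and `buckets_to_counts` is a hypothetical helper that is not part of elasticsearch-dsl. It works on plain dictionaries, that is, on the raw JSON Elasticsearch sends back.

```python
# Hand-written sample of a search response with one terms aggregation.
# The keys and counts are invented for illustration.
sample_response = {
    "hits": {"total": 5, "hits": []},
    "aggregations": {
        "by_house": {
            "buckets": [
                {"key": 12, "doc_count": 3},
                {"key": 7, "doc_count": 2},
            ]
        }
    },
}


def buckets_to_counts(response, agg_name):
    """Map each bucket key to its document count for one terms aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


counts = buckets_to_counts(sample_response, "by_house")
for house_number, doc_count in sorted(counts.items()):
    print("%s: %s" % (house_number, doc_count))
```

The point is that the document count lives on each bucket, next to the bucket's key, rather than on the aggregation as a whole.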

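My earlier mistake was assuming that Elasticsearch queries come with aggregations by default. You define them yourself and then execute the query. The response can then be split up by the aggregators you named.

The CURL equivalent of the above should look like this:
Note: I use SENSE, an Elasticsearch plugin/extension/add-on for Google Chrome. In SENSE you can use // to comment things out.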
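
The question also asked how to count the documents that have a value for emailAddress, which the answer does not cover. One possible approach, offered here as an assumption rather than as the answer author's method, is a `filter` aggregation wrapped around an `exists` query. The sketch below only builds the request body and reads a hand-written sample response; `build_has_field_body`, `read_filter_count`, the aggregation name `has_email` and the count 42 are all made up for illustration.

```python
def build_has_field_body(field, agg_name="has_field"):
    """Build a search body that counts documents with a value for `field`."""
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {
            agg_name: {
                # a filter aggregation yields a single bucket with a doc_count
                "filter": {"exists": {"field": field}}
            }
        },
    }


def read_filter_count(response, agg_name="has_field"):
    """Read the doc_count of a filter aggregation from a raw response dict."""
    return response["aggregations"][agg_name]["doc_count"]


body = build_has_field_body("emailAddress", agg_name="has_email")

# Invented response, shaped like what a filter aggregation returns.
sample_response = {"aggregations": {"has_email": {"doc_count": 42}}}
print(read_filter_count(sample_response, "has_email"))
```

A body like this could be handed to `Search.from_dict`, the same way the answer does with its own body further down.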

POST /airbnb/sleep_overs/_search
{
// the size 0 here actually means to not return any hits, just the aggregation part of the result
    "size": 0,
    "aggs": {
        "by_house": {
            "terms": {
// the size 0 here means to return all results, not just the default 10 results
                "field": "house_number",
                "size": 0
            }
        }
    }
}
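
Plain JSON has no comment syntax, so a body copied out of SENSE with `//` lines in it will not parse with Python's `json` module as it stands. Below is a minimal sketch of one way to clean it up, assuming comments always occupy a whole line, as they do in the request above. `strip_line_comments` is a hypothetical helper, and it would not handle a comment that trails a value on the same line.

```python
import json


def strip_line_comments(text):
    """Drop every line whose first non-blank characters are '//'."""
    kept = [line for line in text.splitlines()
            if not line.strip().startswith("//")]
    return "\n".join(kept)


# The request body from above, with SENSE-style whole-line comments.
sense_body = """
{
// do not return hits, only the aggregation part of the result
    "size": 0,
    "aggs": {
        "by_house": {
            "terms": {
// return all buckets, not just the default 10
                "field": "house_number",
                "size": 0
            }
        }
    }
}
"""

body = json.loads(strip_line_comments(sense_body))
print(body["aggs"]["by_house"]["terms"]["field"])
```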

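Workaround. Someone on the DSL's GitHub told me to forget about translating and just use this method. It is simpler, and you can write the difficult parts in CURL. That is why I call it a workaround.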

from elasticsearch_dsl import Search
from elasticsearch_dsl.connections import connections

# Define a default Elasticsearch client
client = connections.create_connection(hosts=['http://blahblahblah:9200'])

# how simple: we just paste the CURL body here
body = {
    "size": 0,
    "aggs": {
        "by_house": {
            "terms": {
                "field": "house_number",
                "size": 0
            }
        }
    }
}

# Build the Search from the raw body, then attach client, index and doc type
s = Search.from_dict(body)
s = s.using(client)
s = s.index("airbnb")
s = s.doc_type("sleep_overs")
body = s.to_dict()  # optional: inspect the request that will be sent

t = s.execute()

for item in t.aggregations.by_house.buckets:
    # item.key will be the house number
    print("%s %s" % (item.key, item.doc_count))

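Hope this helps. I now design everything in CURL and then use Python statements to pick apart the results and get what I want. This helps with aggregations that have multiple levels (sub-aggregations).

Regarding python - aggregating a field in elasticsearch-dsl using Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29380198/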
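
Since the question asked for both a sum and a count, here is a rough sketch of what a two-level aggregation might look like: a `terms` bucket per house number with a `sum` nested inside it. The field name `nights`, the aggregation names `by_group` and `total`, both helper functions and the sample response are assumptions made for illustration; none of them come from the original answer.

```python
def build_sum_per_bucket_body(group_field, sum_field):
    """Build a terms aggregation with a nested sum aggregation."""
    return {
        "size": 0,  # no hits, only aggregations
        "aggs": {
            "by_group": {
                "terms": {"field": group_field},
                # sub-aggregation: computed once for every bucket
                "aggs": {"total": {"sum": {"field": sum_field}}},
            }
        },
    }


def flatten_buckets(response):
    """Return (key, doc_count, total) tuples from the nested result."""
    buckets = response["aggregations"]["by_group"]["buckets"]
    return [(b["key"], b["doc_count"], b["total"]["value"]) for b in buckets]


body = build_sum_per_bucket_body("house_number", "nights")

# Invented response: each bucket holds its count plus the nested sum.
sample_response = {
    "aggregations": {
        "by_group": {
            "buckets": [
                {"key": 12, "doc_count": 3, "total": {"value": 9.0}},
                {"key": 7, "doc_count": 2, "total": {"value": 4.0}},
            ]
        }
    }
}

for key, doc_count, total in flatten_buckets(sample_response):
    print("%s: %s documents, sum %s" % (key, doc_count, total))
```

Flattening the nested buckets into tuples is one way to "pick apart" the result in Python once the request itself has been worked out in CURL.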
