
elasticsearch - Why does the Elasticsearch snapshot statistic number_of_files differ from the actual number of documents in the index?

Reposted. Author: 行者123. Updated: 2023-12-03 01:56:35

I have an index named traces_v2, aliased as traces, that holds 5M documents.

I ran GET /_snapshot/s3_repository/snapshot_traces_250316/_status, and after two minutes the status was:

{
"snapshots": [
{
"snapshot": "snapshot_traces_250316",
"repository": "s3_repository",
"state": "SUCCESS",
"shards_stats": {
"initializing": 0,
"started": 0,
"finalizing": 0,
"done": 8,
"failed": 0,
"total": 8
},
"stats": {
"number_of_files": 185,
"processed_files": 185,
"total_size_in_bytes": 654459334,
"processed_size_in_bytes": 654459334,
"start_time_in_millis": 1458898771760,
"time_in_millis": 81226
},
"indices": {
"aliases": {
"shards_stats": {
"initializing": 0,
"started": 0,
"finalizing": 0,
"done": 5,
"failed": 0,
"total": 5
},
"stats": {
"number_of_files": 5,
"processed_files": 5,
"total_size_in_bytes": 795,
"processed_size_in_bytes": 795,
"start_time_in_millis": 1458898819263,
"time_in_millis": 1491
},
"shards": {
"0": {
"stage": "DONE",
"stats": {
"number_of_files": 1,
"processed_files": 1,
"total_size_in_bytes": 159,
"processed_size_in_bytes": 159,
"start_time_in_millis": 1458898820308,
"time_in_millis": 110
}
},
"1": {
"stage": "DONE",
"stats": {
"number_of_files": 1,
"processed_files": 1,
"total_size_in_bytes": 159,
"processed_size_in_bytes": 159,
"start_time_in_millis": 1458898820674,
"time_in_millis": 80
}
},
"2": {
"stage": "DONE",
"stats": {
"number_of_files": 1,
"processed_files": 1,
"total_size_in_bytes": 159,
"processed_size_in_bytes": 159,
"start_time_in_millis": 1458898819263,
"time_in_millis": 101
}
},
"3": {
"stage": "DONE",
"stats": {
"number_of_files": 1,
"processed_files": 1,
"total_size_in_bytes": 159,
"processed_size_in_bytes": 159,
"start_time_in_millis": 1458898819617,
"time_in_millis": 108
}
},
"4": {
"stage": "DONE",
"stats": {
"number_of_files": 1,
"processed_files": 1,
"total_size_in_bytes": 159,
"processed_size_in_bytes": 159,
"start_time_in_millis": 1458898819916,
"time_in_millis": 86
}
}
}
},
"traces_v2": {
"shards_stats": {
"initializing": 0,
"started": 0,
"finalizing": 0,
"done": 3,
"failed": 0,
"total": 3
},
"stats": {
"number_of_files": 180,
"processed_files": 180,
"total_size_in_bytes": 654458539,
"processed_size_in_bytes": 654458539,
"start_time_in_millis": 1458898771760,
"time_in_millis": 81226
},
"shards": {
"0": {
"stage": "DONE",
"stats": {
"number_of_files": 58,
"processed_files": 58,
"total_size_in_bytes": 213816982,
"processed_size_in_bytes": 213816982,
"start_time_in_millis": 1458898814476,
"time_in_millis": 38510
}
},
"1": {
"stage": "DONE",
"stats": {
"number_of_files": 55,
"processed_files": 55,
"total_size_in_bytes": 253988996,
"processed_size_in_bytes": 253988996,
"start_time_in_millis": 1458898771760,
"time_in_millis": 47244
}
},
"2": {
"stage": "DONE",
"stats": {
"number_of_files": 67,
"processed_files": 67,
"total_size_in_bytes": 186652561,
"processed_size_in_bytes": 186652561,
"start_time_in_millis": 1458898771760,
"time_in_millis": 42340
}
}
}
}
}
}
]
}

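The state is SUCCESS, but the stats show that only 180 documents were snapshotted (out of more than 5M!). Are those real documents, or some kind of folder that internally contains millions of documents?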

Best answer

An index is stored on disk as physical files. number_of_files is the total number of files that hold the index data, not the number of documents.

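A single file can contain many documents. In your case, for example, each file might hold on average 5M / 180 documents (roughly 27,800), although there is no guarantee that every file contains the same number of documents.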
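To make the arithmetic concrete, here is a minimal Python sketch that computes the averages for traces_v2 using the figures from the question. The constant names and the helper `average_per_file` are made up for illustration, and 5M is the approximate document count stated in the question. The result is only an average: real files differ widely in size and content.

```python
# Figures taken from the question; the document count is approximate.
TOTAL_DOCS = 5_000_000             # "5M documents" in traces_v2
FILES_IN_TRACES_V2 = 180           # stats.number_of_files for traces_v2
BYTES_IN_TRACES_V2 = 654_458_539   # stats.total_size_in_bytes for traces_v2


def average_per_file(total, number_of_files):
    """Return the mean share of `total` per file (hypothetical helper)."""
    if number_of_files <= 0:
        raise ValueError("number_of_files must be positive")
    return total / number_of_files


docs_per_file = average_per_file(TOTAL_DOCS, FILES_IN_TRACES_V2)
bytes_per_file = average_per_file(BYTES_IN_TRACES_V2, FILES_IN_TRACES_V2)

print(f"about {docs_per_file:,.0f} documents per file on average")
print(f"about {bytes_per_file / (1024 * 1024):.1f} MiB per file on average")
```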

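If you look further into the details, the response also contains a per-shard breakdown, i.e. the number of files that hold the data for each individual shard.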
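The per-shard breakdown can be checked against the totals with a short Python sketch. The `status` dictionary below is a trimmed copy of the response from the question that keeps only the number_of_files values. The helper `files_per_index` is a name invented for this example, not part of any Elasticsearch client.

```python
# Trimmed copy of the _status response from the question: only the
# number_of_files values are kept.
status = {
    "snapshots": [{
        "snapshot": "snapshot_traces_250316",
        "stats": {"number_of_files": 185},
        "indices": {
            "aliases": {
                "stats": {"number_of_files": 5},
                # Five shards with one file each.
                "shards": {str(i): {"stats": {"number_of_files": 1}}
                           for i in range(5)},
            },
            "traces_v2": {
                "stats": {"number_of_files": 180},
                "shards": {
                    "0": {"stats": {"number_of_files": 58}},
                    "1": {"stats": {"number_of_files": 55}},
                    "2": {"stats": {"number_of_files": 67}},
                },
            },
        },
    }]
}


def files_per_index(snapshot):
    """Sum the per-shard number_of_files for every index in one snapshot entry."""
    return {
        index_name: sum(shard["stats"]["number_of_files"]
                        for shard in index["shards"].values())
        for index_name, index in snapshot["indices"].items()
    }


snapshot = status["snapshots"][0]
per_index = files_per_index(snapshot)

print(per_index)                # {'aliases': 5, 'traces_v2': 180}
print(sum(per_index.values()))  # 185
```

The shard counts (58 + 55 + 67) add up to the 180 files reported for traces_v2, and the two indices together account for the 185 files reported for the whole snapshot. None of these numbers counts documents.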

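Regarding "elasticsearch - Why does the Elasticsearch snapshot statistic number_of_files differ from the actual number of documents in the index?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36217877/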
