
elasticsearch - Larger index size after Elasticsearch reindex


After running a reindex on a 75 GB index, the new index came out at 79 GB.

Both indices have the same document count (54,123,676) and exactly the same mappings. The original index has 6*2 shards (6 primaries, 1 replica each); the new index has 3*2 shards (3 primaries, 1 replica each).

The original index also carries 75,857 deleted documents that weren't carried over, so we're struggling to understand how it can still be smaller than the new index, let alone by a full 4 GB.
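For context, the operation described above boils down to creating the destination index with the new shard layout and then issuing a single _reindex call; the index names below are placeholders, not the actual ones from this cluster:

PUT /new-index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

POST /_reindex
{
  "source": { "index": "original-index" },
  "dest": { "index": "new-index" }
}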

Original index

{
  "_shards": {
    "total": 12,
    "successful": 12,
    "failed": 0
  },
  "_all": {
    "primaries": {
      "docs": {
        "count": 54123676,
        "deleted": 75857
      },
      "store": {
        "size_in_bytes": 75357819717,
        "throttle_time_in_millis": 0
      },
      ...
      "segments": {
        "count": 6,
        "memory_in_bytes": 173650124,
        "terms_memory_in_bytes": 152493380,
        "stored_fields_memory_in_bytes": 17914688,
        "term_vectors_memory_in_bytes": 0,
        "norms_memory_in_bytes": 79424,
        "points_memory_in_bytes": 2728328,
        "doc_values_memory_in_bytes": 434304,
        "index_writer_memory_in_bytes": 0,
        "version_map_memory_in_bytes": 0,
        "fixed_bit_set_memory_in_bytes": 0,
        "max_unsafe_auto_id_timestamp": -1,
        "file_sizes": {}
      }
      ...

New index

{
  "_shards": {
    "total": 6,
    "successful": 6,
    "failed": 0
  },
  "_all": {
    "primaries": {
      "docs": {
        "count": 54123676,
        "deleted": 0
      },
      "store": {
        "size_in_bytes": 79484557149,
        "throttle_time_in_millis": 0
      },
      ...
      "segments": {
        "count": 3,
        "memory_in_bytes": 166728713,
        "terms_memory_in_bytes": 145815659,
        "stored_fields_memory_in_bytes": 17870464,
        "term_vectors_memory_in_bytes": 0,
        "norms_memory_in_bytes": 37696,
        "points_memory_in_bytes": 2683802,
        "doc_values_memory_in_bytes": 321092,
        "index_writer_memory_in_bytes": 0,
        "version_map_memory_in_bytes": 0,
        "fixed_bit_set_memory_in_bytes": 0,
        "max_unsafe_auto_id_timestamp": -1,
        "file_sizes": {}
      }
      ...
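
(For reference, the two blocks above are the kind of output returned by the indices stats API, e.g. GET /original-index/_stats — the index name again being a placeholder.)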

Any clues?

Best Answer

You should use the force merge feature. Segments are immutable, so Elasticsearch keeps creating new ones and only merges them in the background over time. The request below should help: it merges segments and expunges deleted documents, reclaiming disk space and segment memory. Note that a force merge is a fairly heavy operation, so run it during off-peak hours.
POST /_forcemerge?only_expunge_deletes=true
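
If you only want to merge one index rather than every index on the cluster, you can scope the call to it, and you can check the resulting segment layout with the cat segments API (the index name below is a placeholder):

POST /new-index/_forcemerge?max_num_segments=1
GET /_cat/segments/new-index?v

Passing max_num_segments=1 forces a full merge down to a single segment per shard, which reclaims the most space but is also the most expensive variant.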

Regarding "elasticsearch - Larger index size after Elasticsearch reindex", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54578435/
