
elasticsearch - Elastic Search Master Disaster Recovery

Reposted · Author: 行者123 · Updated: 2023-12-03 00:09:48

We have an Elasticsearch cluster with 5 data nodes and 2 master nodes. The Elasticsearch service on one of the master nodes is always kept disabled, so only one master is ever active at a time. Today, for some reason, the current master node went down. We started the service on the second master node. All the data nodes connected to the new master, and all primary shards were assigned successfully, but none of the replicas were assigned, leaving me with nearly 384 unassigned shards.

What should I do now to get them assigned?

What are the best practices and steps to follow in this situation?

Here is what my http://es-master-node:9200/_settings looks like: http://pastebin.com/mK1QBfP6

When I try to allocate a shard manually, I get the following error:

curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty' -d '{
  "commands": [
    {
      "allocate": {
        "index": "logstash-1970.01.18",
        "shard": 1,
        "node": "node-name",
        "allow_primary": true
      }
    }
  ]
}'
{
  "error" : {
    "root_cause" : [ {
      "type" : "illegal_argument_exception",
      "reason" : "[allocate] allocation of [logstash-1970.01.18][1] on node {node-name}{vrVG4CBbSvubWHOzn2qfQA}{10.100.0.146}{10.100.0.146:9300}{master=false} is not allowed, reason: [YES(allocation disabling is ignored)][NO(more than allowed [85.0%] used disk on node, free: [13.671127301258165%])][YES(shard not primary or relocation disabled)][YES(target node version [2.2.0] is same or newer than source node version [2.2.0])][YES(no allocation awareness enabled)][YES(shard is not allocated to same node or host)][YES(allocation disabling is ignored)][YES(below shard recovery limit of [2])][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(node passes include/exclude/require filters)][YES(primary is already active)]"
    } ],
    "type" : "illegal_argument_exception",
    "reason" : "[allocate] allocation of [logstash-1970.01.18][1] on node {node-name}{vrVG4CBbSvubWHOzn2qfQA}{10.100.0.146}{10.100.0.146:9300}{master=false} is not allowed, reason: [YES(allocation disabling is ignored)][NO(more than allowed [85.0%] used disk on node, free: [13.671127301258165%])][YES(shard not primary or relocation disabled)][YES(target node version [2.2.0] is same or newer than source node version [2.2.0])][YES(no allocation awareness enabled)][YES(shard is not allocated to same node or host)][YES(allocation disabling is ignored)][YES(below shard recovery limit of [2])][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(node passes include/exclude/require filters)][YES(primary is already active)]"
  },
  "status" : 400
}
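Note that the only `NO` in the decider list is the disk threshold: the node has only about 13.7% disk free, below the 15% implied by the default 85% high watermark, so the reroute is rejected no matter which shard is named. A minimal sketch (reusing the same `localhost` endpoint as the failed command) to inspect per-node disk usage and, as a stopgap, raise the watermark until space is freed:

```shell
# Disk usage per node, as seen by the disk-based allocation decider:
curl -s 'http://localhost:9200/_cat/allocation?v'

# Temporarily raise the high watermark from the default 85% so that
# replicas can be placed; revert this once disk space has been freed:
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}'
```

A `transient` setting is used here deliberately: it does not survive a full cluster restart, so a forgotten override cannot silently let the disks fill up later.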

Any help would be greatly appreciated.

Best Answer

So here is what I did to get the unassigned shards allocated:

I spun up 5 new ES data servers and waited for them to join the cluster. Once they had joined, I ran the following script:

#!/bin/bash
# Round-robin the unassigned shards across the new data nodes.
array=(node1 node2 node3 node4 node5)
node_counter=0
length=${#array[@]}
IFS=$'\n'
for line in $(curl -s 'http://ip-address:9200/_cat/shards' | fgrep UNASSIGNED); do
  INDEX=$(echo "$line" | awk '{print $1}')
  SHARD=$(echo "$line" | awk '{print $2}')
  NODE=${array[$node_counter]}
  echo "$NODE"
  curl -XPOST 'http://ip-address:9200/_cluster/reroute' -d '{
    "commands": [
      {
        "allocate": {
          "index": "'"$INDEX"'",
          "shard": '"$SHARD"',
          "node": "'"$NODE"'",
          "allow_primary": true
        }
      }
    ]
  }'
  # Advance the round-robin index, wrapping back to 0 after the last
  # node (the original "% length + 1" ran past the end of the array).
  node_counter=$(( (node_counter + 1) % length ))
done

This assigned the unassigned shards to the new data nodes, and the cluster took about 5 to 6 minutes to recover. It is a hack, though, so a proper answer would still be more meaningful.
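Two caveats on the script above: `allow_primary: true` only matters when forcing an unassigned primary onto a node, where it creates an empty shard and can discard data that still exists elsewhere, so for replica shards it is safer to drop it; and recovery progress can be watched rather than guessed at. A sketch using the same placeholder address as the script:

```shell
# Cluster-level view: unassigned_shards should fall toward 0.
curl -s 'http://ip-address:9200/_cluster/health?pretty'

# Or count the remaining UNASSIGNED rows directly:
curl -s 'http://ip-address:9200/_cat/shards' | grep -c UNASSIGNED
```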

Here are the questions that remain open:
  • The shards were already present on the old nodes, so why wasn't the ES master aware of this?
  • How can we explicitly ask the ES master to scan the existing data nodes and pull information from them (their current state, the replicas they hold, the shards they contain, and so on)?
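On the open questions: the elected master rebuilds its routing table from the cluster state reported by the data nodes, and the decider output in the error above suggests the replicas were not "lost" so much as unplaceable, since every candidate node was over the disk watermark. Two things that can help diagnose and soften this in ES 2.x (the host name is the same placeholder used earlier):

```shell
# Ask why each shard is unassigned (unassigned.reason column):
curl -s 'http://ip-address:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason'

# Give a departed node time to rejoin before its replicas are
# rebuilt elsewhere (per-index setting, applied here to all indices):
curl -XPUT 'http://ip-address:9200/_all/_settings' -d '{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}'
```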
  • Regarding elasticsearch - Elastic Search Master disaster recovery, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41831886/
