
erlang - Multiple nodedown messages in an Erlang cluster

Reposted. Author: 行者123. Updated: 2023-12-02 22:22:05

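I am building a simple gen_server module that monitors events from several remote nodes.

When a remote node registers, the module starts monitoring it with erlang:monitor_node(Node, true). Each node registers only once (confirmed from the logs).

In the gen_server's handle_info/2 callback, the module catches the {nodedown, Node} message and stops monitoring the node with erlang:monitor_node(Node, false). I expect to receive this message exactly once: when the remote node goes down.

While testing the module, I found that when a remote node goes down, hundreds of {nodedown, Node} messages are sent to the gen_server (the count ranges from a few hundred to a few thousand).

Why does monitor_node send multiple messages? How can I prevent this behavior?

Edit: here is (part of) the source code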

    register_node(#node_info{node = NodeName} = NodeInfo) ->
        case mnesia:read(node_info, NodeName) of
            [] ->
                monitor_node(NodeName, true),
                error_logger:info_msg("node ~p registered", [NodeName]);
            [_OldInfo] ->
                error_logger:trace_msg("info of node ~p updated", [NodeName])
        end,
        mnesia:write(NodeInfo).

    handle_cast({register_node, #node_info{} = NodeStatus}, Timer) ->
        case mnesia:transaction(fun register_node/1, [NodeStatus]) of
            {aborted, Reason} ->
                error_logger:warning_msg("transaction register_node failed: ~p", [Reason]);
            _ ->
                ok
        end,
        {noreply, Timer};
    handle_cast({shutdown_node, #node_info{} = NodeStatus}, Timer) ->
        case mnesia:dirty_delete_object(NodeStatus) of
            {aborted, Reason} ->
                error_logger:warning_msg("transaction shutdown_node failed: ~p", [Reason]);
            _ ->
                ok
        end,
        {noreply, Timer};
    handle_cast(Message, Timer) ->
        error_logger:warning_msg("~p: received unknown message ~p", [?MODULE, Message]),
        {noreply, Timer}.

    handle_info({nodedown, Node}, Timer) ->
        monitor_node(Node, false),
        error_logger:info_msg("~p: node ~p down", [?MODULE, Node]),
        mnesia:transaction(fun mnesia:delete/3, [node_info, Node, write]),
        {noreply, Timer};
    handle_info(Message, Timer) ->
        error_logger:warning_msg("~p: received unknown message ~p", [?MODULE, Message]),
        {noreply, Timer}.

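Best answer

You call monitor_node(NodeName, true) **INSIDE** an mnesia transaction.

I think the cause is that monitor_node involves message communication (an I/O operation) internally, so this line does not belong inside a transaction. It may send a 'registered' message to the node in question, and mnesia may also run the transaction fun more than once if it has to restart the transaction, so the monitor can end up being set several times. When the node then goes down, one 'nodedown' message arrives for each of those calls.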

    If a process has made two calls to monitor_node(Node, true) and Node terminates,
    **two nodedown messages are delivered to the process.** If there is no connection
    to Node, there will be an attempt to create one. If this fails, a nodedown
    message is delivered.
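
Because every call to monitor_node(Node, true) produces its own message, a process that has already been monitored several times will find the extra copies waiting in its mailbox. The following is a minimal sketch, not part of the original answer, of a helper that flushes them; the module name nodedown_drain and the function drain_nodedown/1 are hypothetical. It only treats the symptom and does not remove the duplicate monitors.

```erlang
-module(nodedown_drain).
-export([drain_nodedown/1]).

%% Remove every {nodedown, Node} message for the given Node that is
%% already in the caller's mailbox and return how many were removed.
%% Messages for other nodes are left untouched.
drain_nodedown(Node) ->
    drain_nodedown(Node, 0).

drain_nodedown(Node, Count) ->
    receive
        %% Node is already bound, so only messages for this node match.
        {nodedown, Node} ->
            drain_nodedown(Node, Count + 1)
    after 0 ->
        %% Nothing matching is queued right now; stop without blocking.
        Count
    end.
```

Such a helper could be called from handle_info({nodedown, Node}, State) after the first message has been handled, so the remaining copies do not each trigger another mnesia transaction.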

Please move that line out of the transaction, or use only a "case" expression, and try again.

    register_node(#node_info{node = NodeName} = NodeInfo) ->
        case mnesia:read(node_info, NodeName) of
            [] ->
                %% side effect: this call runs inside the mnesia transaction
                monitor_node(NodeName, true),
                error_logger:info_msg("node ~p registered", [NodeName]);
            [_OldInfo] ->
                error_logger:trace_msg("info of node ~p updated", [NodeName])
        end,
        mnesia:write(NodeInfo).

    handle_cast({register_node, #node_info{} = NodeStatus}, Timer) ->
        case mnesia:transaction(fun register_node/1, [NodeStatus]) of
            {aborted, Reason} ->
                error_logger:warning_msg("transaction register_node failed: ~p", [Reason]);
            _ ->
                ok
        end,
        {noreply, Timer};
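
One way to move the call out of the transaction is sketched below. This is an illustration under assumptions, not the original author's code: the module name node_registry_sketch, the function names, and the fields of the node_info record (the post does not show the record definition) are hypothetical. The transaction fun only reads and writes mnesia and reports whether the node is new; the monitor is set afterwards, once the transaction has committed.

```erlang
-module(node_registry_sketch).
-export([register_in_tx/1, register_and_monitor/1]).

%% Assumed record shape; the original post does not show its definition.
-record(node_info, {node, status}).

%% Runs inside mnesia:transaction/2. It only touches mnesia, so it is
%% harmless if mnesia restarts the transaction and runs it again.
register_in_tx(#node_info{node = NodeName} = NodeInfo) ->
    Result = case mnesia:read(node_info, NodeName) of
                 []         -> new;
                 [_OldInfo] -> updated
             end,
    ok = mnesia:write(NodeInfo),
    Result.

%% Runs outside the transaction: the monitor is set at most once per
%% call, and only after the transaction has committed.
register_and_monitor(#node_info{node = NodeName} = NodeInfo) ->
    case mnesia:transaction(fun register_in_tx/1, [NodeInfo]) of
        {atomic, new} ->
            monitor_node(NodeName, true),
            new;
        {atomic, updated} ->
            updated;
        {aborted, Reason} ->
            {error, Reason}
    end.
```

In the gen_server, handle_cast({register_node, NodeStatus}, Timer) would then call register_and_monitor(NodeStatus) and log according to the returned value.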

Explanation of side effects in mnesia transactions:

Mnesia dynamically sets and releases locks as transactions execute, therefore, it is very dangerous to execute code with transaction side-effects. In particular, a receive statement inside a transaction can lead to a situation where the transaction hangs and never returns, which in turn can cause locks not to release. This situation could bring the whole system to a standstill since other transactions which execute in other processes, or on other nodes, are forced to wait for the defective transaction.
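
Since side effects belong outside the transaction, the monitoring process can also keep its own record of which nodes it already monitors, independent of mnesia. The following is a minimal sketch of that idea; the module monitor_once and its functions are hypothetical, and it assumes the set of monitored nodes is carried in the gen_server state (the original code keeps only Timer there).

```erlang
-module(monitor_once).
-export([ensure_monitored/2, forget/2]).

%% Monitored is a set (from the sets module) holding the nodes this
%% process already monitors. monitor_node/2 is called only for nodes
%% that are not yet in the set, so each node is monitored at most once.
ensure_monitored(Node, Monitored) ->
    case sets:is_element(Node, Monitored) of
        true ->
            Monitored;
        false ->
            monitor_node(Node, true),
            sets:add_element(Node, Monitored)
    end.

%% Call this when {nodedown, Node} has been handled, so that a later
%% registration of the same node sets up a fresh monitor.
forget(Node, Monitored) ->
    sets:del_element(Node, Monitored).
```

With this in place, registering a node that is already monitored leaves the set unchanged, and handle_info({nodedown, Node}, State) would call forget/2 on the set stored in the state.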

Regarding erlang - multiple nodedown messages in an Erlang cluster, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13577253/
