gpt4 book ai didi

java - 使用 Elastic Cloud/Found 随机断开与主节点 NoNodeAvailableException 的连接

转载 作者:塔克拉玛干 更新时间:2023-11-03 05:24:43 33 4
gpt4 key购买 nike

我正在使用带防护罩和传输 Java 客户端的弹性云(以前发现的)。与 ES 通信的应用程序运行在 heroku 上。我正在使用一个节点在暂存环境中运行压力测试

{
"cluster_name": ...,
"status": "yellow",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 19,
"active_shards": 19,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 7,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0
}

一开始一切都很完美。但是一段时间后(3-4 分钟)我开始出现一些错误。我已将日志级别设置为跟踪,这些是我遇到的错误(我已将所有不相关的内容替换为 ...

org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes were available: [[...][...][...][inet[...]]{logical_availability_zone=..., availability_zone=..., max_local_storage_nodes=1, region=..., master=true}]
at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:242)
at org.elasticsearch.action.TransportActionNodeProxy$1.handleException(TransportActionNodeProxy.java:78)
at org.elasticsearch.transport.TransportService$3.run(TransportService.java:290)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.transport.SendRequestTransportException: [...][inet[...]][indices:data/read/search]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:286)
at org.elasticsearch.shield.transport.ShieldClientTransportService.sendRequest(ShieldClientTransportService.java:41)
at org.elasticsearch.action.TransportActionNodeProxy.execute(TransportActionNodeProxy.java:57)
at org.elasticsearch.client.transport.support.InternalTransportClient$1.doWithNode(InternalTransportClient.java:109)
at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:205)
at org.elasticsearch.client.transport.support.InternalTransportClient.execute(InternalTransportClient.java:106)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:334)
at org.elasticsearch.client.transport.TransportClient.search(TransportClient.java:416)
at org.elasticsearch.action.search.SearchRequestBuilder.doExecute(SearchRequestBuilder.java:1122)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:91)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:65)
...
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [...][inet[...]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:936)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:629)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:276)
...

这些是我的属性

  settings = ImmutableSettings.settingsBuilder()
.put("client.transport.nodes_sampler_interval", "5s") //Tried it with 30s, same outcome
.put("client.transport.ping_timeout", "30s")
.put("cluster.name", clusterName)
.put("action.bulk.compress", false)
.put("shield.transport.ssl", true)
.put("request.headers.X-Found-Cluster", clusterName)
.put("shield.user", user + ":" + password)
.put("transport.ping_schedule", "1s") //Tried with 5s, same outcome
.build();

我还为我进行的每个查询设置了:

max_query_response_size=100000
timeout_seconds=30

我正在使用 ElasticSearch 1.7.2Shield 1.3.2 以及相应的(相同版本)客户端,Java 1.8.0_65我的机器 - Java 1.8.0_40 在节点上。

我在没有压力测试的情况下遇到了同样的错误,但这些错误是随机发生的,所以我想重现。这就是我在单个节点中运行它的原因。

我在日志中发现了另一个错误

2016-03-07 23:35:52,177 DEBUG [elasticsearch[Vermin][transport_client_worker][T#7]{New I/O worker #16}] ssl.SslHandler (NettyInternalESLogger.java:debug(63)) - Swallowing an exception raised while writing non-app data
java.nio.channels.ClosedChannelException
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:433)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:373)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

热门话题

0.0% (111.6micros out of 500ms) cpu usage by thread 'elasticsearch[...][transport_client_timer][T#1]{Hashed wheel timer #1}'
10/10 snapshots sharing following 5 elements
java.lang.Thread.sleep(Native Method)
org.elasticsearch.common.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:445)
org.elasticsearch.common.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:364)
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
java.lang.Thread.run(Thread.java:745)

看完这篇http://blog.trifork.com/2015/04/08/dealing-with-nodenotavailableexceptions-in-elasticsearch/我开始更好地理解整个通信是如何工作的。我还没有测试过这个,但我相信问题就在那里。问题是,即使我确认问题是关闭的查询连接,我该如何处理呢?保持配置不变,然后重新连接?我是否禁用 keepAlive?如果是,我应该担心其他事情吗?

最佳答案

引用此链接: https://discuss.elastic.co/t/nonodeavailableexception-with-java-transport-client/37702通过 Konrad Beiske

your application could be resolving the ip address at boot time. The ELB can change ip's at any point in time. For the best reliability your application should add all ip's of the ELB to the client and periodically check the DNS service for changes.

The connection timeout of our ELB's are 5 minutes.

以下应该可以帮助您修复它:

Creating a new TransportClient for every request is not ideal as it will imply a new connection handshake for every request and this will hurt your response time. You could have a pool of TransportClients if you prefer, but it will most likely be an unnecessary overhead as the client is thread safe.

My suggestion is that you create a small singleton service that periodically checks for changes to the DNS service and adds any new ip's to your existing transport client. In theory it could be as naive as just adding all ip's discovered every time it checks as the transport client will discard duplicate addresses and also purges old addresses no longer reachable.

关于java - 使用 Elastic Cloud/Found 随机断开与主节点 NoNodeAvailableException 的连接,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/35854877/

33 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com