
keycloak - Relaying Infinispan remote exceptions generates excessive network traffic


When our Infinispan cluster (version 9.4.8.Final) encounters an exception, the node that hit the exception sends it to the other nodes in the cluster. This appears to be by design.

This behavior can generate excessive traffic, which causes timeout exceptions, which in turn make nodes want to relay their timeout exceptions to the other nodes. In production, our 3-node Infinispan cluster completely saturated a 20 Gb/s link.

For example, on a 2-node QA cluster we observed the following:

Node 1:

ISPN000476: Timed out waiting for responses for request 7861 from node2

Node 2:

ISPN000217: Received exception from node1, see cause for remote stack trace

Further down the stack trace printed on node 2, we see:

Timed out waiting for responses for request 7861 from node2

There were many of these. We captured packets during one of these episodes and could see 50 KB packets containing lists of remote errors, each with its entire Java stack trace.

When this happens, it becomes a kind of "perfect storm": every timeout generates an error that is sent over the network, which adds congestion, which causes more timeouts. From there, things go downhill very quickly.

I know I need to fix the timeouts themselves: look for GC pauses and the like, and possibly consider increasing the timeout values. However, I would like to know whether there is a way to stop this behavior while these events are occurring. When you think about it, it seems odd that node 1 times out talking to node 2 and then sends a copy of that error over the network to node 2, telling it "I timed out talking to you."
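
For reference, a minimal sketch of what raising the remote-call timeout could look like, assuming the caches live in the Wildfly infinispan subsystem (the cache name and the 30-second value here are illustrative, not settings from our deployment):

<cache-container name="keycloak">
    <transport lock-timeout="60000"/>
    <!-- remote-timeout (ms) bounds how long a SYNC remote call waits before
         failing with ISPN000476; the subsystem default is 17500 -->
    <distributed-cache name="sessions" remote-timeout="30000"/>
</cache-container>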

Is there any way to avoid transmitting these remote stack traces? Any insight or advice would be much appreciated.

EDIT

Sample stack trace:

2019-12-06 11:37:01,587 ERROR [org.keycloak.services.error.KeycloakErrorHandler] (default task-26) Uncaught server error: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from ********, see cause for remote stack trace
at org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:28)
at org.infinispan.remoting.transport.ValidSingleResponseCollector.withException(ValidSingleResponseCollector.java:37)
at org.infinispan.remoting.transport.ValidSingleResponseCollector.addResponse(ValidSingleResponseCollector.java:21)
at org.infinispan.remoting.transport.impl.SingleTargetRequest.receiveResponse(SingleTargetRequest.java:52)
at org.infinispan.remoting.transport.impl.SingleTargetRequest.onResponse(SingleTargetRequest.java:35)
at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1372)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1275)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:126)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1420)
at org.jgroups.JChannel.up(JChannel.java:816)
at org.jgroups.fork.ForkProtocolStack.up(ForkProtocolStack.java:133)
at org.jgroups.stack.Protocol.up(Protocol.java:340)
at org.jgroups.protocols.FORK.up(FORK.java:141)
at org.jgroups.protocols.FRAG3.up(FRAG3.java:171)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:872)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240)
at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1008)
at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:734)
at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:389)
at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:590)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:131)
at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:203)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:253)
at org.jgroups.protocols.MERGE3.up(MERGE3.java:280)
at org.jgroups.protocols.Discovery.up(Discovery.java:295)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1249)
at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.jboss.as.clustering.jgroups.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:52)
at java.lang.Thread.run(Thread.java:745)
Suppressed: org.infinispan.util.logging.TraceException
at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:41)
at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250)
at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1918)
at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1433)
at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:685)
at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:240)
at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
at org.infinispan.cache.impl.EncoderCache.put(EncoderCache.java:195)
at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
at org.keycloak.cluster.infinispan.InfinispanNotificationsManager.notify(InfinispanNotificationsManager.java:155)
at org.keycloak.cluster.infinispan.InfinispanClusterProvider.notify(InfinispanClusterProvider.java:130)
at org.keycloak.models.cache.infinispan.CacheManager.sendInvalidationEvents(CacheManager.java:206)
at org.keycloak.models.cache.infinispan.UserCacheSession.runInvalidations(UserCacheSession.java:140)
at org.keycloak.models.cache.infinispan.UserCacheSession$1.commit(UserCacheSession.java:152)
at org.keycloak.services.DefaultKeycloakTransactionManager.commit(DefaultKeycloakTransactionManager.java:146)
at org.keycloak.services.resources.admin.UsersResource.createUser(UsersResource.java:125)
at sun.reflect.GeneratedMethodAccessor487.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:139)
at org.jboss.resteasy.core.ResourceMethodInvoker.internalInvokeOnTarget(ResourceMethodInvoker.java:510)
at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTargetAfterFilter(ResourceMethodInvoker.java:400)
at org.jboss.resteasy.core.ResourceMethodInvoker.lambda$invokeOnTarget$0(ResourceMethodInvoker.java:364)
at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355)
at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:366)
at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:338)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:137)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:106)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:132)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:106)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:132)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:100)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:439)
at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:229)
at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:135)
at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355)
at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:138)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:215)
at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:227)
at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:56)
at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:51)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:791)
at io.undertow.servlet.handlers.ServletHandler.handleRequest(ServletHandler.java:74)
at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:129)
at org.keycloak.services.filters.KeycloakSessionServletFilter.doFilter(KeycloakSessionServletFilter.java:90)
at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
at io.undertow.servlet.handlers.FilterHandler.handleRequest(FilterHandler.java:84)
at io.undertow.servlet.handlers.security.ServletSecurityRoleHandler.handleRequest(ServletSecurityRoleHandler.java:62)
at io.undertow.servlet.handlers.ServletChain$1.handleRequest(ServletChain.java:68)
at io.undertow.servlet.handlers.ServletDispatchingHandler.handleRequest(ServletDispatchingHandler.java:36)
at org.wildfly.extension.undertow.security.SecurityContextAssociationHandler.handleRequest(SecurityContextAssociationHandler.java:78)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.servlet.handlers.security.SSLInformationAssociationHandler.handleRequest(SSLInformationAssociationHandler.java:132)
at io.undertow.servlet.handlers.security.ServletAuthenticationCallHandler.handleRequest(ServletAuthenticationCallHandler.java:57)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.security.handlers.AbstractConfidentialityHandler.handleRequest(AbstractConfidentialityHandler.java:46)
at io.undertow.servlet.handlers.security.ServletConfidentialityConstraintHandler.handleRequest(ServletConfidentialityConstraintHandler.java:64)
at io.undertow.security.handlers.AuthenticationMechanismsHandler.handleRequest(AuthenticationMechanismsHandler.java:60)
at io.undertow.servlet.handlers.security.CachedAuthenticatedSessionHandler.handleRequest(CachedAuthenticatedSessionHandler.java:77)
at io.undertow.security.handlers.NotificationReceiverHandler.handleRequest(NotificationReceiverHandler.java:50)
at io.undertow.security.handlers.AbstractSecurityContextAssociationHandler.handleRequest(AbstractSecurityContextAssociationHandler.java:43)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at org.wildfly.extension.undertow.security.jacc.JACCContextIdHandler.handleRequest(JACCContextIdHandler.java:61)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at org.wildfly.extension.undertow.deployment.GlobalRequestControllerHandler.handleRequest(GlobalRequestControllerHandler.java:68)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:292)
at io.undertow.servlet.handlers.ServletInitialHandler.access$100(ServletInitialHandler.java:81)
at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:138)
at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:135)
at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:48)
at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:272)
at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:104)
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:364)
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1363)
... 1 more
Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 7865 from ********
at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more

Best Answer

We were able to resolve the issue that was causing the massive spikes in network traffic. Details below.

tl;dr:

We switched from the JGroups UDP stack to the TCP stack, which the ISPN documentation notes can be more efficient for smaller clusters like ours that use distributed caches.

Reproducing the problem

To reproduce the issue, we did the following:

  • Configured the Keycloak background job that clears expired ISPN cache entries to run every 60 seconds, so that we would not have to wait for its default 15-minute interval (standalone-ha.xml):

    <scheduled-task-interval>60</scheduled-task-interval>

  • Generated a large number of user sessions (we used JMeter). In our tests, we ended up generating roughly 100,000 sessions.

  • Configured the SSO Idle and Max time TTLs in Keycloak to be extremely short (60 seconds) so that all the sessions would expire
  • Kept load on the system with JMeter (this part is important)

When the cache-clearing job ran, we saw the network flooded with traffic (120 MB/s). While this was happening, we saw large numbers of the following errors on every node in the cluster:

ISPN000476: Timed out waiting for responses for request 7861 from node2

ISPN000217: Received exception from node1, see cause for remote stack trace

Pro tip: configure passivation to a file store to persist your ISPN data. Shut the cluster down and save the ".dat" files elsewhere, then use those files to instantly restore the ISPN cluster's state between tests.
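
As a sketch (the cache name and path here are illustrative, not our exact settings), a file store with passivation in the Wildfly infinispan subsystem looks roughly like this; with passivation enabled, in-memory entries are written out to the store's ".dat" files on shutdown:

<distributed-cache name="sessions" owners="1">
    <!-- passivation=true writes entries to the store when they leave memory
         (including on shutdown); purge=false keeps the files across restarts -->
    <file-store passivation="true" purge="false" path="ispn" relative-to="jboss.server.data.dir"/>
</distributed-cache>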

Resolving the problem

Using the technique above, we were able to reproduce the issue on demand, so we set about resolving it with the approach described below.

Changing the JGroups stack to use TCP

We changed the JGroups stack from UDP to TCP and also configured TCPPING for discovery. We did this after reading the description of the TCP stack in the following guide:

https://infinispan.org/docs/stable/titles/configuring/configuring.html#preconfigured_jgroups_stacks-configuring

Specifically:

"Uses TCP for transport and UDP multicast for discovery. Suitable for smaller clusters (under 100 nodes) only if you are using distributed caches because TCP is more efficient than UDP as a point-to-point protocol."

This change completely resolved our issue.

Our Wildfly 16 configuration in standalone-ha.xml is as follows:

<subsystem xmlns="urn:jboss:domain:jgroups:6.0">
    <channels default="ee">
        <channel name="ee" stack="tcp" cluster="ejb"/>
    </channels>
    <stacks>
        <stack name="udp">
            <transport type="UDP" socket-binding="jgroups-udp"/>
            <protocol type="PING"/>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="UFC"/>
            <protocol type="MFC"/>
            <protocol type="FRAG3"/>
        </stack>
        <stack name="tcp">
            <transport type="TCP" socket-binding="jgroups-tcp"/>
            <socket-protocol type="TCPPING" socket-binding="jgroups-tcp">
                <property name="initial_hosts">HOST-X[7600],HOST-Y[7600],HOST-Z[7600]</property>
                <property name="port_range">1</property>
            </socket-protocol>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="MFC"/>
            <protocol type="FRAG3"/>
        </stack>
    </stacks>
</subsystem>
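
For reference, the jgroups-tcp socket binding referenced by the transport and TCPPING above is defined in the socket-binding-group; in a stock standalone-ha.xml it looks like this (the interface name may differ in your environment):

<socket-binding name="jgroups-tcp" interface="private" port="7600"/>

Note that the ports in the TCPPING initial_hosts list (7600) must match this binding.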

Tuning the JVM garbage collector parameters

We followed some of the recommendations in the ISPN tuning guide:

https://infinispan.org/docs/stable/titles/tuning/tuning.html

In particular, we switched from JDK 8's default GC to the CMS collector. Specifically, our Wildfly servers are now started with the following JVM arguments:

-Xms6144m -Xmx6144m -Xmn1536M -XX:MetaspaceSize=192M -XX:MaxMetaspaceSize=512m -Djava.net.preferIPv4Stack=true -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+DisableExplicitGC
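
To confirm that GC pauses actually line up with the ISPN000476 timeouts, it can also help to enable GC logging; on JDK 8 the standard flags are the following (the log path is illustrative):

-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/wildfly/gc.log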

Miscellaneous changes

We applied additional changes specific to our environment:

  • Blocked IP multicast + UDP at the iptables level (because we wanted to ensure we supported only TCP)
  • Configured a bandwidth cap at the network level to keep the ISPN cluster from saturating the network and affecting other hosts on the same link (a sketch of both follows this list)
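
As a sketch of both items (the interface, UDP port range, and rate cap below are hypothetical and environment-specific):

# Drop IP multicast on the cluster interface
iptables -A INPUT -i eth0 -m pkttype --pkt-type multicast -j DROP
# Drop inbound UDP on the JGroups port range
iptables -A INPUT -i eth0 -p udp --dport 55200:55400 -j DROP
# Cap egress bandwidth with a token bucket filter
tc qdisc add dev eth0 root tbf rate 5gbit burst 32mb latency 400ms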

The original question and answer can be found on Stack Overflow: https://stackoverflow.com/questions/59236742/
