multithreading - Java Security API synchronized methods cause application threads to hang under high load

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:27:17

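Our backend server has been running embedded Jetty 8.1.15 for several years. It showed no problems until recently, when we started load testing with a large number of concurrent users. We then managed to reproduce the problem with a smaller load as well, using JMeter (HTTP sampler with a pool of 1000 concurrent users and KeepAlive).

Communication between client and server goes over TLS, with connection pooling on the client side:

Client (connection pool) -> TLS -> Server

The behavior we see: at some point, many threads get "stuck" with the following stack trace (each waiting for the monitor of a different method):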

Thread "qtp438123546-99":
at java.security.SecureRandom.nextBytes(byte[ ]) (line: 457)
at sun.security.ssl.RandomCookie.<init>(java.security.SecureRandom) (line: 53)
at sun.security.ssl.ServerHandshaker.clientHello(sun.security.ssl.HandshakeMessage$ClientHello) (line: 522)
at sun.security.ssl.ServerHandshaker.processMessage(byte, int) (line: 213)
at sun.security.ssl.Handshaker.processLoop() (line: 925)
at sun.security.ssl.Handshaker$1.run() (line: 865)
at sun.security.ssl.Handshaker$1.run() (line: 862)
at java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext)
at sun.security.ssl.Handshaker$DelegatedTask.run() (line: 1302)
at org.eclipse.jetty.io.nio.SslConnection.process(org.eclipse.jetty.io.Buffer, org.eclipse.jetty.io.Buffer) (line: 375)
at org.eclipse.jetty.io.nio.SslConnection.access$900(org.eclipse.jetty.io.nio.SslConnection, org.eclipse.jetty.io.Buffer, org.eclipse.jetty.io.Buffer) (line: 48)
at org.eclipse.jetty.io.nio.SslConnection$SslEndPoint.fill(org.eclipse.jetty.io.Buffer) (line: 678)
at org.eclipse.jetty.http.HttpParser.fill() (line: 1044)
at org.eclipse.jetty.http.HttpParser.parseNext() (line: 280)
at org.eclipse.jetty.http.HttpParser.parseAvailable() (line: 235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle() (line: 82)
at org.eclipse.jetty.io.nio.SslConnection.handle() (line: 196)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle() (line: 696)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run() (line: 53)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(java.lang.Runnable) (line: 608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run() (line: 543)
at java.lang.Thread.run() (line: 745)

or sometimes on other synchronized Java Security API methods:

Thread "qtp438123546-993":
at sun.security.ssl.SignatureAndHashAlgorithm.getSupportedAlgorithms(java.security.AlgorithmConstraints) (line: 155)
at sun.security.ssl.Handshaker.getLocalSupportedSignAlgs() (line: 422)
at sun.security.ssl.ServerHandshaker.clientHello(sun.security.ssl.HandshakeMessage$ClientHello) (line: 700)
at sun.security.ssl.ServerHandshaker.processMessage(byte, int) (line: 213)
at sun.security.ssl.Handshaker.processLoop() (line: 925)
at sun.security.ssl.Handshaker$1.run() (line: 865)
at sun.security.ssl.Handshaker$1.run() (line: 862)
at java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext)
at sun.security.ssl.Handshaker$DelegatedTask.run() (line: 1302)
at org.eclipse.jetty.io.nio.SslConnection.process(org.eclipse.jetty.io.Buffer, org.eclipse.jetty.io.Buffer) (line: 375)
at org.eclipse.jetty.io.nio.SslConnection.access$900(org.eclipse.jetty.io.nio.SslConnection, org.eclipse.jetty.io.Buffer, org.eclipse.jetty.io.Buffer) (line: 48)
at org.eclipse.jetty.io.nio.SslConnection$SslEndPoint.fill(org.eclipse.jetty.io.Buffer) (line: 678)
at org.eclipse.jetty.http.HttpParser.fill() (line: 1044)
at org.eclipse.jetty.http.HttpParser.parseNext() (line: 280)
at org.eclipse.jetty.http.HttpParser.parseAvailable() (line: 235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle() (line: 82)
at org.eclipse.jetty.io.nio.SslConnection.handle() (line: 196)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle() (line: 696)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run() (line: 53)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(java.lang.Runnable) (line: 608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run() (line: 543)
at java.lang.Thread.run() (line: 745)
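
To confirm from inside the JVM that threads like the ones above are blocked on a monitor, and to see which thread currently owns it, the standard `ThreadMXBean` API can be queried. The following is a minimal sketch, not part of the original post; the class name `BlockedThreadReport` and its method are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the original post): lists threads that are
// BLOCKED waiting to enter a synchronized method or block.
public class BlockedThreadReport {

    /** Returns one line per thread that is currently in state BLOCKED. */
    public static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // Both flags false: skip locked-monitor/synchronizer details to keep the dump cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName()
                        + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> blocked = blockedThreads();
        System.out.println(blocked.size() + " blocked thread(s)");
        blocked.forEach(System.out::println);
    }
}
```

If most `qtp…` threads report the same lock and the same owner, the owner's own stack trace is probably the more useful one to examine.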

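When everything works normally, before the problem appears, there is a pool of persistent connections between client and server (visible in netstat). Once the problem appears, there are many connections in states other than ESTABLISHED: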

On the client side:

tcp6       0      0 client-host:39962       server-host:443         FIN_WAIT2
tcp6       0      1 client-host:45266       server-host:443         SYN_SENT
tcp6       0      0 client-host:46234       server-host:443         FIN_WAIT2
tcp6       0      0 client-host:38892       server-host:443         FIN_WAIT2
tcp6       0      0 client-host:39160       server-host:443         FIN_WAIT2
tcp6       0      0 client-host:39188       server-host:443         FIN_WAIT2
tcp6       0      1 client-host:43496       server-host:443         SYN_SENT
tcp6       0      0 client-host:46122       server-host:443         FIN_WAIT2
tcp6       0      1 client-host:44938       server-host:443         SYN_SENT
tcp6       0      0 client-host:46446       server-host:443         ESTABLISHED

On the server side:

tcp        0   2980 server-host:443         client-host-1:34964       LAST_ACK
tcp        0   2980 server-host:443         client-host-3:52430       LAST_ACK
tcp        0   2980 server-host:443         client-host-1:35922       LAST_ACK
tcp        0      0 server-host:443         client-host-1:38362       CLOSE_WAIT
tcp      236      0 server-host:443         client-host-3:58296       CLOSE_WAIT
tcp        0   2980 server-host:443         client-host-1:34980       LAST_ACK
tcp        0   2980 server-host:443         client-host-2:55748       LAST_ACK
tcp        0   2980 server-host:443         client-host-3:53376       LAST_ACK
tcp        0      0 server-host:443         client-host-1:40104       SYN_RECV
tcp        0      0 server-host:443         client-host-1:38718       CLOSE_WAIT
tcp        0   2980 server-host:443         client-host-2:54142       LAST_ACK
tcp        0   2980 server-host:443         client-host-1:50766       LAST_ACK
tcp        0      0 server-host:443         client-host-1:38604       CLOSE_WAIT
tcp      236      0 server-host:443         client-host-3:57604       CLOSE_WAIT
tcp        0   2980 server-host:443         client-host-2:55502       LAST_ACK
tcp        0   2980 server-host:443         client-host-2:58254       LAST_ACK
tcp        0   2980 server-host:443         client-host-1:38042       LAST_ACK
tcp        0      0 server-host:443         client-host-1:38222       CLOSE_WAIT
tcp        0   2980 server-host:443         client-host-3:47812       LAST_ACK
tcp        0   2980 server-host:443         client-host-1:60532       LAST_ACK
tcp        0   2980 server-host:443         client-host-2:54282       LAST_ACK
tcp        0      0 server-host:443         client-host-1:40978       SYN_RECV

Almost all threads are unresponsive, CPU usage is high, and the GC is running constantly.

We have also set the following JVM flag:

-Djava.security.egd=file:/dev/./urandom

so that SecureRandom is non-blocking (as opposed to reading from /dev/random).
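
Whether that flag is actually in effect in the running process can be checked from code. The sketch below is an illustration rather than part of the original post (the class name `SecureRandomConfig` is hypothetical); it prints the system property, the `securerandom.source` security property, and the algorithm and provider that a default `SecureRandom` resolves to.

```java
import java.security.SecureRandom;
import java.security.Security;

// Hypothetical helper (not from the original post): reports which entropy
// settings and which SecureRandom implementation the running JVM uses.
public class SecureRandomConfig {

    public static String describe() {
        SecureRandom rnd = new SecureRandom(); // default implementation for this JVM
        return "java.security.egd=" + System.getProperty("java.security.egd")
                + ", securerandom.source=" + Security.getProperty("securerandom.source")
                + ", algorithm=" + rnd.getAlgorithm()
                + ", provider=" + rnd.getProvider().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

A `null` value for `java.security.egd` would suggest that the flag never reached the JVM, for example because a launcher script dropped it.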

java version "1.8.0_05"
Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)

Kernel: 4.14.94-89.73.amzn2.x86_64 (the problem also appears on systems running kernel 2.6.32-696.20.1.el6.x86_64)

Limits:

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 151551
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 16384
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

When the component enters this state, it is unresponsive and nothing is written to the log files (we use log4j2, by the way).

After the load on the component stops, it takes several minutes for it to recover and respond again.

Has anyone seen similar behavior in a Java backend component? Please comment or suggest directions for investigation and/or solutions.

Best answer

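SecureRandom tries to read random bytes from a random source provided by the operating system (e.g. /dev/random), but this operation may hang if the system does not have enough entropy available.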
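
One way to check whether SecureRandom really is slow on a given host is to time it directly, outside Jetty. The sketch below is an assumption-laden illustration, not part of the original answer: the class `SecureRandomTiming`, the 32-byte buffer size, and the thread and call counts are arbitrary choices. It shares one instance across several threads, which roughly mimics many concurrent TLS handshakes drawing from the same generator.

```java
import java.security.SecureRandom;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical micro-check (not from the original answer): times concurrent
// nextBytes calls on a single shared SecureRandom instance.
public class SecureRandomTiming {

    /** Runs `threads` workers, each calling nextBytes `calls` times; returns elapsed milliseconds. */
    public static long timeSharedInstance(int threads, int calls) {
        SecureRandom shared = new SecureRandom();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                byte[] buf = new byte[32]; // arbitrary size, chosen for illustration
                for (int i = 0; i < calls; i++) {
                    shared.nextBytes(buf);
                }
            });
        }
        pool.shutdown(); // accept no new tasks; the submitted workers keep running
        try {
            if (!pool.awaitTermination(5, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("SecureRandom calls did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("elapsed ms: " + timeSharedInstance(8, 10_000));
    }
}
```

Comparing the elapsed time with and without `-Djava.security.egd=file:/dev/./urandom` gives a rough idea of how much the entropy source matters. If the numbers are small in both cases, the bottleneck is likely elsewhere.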

The workaround is to use a different random source that does not block (e.g. /dev/urandom). This can be configured by updating $JAVA_HOME/jre/lib/security/java.security:

securerandom.source=file:/dev/./urandom

Another option is to install Haveged, which can speed up reads from /dev/random.

This bug report contains more details: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6521844

Regarding "multithreading - Java Security API synchronized methods cause application threads to hang under high load", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56012478/
