gpt4 book ai didi

java - 混合使用 Log4j 和 commons-logging 会导致 "class loading deadlock"?

转载 作者:塔克拉玛干 更新时间:2023-11-03 02:57:58 25 4
gpt4 key购买 nike

我想我发现了一种情况,其中 log4j a) 直接混合使用和 b) 通过 commons-logging 混合使用会导致某种类加载死锁。我不确定这种情况是否可能发生(JVM 不应该检测到这种情况吗?)以及如何应对。

问题

在我们的构建系统中,我们目前正在按顺序运行我们的单元测试 - 为了加快构建速度,我们显然可以更改它以并行运行我们的单元测试。但是,如果我们这样做,某些构建会遇到执行超时。在分析此类“挂起构建”的线程转储时,我们发现自己处于不同的模块中,大部分时间涉及不同的测试。但它总是归结为两个尝试初始化 Logger 的线程:一个使用 Logger.getLogger(直接使用 log4j),另一个使用 LogFactory.getLog(使用 commons-logging ).

我的分析

所以我们有一个线程(直接使用 log4j 的那个)在这个地方等待:

"pool-1-thread-3" prio=10 tid=0x00007f6528ca6000 nid=0x6f8f in Object.wait() [0x00007f64d9ca6000]
java.lang.Thread.State: RUNNABLE
at org.apache.log4j.LogManager.<clinit>(LogManager.java:82)
at org.apache.log4j.Logger.getLogger(Logger.java:117)
at de.is24.platform.contacts.domain.PhoneNumberFormat.<clinit>(PhoneNumberFormat.java:21)

不幸的是,这是一条相当“拥挤”的线路:

Hierarchy h = new Hierarchy(new RootLogger((Level) Level.DEBUG));

另一个线程(使用 commons-logging)在这里等待:

"pool-1-thread-2" prio=10 tid=0x00007f6528bf9800 nid=0x6f8e in Object.wait() [0x00007f64d9da7000]
java.lang.Thread.State: RUNNABLE
at org.apache.log4j.Priority.<clinit>(Priority.java:45)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:171)
at org.apache.commons.logging.impl.Log4JLogger.class$(Log4JLogger.java:37)
at org.apache.commons.logging.impl.Log4JLogger.<clinit>(Log4JLogger.java:45)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:209)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:351)

这很简单:

final static public Priority FATAL = new Level(FATAL_INT, "FATAL", 0);

所以对我来说,似乎第二个线程正在初始化类 Priority 并等待加载 Level 类。第一个线程可能会尝试加载 Level 类(该行中的其他内容似乎无关),并且随着 Level 类扩展 Priority,等待加载 Priority 类。
我们陷入了僵局。

那么,你能告诉我这个分析是否正确吗?还是我错过了什么?

更新一

我写了一些测试用例,你可以在这里找到它们:https://github.com/sebastiankirsch/classloading

有几个测试用例证明了这个问题:

  1. TestLoadingByClassForName 应该相当快地导致死锁(在我的机器上每三次运行一次)
  2. TestLoadingMixed 表示问题的一个版本被剥离到观察到的堆栈跟踪的最小值;这个失败的频率要低得多(大约是 4 倍)
  3. TestMixedLoggerInstantiation 模仿以下行为:一个类使用 log4j 实例化一个记录器,另一个使用 commons-logging。这里的死锁很难捕捉到,因为涉及的代码更多——我需要添加一个随机 sleep ,这当然需要适应你的机器

这是 TestMixedLoggerInstantiation 测试用例挂起的堆栈跟踪。

Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode):

"UseLog4JLogger" prio=10 tid=0x00007fa1f017d800 nid=0x7bd8 in Object.wait() [0x00007fa1e5ba4000]
java.lang.Thread.State: RUNNABLE
at org.apache.log4j.LogManager.<clinit>(LogManager.java:82)
at org.apache.log4j.Logger.getLogger(Logger.java:117)
at net.tcc.classloading.UseLog4JLogger.run(UseLog4JLogger.java:23)

"UseCommonsLoggingLogFactory" prio=10 tid=0x00007fa1f00e5000 nid=0x7bd7 in Object.wait() [0x00007fa1e5ca4000]
java.lang.Thread.State: RUNNABLE
at org.apache.log4j.Priority.<clinit>(Priority.java:45)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at org.apache.commons.logging.impl.Log4JLogger.class$(Log4JLogger.java:37)
at org.apache.commons.logging.impl.Log4JLogger.<clinit>(Log4JLogger.java:45)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:209)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:351)
at net.tcc.classloading.UseCommonsLoggingLogFactory.run(UseCommonsLoggingLogFactory.java:13)

"ReaderThread" prio=10 tid=0x00007fa1f00ed000 nid=0x7bd6 runnable [0x00007fa1e5da6000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
- locked <0x00000007d7088a00> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.readLine(BufferedReader.java:299)
- locked <0x00000007d7088a00> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner$ReaderThread.run(RemoteTestRunner.java:140)

"Low Memory Detector" daemon prio=10 tid=0x00007fa1f009d800 nid=0x7bd4 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007fa1f009b800 nid=0x7bd3 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007fa1f0098800 nid=0x7bd2 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007fa1f0096800 nid=0x7bd1 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007fa1f007a000 nid=0x7bd0 in Object.wait() [0x00007fa1e6c54000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007d7001300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x00000007d7001300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x00007fa1f0078000 nid=0x7bcf in Object.wait() [0x00007fa1e6d55000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007d70011d8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x00000007d70011d8> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007fa1f000c000 nid=0x7bc5 in Object.wait() [0x00007fa1f50b0000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007d730dfd8> (a net.tcc.classloading.UseCommonsLoggingLogFactory)
at java.lang.Thread.join(Thread.java:1186)
- locked <0x00000007d730dfd8> (a net.tcc.classloading.UseCommonsLoggingLogFactory)
at java.lang.Thread.join(Thread.java:1239)
at net.tcc.classloading.TestMixedLoggerInstantiation.test(TestMixedLoggerInstantiation.java:21)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

"VM Thread" prio=10 tid=0x00007fa1f0071800 nid=0x7bce runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007fa1f001f000 nid=0x7bc6 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007fa1f0021000 nid=0x7bc7 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007fa1f0022800 nid=0x7bc8 runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007fa1f0024800 nid=0x7bc9 runnable

"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007fa1f0026800 nid=0x7bca runnable

"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007fa1f0028000 nid=0x7bcb runnable

"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007fa1f002a000 nid=0x7bcc runnable

"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007fa1f002c000 nid=0x7bcd runnable

"VM Periodic Task Thread" prio=10 tid=0x00007fa1f00a8800 nid=0x7bd5 waiting on condition

JNI global references: 1168

重现死锁

https://github.com/sebastiankirsch/classloading下载代码.
TestLoadingByClassForName 应该很容易为你造成死锁(运行几次即可),这是你的系统/JVM 最终会遇到复杂场景的死锁的先决条件。

现在多次运行 TestMixedLoggerInstantiation。注意平均运行时间,打开 UseLog4JLogger 并将 sleep 定时器总和设置为略小于平均运行时间。这最终会导致死锁,但这种情况很少发生。

因此,像这样在命令行上运行它:

for (( ; ; )) do
testExectution
done

与其手动将testExecution 放在一起,只需在测试中设置一个中断点,调试,键入ps -ef | grep TestMixedLoggerInstantiation 在 shell 中,并复制您的 IDE 使用的命令。

最佳答案

我终于在 Java 语言规范 中找到了答案,特别是在第 12.4.2 Detailed Initialization Procedure 章中.
上面写着

[...]
2) If the Class object for C indicates that initialization is in progress for C by some other thread, then [...] block the current thread until informed that the in-progress initialization has completed, [...]
7) Next, if C is a class rather than an interface, and its superclass SC has not yet been initialized, then recursively perform this entire procedure for SC
10) If the execution of the initializers completes normally, [...] label the Class object for C as fully initialized, notify all waiting threads, [...]

因此观察到的行为与 JLS 指定的完全一致。我仍然对没有办法检测到这种僵局感到困惑。以及为什么线程被标记为 RUNNABLE - 但我想这不是要执行的典型字节码,所以谁知道......

如何解决问题

问题的解决方案是摆脱 commons-logging。作为@Robert Johnson指出,这可以轻松通过使用 org.slf4j:jcl-over-slf4j 来完成。我还检查了 SLF 代码:它没有“利用”不幸的 log4j 设计。

关于java - 混合使用 Log4j 和 commons-logging 会导致 "class loading deadlock"?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/15543521/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com