
.net - What would cause so many unstarted threads?

Reposted. Author: 行者123. Updated: 2023-12-03 12:20:26

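I have run into a strange bug.

My application is a WinForms client that connects to a server over WCF. It references several .NET and C++ modules/DLLs.

For some reason, I call ThreadPool.SetMaxThreads(150, 200) in my code. After running for a few hours, the client gets disconnected from the server.

After debugging with WinDbg, I found that the thread pool had filled up with many strange threads. As a result, no new thread can be created in the thread pool, and I believe WCF cannot create a thread to talk to the server either, which causes the disconnect.

These strange threads look like this: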

                                                                         Lock  
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt
XXXX 3 cb8 0043afd8 1400 Preemptive 00000000:00000000 003f3248 0 Ukn

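According to Yun Jin's WebLog series "Thread, System.Threading.Thread, and !Threads" and the SSCLI 2.0 source code, the most likely way such threads arise is that the CLR creates a new thread in the thread pool and that thread is never resumed.

I want to know why the resume fails, or how one thread, or many threads, could fail to be resumed.

Here are more technical details:

When the CLR creates a new thread in the thread pool, it calls the SetupUnstartedThread method and the CreateNewThread/CreateNewOSThread methods.

After SetupUnstartedThread, the CLR has created a thread like this: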
                                                                         Lock  
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt
XXXX 3 0 0043afd8 1400 Preemptive 00000000:00000000 003f3248 0 Ukn

It has state 0x1400 (TS_Unstarted | TS_WeOwn), no OSID, and no debugger ID (XXXX).
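To make the State column easier to read, here is a minimal sketch that splits a state value into named bits. The flag values are assumptions recalled from the SSCLI 2.0 `threads.h` (only a subset is listed), and `DecodeThreadState` is a hypothetical helper, not part of SOS or the CLR; check the values against the CLR version you are debugging.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Subset of Thread::ThreadState bits. The values are assumptions taken from
// memory of the SSCLI 2.0 threads.h; verify them for your CLR version.
static const std::vector<std::pair<std::uint32_t, const char*>> kFlags = {
    {0x00000020, "TS_LegalToJoin"},
    {0x00000200, "TS_Background"},
    {0x00000400, "TS_Unstarted"},
    {0x00000800, "TS_Dead"},
    {0x00001000, "TS_WeOwn"},
    {0x00002000, "TS_CoInitialized"},
    {0x00004000, "TS_InSTA"},
    {0x00008000, "TS_InMTA"},
    {0x00010000, "TS_ReportDead"},
};

// Returns the names of all known bits set in `state`, joined by " | ".
// Any bits missing from the table are appended as one hex remainder.
std::string DecodeThreadState(std::uint32_t state) {
    std::string out;
    for (const auto& f : kFlags) {
        if (state & f.first) {
            if (!out.empty()) out += " | ";
            out += f.second;
            state &= ~f.first;  // remove the bit we just named
        }
    }
    if (state != 0) {
        char buf[32];
        std::snprintf(buf, sizeof buf, "0x%x (unknown)",
                      static_cast<unsigned>(state));
        if (!out.empty()) out += " | ";
        out += buf;
    }
    return out;
}
```

Calling `DecodeThreadState(0x1400)` yields `TS_Unstarted | TS_WeOwn`, which matches the reading given above.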

After CreateNewThread/CreateNewOSThread, the thread becomes:
                                                                         Lock  
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt
XXXX 3 cb8 0043afd8 1400 Preemptive 00000000:00000000 003f3248 0 Ukn

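It now has an OSID but still no debugger ID (XXXX).

In addition, the thread's ExposedObject field is null.

However, if the thread is resumed successfully, which means ntdll!LdrInitializeThunk has been called, the thread gets a debugger ID (2):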
                                                                         Lock  
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt
2 3 cb8 0043afd8 1400 Preemptive 00000000:00000000 003f3248 0 Ukn

This thread's state differs from the faulty one (which has no debugger ID).

Edit for Thomas W

If the option (c) you mentioned is

(c) a special OS thread in CLR which might run managed code.



then, according to the SSCLI 2.0 source code, if an OS thread wants to run managed code, the CLR calls the SetupThread method, which runs the following code:
// reset any unstarted bits on the thread object
FastInterlockAnd((ULONG *) &pThread->m_State, ~Thread::TS_Unstarted);
FastInterlockOr((ULONG *) &pThread->m_State, Thread::TS_LegalToJoin);

After that, the state is definitely not 0x1400.
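The effect of those two interlocked updates on a state of 0x1400 can be checked with a small sketch. It uses `std::atomic` as a stand-in for `FastInterlockAnd` / `FastInterlockOr`; the flag values are assumptions recalled from the SSCLI 2.0 `threads.h`, and `ApplySetupThreadBits` is a hypothetical name used only for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Assumed flag values (SSCLI 2.0 threads.h); verify for your CLR build.
constexpr std::uint32_t TS_LegalToJoin = 0x00000020;
constexpr std::uint32_t TS_Unstarted   = 0x00000400;
constexpr std::uint32_t TS_WeOwn       = 0x00001000;

// Mimics the two updates SetupThread applies to m_State, using
// std::atomic in place of FastInterlockAnd / FastInterlockOr.
std::uint32_t ApplySetupThreadBits(std::uint32_t initial) {
    std::atomic<std::uint32_t> state{initial};
    state.fetch_and(~TS_Unstarted);  // clear the unstarted bit
    state.fetch_or(TS_LegalToJoin);  // mark the thread as joinable
    return state.load();
}
```

Starting from `TS_Unstarted | TS_WeOwn` (0x1400), the result is 0x1020, so a thread that went through SetupThread could not still show 1400 in `!threads`.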
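None of the strange threads has a corresponding thread in the ~ thread list, so you cannot see them in !runaway either.

Edit 2

Sorry for the late update. The root cause has not been found yet, but I found a workaround: replace .NET Framework 4.0 with .NET Framework 4.5.

The following describes in more detail how I found the workaround.

Earlier, I traced the whole lifetime of these strange threads. As we know, the CLR has a single Gate Thread, which helps monitor the status of completion port threads and worker threads. When my application starts to go wrong, the Gate Thread periodically calls clr!ThreadpoolMgr::CreateWorkerThread, which creates a new CLR thread object and a new OS thread.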
0:004> k
ChildEBP RetAddr
04c8f6f8 6f3ea8ff KERNEL32!CreateThreadStub
04c8f744 6f3ea77b clr!Thread::CreateNewOSThread+0xba
04c8f78c 6f3eabc1 clr!Thread::CreateNewThread+0xa9
04c8f81c 6f4a6aed clr!ThreadpoolMgr::CreateUnimpersonatedThread+0xbb
04c8f83c 6f4a560e clr!ThreadpoolMgr::CreateWorkerThread+0x19
04c8f864 6f4a4457 clr!ThreadpoolMgr::EnsureEnoughWorkersWorking+0x116
04c8f94c 75973c45 clr!ThreadpoolMgr::GateThreadStart+0x431
04c8f958 771a37f5 KERNEL32!BaseThreadInitThunk+0xe
04c8f998 771a37c8 ntdll!__RtlUserThreadStart+0x70
04c8f9b0 00000000 ntdll!_RtlUserThreadStart+0x1b

The new thread looks like this:
                                                                         Lock  
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt
XXXX 3 cb8 0043afd8 1400 Preemptive 00000000:00000000 003f3248 0 Ukn

I guessed that this thread might never be resumed. It turned out I was wrong. Shortly afterwards, the thread called ntdll!LdrInitializeThunk and then ntdll!_RtlUserThreadStart:
0:065> k
ChildEBP RetAddr
1d54f7c0 75973c45 clr!Thread::intermediateThreadProc
1d54f7cc 771a37f5 KERNEL32!BaseThreadInitThunk+0xe
1d54f80c 771a37c8 ntdll!__RtlUserThreadStart+0x70
1d54f824 00000000 ntdll!_RtlUserThreadStart+0x1b

                                                                         Lock  
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt
65 3 cb8 0043afd8 1400 Preemptive 00000000:00000000 003f3248 0 Ukn

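After checking the argument of clr!Thread::intermediateThreadProc, I found that this thread would call clr!ThreadpoolMgr::WorkerThreadStart.

Then the magic happened.
After clr!ThreadpoolMgr::WorkerThreadStart finishes, clr!ThreadStore::RemoveThread should normally be called by the Finalizer thread before the thread is terminated. This time it was not.

There was no clr!ThreadStore::RemoveThread, only: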
0:065> k
ChildEBP RetAddr
1889fb04 7716f73a ntdll!LdrpCallInitRoutine+0x14
1889fba8 7716f63b ntdll!LdrShutdownThread+0xe6
1889fbb8 75973c4c ntdll!RtlExitUserThread+0x2a
1889fbc4 771a37f5 KERNEL32!BaseThreadInitThunk+0x15
1889fc04 771a37c8 ntdll!__RtlUserThreadStart+0x70
1889fc1c 00000000 ntdll!_RtlUserThreadStart+0x1b

So the corresponding OS thread has been destroyed, but the CLR thread object still exists.
                                                                         Lock  
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt
XXXX 3 cb8 0043afd8 1400 Preemptive 00000000:00000000 003f3248 0 Ukn
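One way to spot such leftovers in a dump is to compare the OSID column of `!threads` with the thread IDs listed by `~`. The following is only a sketch under stated assumptions: the rows are pasted in as plain strings, the first three `!threads` columns are debugger ID, managed ID and OSID, and both commands print IDs in the same lowercase hex form. `NativeTid`, `IsHex` and `OrphanedOsids` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Extracts the OS thread ID from one row of the `~` output, e.g.
// "4 Id: b78.27d8 Suspend: 1 Teb: 7efac000 Unfrozen" gives "27d8".
// Returns an empty string when the row has no "Id: pid.tid" pair.
std::string NativeTid(const std::string& row) {
    std::istringstream in(row);
    std::string tok;
    while (in >> tok) {
        if (tok == "Id:" && (in >> tok)) {
            const auto dot = tok.find('.');
            return dot == std::string::npos ? "" : tok.substr(dot + 1);
        }
    }
    return "";
}

// True when `s` is a non-empty string of hex digits (filters header rows).
bool IsHex(const std::string& s) {
    return !s.empty() && std::all_of(s.begin(), s.end(),
        [](unsigned char c) { return std::isxdigit(c) != 0; });
}

// Returns the OSIDs of `!threads` rows that have no matching row in `~`.
std::vector<std::string> OrphanedOsids(
        const std::vector<std::string>& managedRows,
        const std::vector<std::string>& nativeRows) {
    std::set<std::string> alive;
    for (const auto& row : nativeRows) {
        const std::string tid = NativeTid(row);
        if (!tid.empty()) alive.insert(tid);
    }
    std::vector<std::string> orphans;
    for (const auto& row : managedRows) {
        std::istringstream in(row);
        std::string dbgId, managedId, osid;
        if (!(in >> dbgId >> managedId >> osid)) continue;
        if (!IsHex(osid) || osid == "0") continue;  // header, or no OS thread
        if (alive.count(osid) == 0) orphans.push_back(osid);
    }
    return orphans;
}
```

Fed with the `XXXX 3 cb8 ...` row and a `~` list that has no `.cb8` entry, the function reports `cb8` as an OSID whose OS thread is gone.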

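You may ask why the thread's state did not change. For some reason, I did not dig deeper into clr!ThreadpoolMgr::WorkerThreadStart at the time, so I cannot give you a definite answer, but I read the SSCLI 2.0 source code again and made another guess (^_^).
clr!ThreadpoolMgr::WorkerThreadStart calls clr!SetupThreadPoolThreadNoThrow. Here is a snippet of clr!SetupThreadPoolThreadNoThrow: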
EX_TRY
{
    pThread = SetupThreadPoolThread(typeTPThread);
}
EX_CATCH
{
    if (pHR)
    {
        *pHR = GET_EXCEPTION()->GetHR();
    }
}
EX_END_CATCH(SwallowAllExceptions);

Note the SwallowAllExceptions. You can then see that this method calls clr!SetupThreadPoolThread. Here is its snippet:
if (NULL == (pThread = GetThread()))
{
    pThread = SetupInternalThread();
}
if ((pThread != NULL) && ((pThread->m_State & Thread::TS_ThreadPoolThread) == 0))
{

    if (typeTPThread == WorkerThread)
    {
        FastInterlockOr((ULONG *) &pThread->m_State, Thread::TS_ThreadPoolThread | Thread::TS_TPWorkerThread);
    }
    else if (typeTPThread == CompletionPortThread)
    {
        FastInterlockOr ((ULONG *) &pThread->m_State, Thread::TS_ThreadPoolThread | Thread::TS_CompletionPortThread);
    }
    else
    {
        FastInterlockOr((ULONG *) &pThread->m_State, Thread::TS_ThreadPoolThread);
    }
}

So my guess is that if an exception occurs while calling clr!SetupInternalThread, the thread's state never gets a chance to change.
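The guess can be illustrated with a toy model of the control flow. This is not CLR code: `SetupInternalThreadStub` stands in for clr!SetupInternalThread, `kPoolBitsPlaceholder` is a made-up bit rather than a real TS_ value, and a plain `try`/`catch (...)` plays the role of `EX_TRY` / `EX_END_CATCH(SwallowAllExceptions)`.

```cpp
#include <cstdint>
#include <stdexcept>

// Placeholder for the thread-pool bits; NOT a real CLR flag value.
constexpr std::uint32_t kPoolBitsPlaceholder = 0x01000000;

// Stand-in for SetupInternalThread; throws when `fail` is true.
void SetupInternalThreadStub(bool fail) {
    if (fail) throw std::runtime_error("simulated setup failure");
}

// Mirrors the shape of SetupThreadPoolThreadNoThrow: every exception is
// swallowed, so the caller only ever sees the state that was reached.
std::uint32_t SetupNoThrow(std::uint32_t state, bool fail) {
    try {
        SetupInternalThreadStub(fail);
        state |= kPoolBitsPlaceholder;  // only reached when setup succeeds
    } catch (...) {
        // swallowed, like EX_END_CATCH(SwallowAllExceptions)
    }
    return state;
}
```

With `fail` set, `SetupNoThrow(0x1400, true)` returns 0x1400 unchanged and nothing is reported to the caller, which is the situation the guess describes.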

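This was the first time I thought there might be a minor flaw in the .NET Framework that only my application happened to trigger. Meanwhile, a colleague told me he could not reproduce the bug. After checking his environment, I found that he was using .NET Framework 4.5.

So far, after upgrading the .NET Framework, the bug has not occurred again.

Best answer

SSCCE for analyzing the threads

To see how .NET creates managed threads and marks them as XXXX, you can run the following code. Compile the application as a debug build, start WinDbg and run the application under the debugger. At the initial breakpoint, run the following command: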

sxe -c ".loadby sos clr;g" ld clr.dll;.ocommand OCOMMAND;g

The application then more or less debugs itself, and you can watch the threads change.

Step                .NET threads  Unstarted  Dead      Thread objects  Native threads
1 (before started)  2             0          0         1               4
2 (Thread started)  3             1 (XXXX)   0         2               5
3 (Thread running)  3             0          0         3               8
4 (Thread ended)    3             0          1 (XXXX)  2               7
5 (GC ran)          3             0          1 (XXXX)  2               4

SSCCE code:
using System;
using System.Diagnostics;
using System.Threading;

namespace ManagedThreadDebug
{
    class Program
    {
        static void Main()
        {
            InformDebug("Before creating thread object.");

            var t = new Thread(ThreadRun);
            InformDebug("After creating thread object and calling Start().");

            t.Start();
            InformDebug("While thread is running.");

            t.Join();
            InformDebug("After thread was running (GC potentially not run yet).");

            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            Thread.Sleep(10);
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            Thread.Sleep(10);
            InformDebug("After thread was running (GC hopefully ran).");
        }

        private static void ThreadRun()
        {
            Thread.Sleep(1000);
        }

        // Sends a debug string that WinDbg executes as a command via .ocommand.
        private static void InformDebug(string message)
        {
            Console.WriteLine(message);
            Trace.WriteLine("OCOMMAND .echo >>> "+message+";!threads;.echo;!dumpheap -stat -type Thread;.echo;~;g");
        }
    }
}

Almost complete output, shortened for brevity:
>>> Before creating thread object.
ThreadCount: 2
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 0
Lock
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 1074 00441310 2a020 Preemptive 02796F48:00000000 00408378 1 MTA
2 2 1fb8 00411258 2b220 Preemptive 00000000:00000000 00408378 0 MTA (Finalizer)

Statistics:
MT Count TotalSize Class Name
69f02e64 1 52 System.Threading.Thread

. 0 Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
1 Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
2 Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
3 Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen

>>> After creating thread object and calling Start().
ThreadCount: 3
UnstartedThread: 1
BackgroundThread: 1
PendingThread: 0
DeadThread: 0
Lock
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 1074 00441310 2a020 Preemptive 02797334:00000000 00408378 1 MTA
2 2 1fb8 00411258 2b220 Preemptive 00000000:00000000 00408378 0 MTA (Finalizer)
XXXX 3 0 00474900 1400 Preemptive 00000000:00000000 00408378 0 Ukn

Statistics:
MT Count TotalSize Class Name
69f02e64 2 104 System.Threading.Thread

. 0 Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
1 Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
2 Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
3 Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen
4 Id: b78.27d8 Suspend: 1 Teb: 7efac000 Unfrozen

>>> While thread is running.
ThreadCount: 3
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 0
Lock
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 1074 00441310 2a020 Preemptive 02797550:00000000 00408378 1 MTA
2 2 1fb8 00411258 2b220 Preemptive 00000000:00000000 00408378 0 MTA (Finalizer)
6 3 1d04 00474900 2b020 Preemptive 00000000:00000000 00408378 1 MTA

Statistics:
MT Count TotalSize Class Name
69f02e64 2 104 System.Threading.Thread

. 0 Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
1 Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
2 Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
3 Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen
4 Id: b78.27d8 Suspend: 1 Teb: 7efac000 Unfrozen
5 Id: b78.2478 Suspend: 1 Teb: 7efa9000 Unfrozen
6 Id: b78.1d04 Suspend: 1 Teb: 7efa6000 Unfrozen
7 Id: b78.1fdc Suspend: 1 Teb: 7efa3000 Unfrozen

>>> After thread was running (GC potentially not run yet).
ThreadCount: 3
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 1
Lock
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 1074 00441310 2a020 Preemptive 027977FC:00000000 00408378 1 MTA
2 2 1fb8 00411258 2b220 Preemptive 00000000:00000000 00408378 0 MTA (Finalizer)
XXXX 3 0 00474900 39820 Preemptive 00000000:00000000 00408378 0 Ukn

Statistics:
MT Count TotalSize Class Name
69f02e64 2 104 System.Threading.Thread

. 0 Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
1 Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
2 Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
3 Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen
4 Id: b78.27d8 Suspend: 1 Teb: 7efac000 Unfrozen
5 Id: b78.2478 Suspend: 1 Teb: 7efa9000 Unfrozen
7 Id: b78.1fdc Suspend: 1 Teb: 7efa3000 Unfrozen

>>> After thread was running (GC hopefully ran).
ThreadCount: 3
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 1
Lock
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 1074 00441310 2a020 Preemptive 02797380:00000000 00408378 1 MTA
2 2 1fb8 00411258 2b220 Preemptive 00000000:00000000 00408378 0 MTA (Finalizer)
XXXX 3 0 00474900 39820 Preemptive 00000000:00000000 00408378 0 Ukn

Statistics:
MT Count TotalSize Class Name
69f02e64 2 104 System.Threading.Thread

. 0 Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
1 Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
2 Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
3 Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen

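Conclusion

Threads shown as XXXX can be either threads that have not started or threads that are dead. You may not like the answer: unless you show us some code, it is impossible to say where these threads come from. Potential candidates:
  • new Thread() statements in your code
  • use of Parallel.For and similar
  • use of the ThreadPool
  • code in third-party libraries

Debugging thread start and exit

Run the application in WinDbg and break whenever a thread is created or exits:

sxe ct;sxe et

Then look at where this happens and check in particular the code that creates the threads. If that is not specific enough, you can also try breakpoints on the .NET thread methods.

Regarding ".net - What would cause so many unstarted threads?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/24666604/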
