c - Data race with detached pthreads reported by helgrind

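Reposted by: 太空宇宙. Updated: 2023-11-04 11:45:34

I have a larger multithreaded application (proprietary, so it cannot be shared) for which helgrind reports a data race (see the report below). I can't share the software itself, but I have put together some tests that demonstrate the race.

The race as reported for the actual software in question: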

==7746== Possible data race during write of size 1 at 0xAC83697 by thread #4
==7746== Locks held: 2, at addresses 0x583BCD8 0x5846F58
==7746==    at 0x4C3A3CC: mempcpy (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==7746==    by 0x401375F: _dl_allocate_tls_init (dl-tls.c:515)
==7746==    by 0x5053CED: get_cached_stack (allocatestack.c:254)
==7746==    by 0x5053CED: allocate_stack (allocatestack.c:501)
==7746==    by 0x5053CED: pthread_create@@GLIBC_2.2.5 (pthread_create.c:539)
==7746==    by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==7746==    by 0x40BFA6: <redacted symbol names from private project>
==7746==    by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==7746==    by 0x50536B9: start_thread (pthread_create.c:333)
==7746== 
==7746== This conflicts with a previous write of size 1 by thread #10
==7746== Locks held: none
==7746==    at 0x5053622: start_thread (pthread_create.c:265)
==7746==  Address 0xac83697 is in a rw- anonymous segment
==7746==

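This data race appears when the software shuts down a series of threads and then starts some new threads in the same thread pool. Unfortunately I can't provide any of that code, but I believe I have been able to reproduce the problem with a few examples.

I found 3 other questions related to this issue: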

Why does this recursive pthread_create call result in data race?

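The answer to that question is to set up and allocate the stack manually. I don't think that is a viable answer; if it is, can someone explain why?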

Data race during nested thread creation

The answer there did not help.

Data race with detached pthread detected by valgrind

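This one has no answers.

Edit: I have added another (less complicated) example at the bottom of this post that also reproduces the problem.

I was able to rewrite the example given in the first question into a minimal reproducible example. Well, mostly.

On my machine (Ubuntu 16.04.6 LTS), the following code produces the data race shown below about 85% of the time.

Run:

gcc -g ./test.c -o test -lpthread && valgrind --tool=helgrind ./test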

==15656== Possible data race during write of size 1 at 0x5C27697 by thread #4
==15656== Locks held: none
==15656==    at 0x4C3A3CC: mempcpy (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==15656==    by 0x401375F: _dl_allocate_tls_init (dl-tls.c:515)
==15656==    by 0x4E47CED: get_cached_stack (allocatestack.c:254)
==15656==    by 0x4E47CED: allocate_stack (allocatestack.c:501)
==15656==    by 0x4E47CED: pthread_create@@GLIBC_2.2.5 (pthread_create.c:539)
==15656==    by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==15656==    by 0x400832: launch (test3.c:22)
==15656==    by 0x4008FC: threadfn3 (test3.c:48)
==15656==    by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==15656==    by 0x4E476B9: start_thread (pthread_create.c:333)
==15656== 
==15656== This conflicts with a previous write of size 1 by thread #2
==15656== Locks held: none
==15656==    at 0x4E47622: start_thread (pthread_create.c:265)
==15656==  Address 0x5c27697 is in a rw- anonymous segment

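This is the program I built to reproduce the problem. The semaphores are not required, but they seem to greatly increase the chance that the data race occurs.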

#include <semaphore.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>

pthread_t t1;
pthread_t t2;
pthread_t t3;
pthread_t t4;

void *threadfn1(void *p);
void *threadfn2(void *p);
void *threadfn3(void *p);
void *threadfn4(void *p);

sem_t sem;
sem_t sem2;
sem_t sem3;

void launch(pthread_t *t, void *(*fn)(void *), void *arg)
{
    pthread_create(t,NULL,fn,arg);
    pthread_detach(*t);
}

void *threadfn1(void *p)
{
    launch(&t2, threadfn2, NULL);
    printf("1 %p\n", p);
    // notify threadfn3 we are done
    sem_post(&sem);
    return NULL;
}

void *threadfn2(void *p)
{
    launch(&t3, threadfn3, NULL);
    printf("2 %p\n", p);
    // notify threadfn4 we are done
    sem_post(&sem2);
    return NULL;
}

void *threadfn3(void *p)
{
    // wait for threadfn1 to finish
    sem_wait(&sem);
    launch(&t4, threadfn4, NULL);
    // wait for threadfn4 to finish
    sem_wait(&sem3);
    printf("3 %p\n", p);
    return NULL;
}

void *threadfn4(void *p)
{
    // wait for threadfn2 to finish
    sem_wait(&sem2);
    printf("4 %p\n", p);
    // notify threadfn3 we are done
    sem_post(&sem3);
    return NULL;
}

int main()
{
    sem_init(&sem, 0, 0);
    sem_init(&sem2, 0, 0);
    sem_init(&sem3, 0, 0);

    launch(&t1, threadfn1, NULL);
    printf("main\n");
    pthread_exit(NULL);
}

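This seems to be related to threads that finish before their parent, or their parent's parent, finishes. In the end I could not pin down exactly what causes the data race.

It should also be noted that a different data race showed up a few times during my testing. I could not reproduce it reliably, because it only appeared occasionally and for no apparent reason. It is the same as the race listed above, except that the conflicting access shows a longer stack trace than just "start_thread". It looks exactly like the data race reported in the first question linked above, except that the bottom of its trace lists __libc_thread_freeres: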

==15973== Possible data race during write of size 1 at 0x5C27697 by thread #4
==15973== Locks held: none
==15973==    at 0x4C3A3CC: mempcpy (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==15973==    by 0x401375F: _dl_allocate_tls_init (dl-tls.c:515)
==15973==    by 0x4E47CED: get_cached_stack (allocatestack.c:254)
==15973==    by 0x4E47CED: allocate_stack (allocatestack.c:501)
==15973==    by 0x4E47CED: pthread_create@@GLIBC_2.2.5 (pthread_create.c:539)
==15973==    by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==15973==    by 0x400832: launch (test3.c:22)
==15973==    by 0x4008FC: threadfn3 (test3.c:48)
==15973==    by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==15973==    by 0x4E476B9: start_thread (pthread_create.c:333)
==15973== 
==15973== This conflicts with a previous read of size 1 by thread #2
==15973== Locks held: none
==15973==    at 0x51C10B1: res_thread_freeres (in /lib/x86_64-linux-gnu/libc-2.19.so)
==15973==    by 0x51C1061: __libc_thread_freeres (in /lib/x86_64-linux-gnu/libc-2.19.so)
==15973==    by 0x4E45199: start_thread (pthread_create.c:329)
==15973==    by 0x515547C: clone (clone.S:111)

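No, I can't join the threads; that would not work for the software in which we are seeing the problem.

Update: I have been doing some more testing and managed to produce another example that triggers the problem with far less code. Simply launching threads and detaching them in a loop is enough to cause the data race.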

#include <pthread.h>
#include <stdio.h>

// seems we only need 3 threads to cause the problem
#define NUM_THREADS 3

pthread_t t1[NUM_THREADS] = {0};

void launch(pthread_t *t, void *(*fn)(void *), void *arg)
{
    pthread_create(t,NULL,fn,arg);
    pthread_detach(*t);
}

void *threadfn(void *p)
{
    return NULL;
}

int main()
{
    int i = NUM_THREADS;
    while (i-- > 0) {
        launch(t1 + i, threadfn, NULL);
    }
    return 0;
}

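Update 2: I found that launching all of the threads before detaching any of them seems to prevent the race. See the following block of code, which does not produce the race: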

#include <pthread.h>

#define NUM_THREADS 3

pthread_t t1[NUM_THREADS] = {0};

void launch(pthread_t *t, void *(*fn)(void *), void *arg)
{
    pthread_create(t,NULL,fn,arg);
}

void *threadfn(void *p)
{
    return NULL;
}

int main()
{
    int i;
    for (i = 0; i < NUM_THREADS; ++i) {
        launch(t1 + i, threadfn, NULL);
    }
    for (i = 0; i < NUM_THREADS; ++i) {
        pthread_detach(t1[i]);
    }
    pthread_exit(NULL);
}

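If another pthread_create() call is added after any pthread_detach() call, the race reappears. This makes me think it is impossible to call pthread_detach() and then pthread_create() without triggering the data race.

Best answer

In the end I just restructured everything so that I could join my threads. I really don't understand how detached threads are supposed to work without triggering this data race.

Regarding "c - Data race with detached pthreads reported by helgrind", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57877228/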
