concurrency - Different results for N>1 goroutines (on N>1 Cpu:s). Why?


Reposted by: IT王子 | Updated: 2023-10-29 01:43:25

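I have a test program that gives different results when it runs more than one goroutine on more than one Cpu (Goroutines = Cpus). The "test" is about synchronizing goroutines using channels; the program itself counts occurrences of characters in strings. It produces consistent results on one Cpu / one goroutine.

See the code example on the playground (note: run it on a local machine to execute on multiple cores, and watch the resulting numbers vary): http://play.golang.org/p/PT5jeCKgBv .

Code summary: the program counts occurrences of 4 different characters (A, T, G, C) in (DNA) strings.

Problem: the result (the number of occurrences of each character) varies when the program is executed on multiple Cpus (goroutines). Why?

Description:

A goroutine spawns work (SpawnWork) as strings for the Workers. It sets up artificial string input data (hardcoded strings are copied n times).
Goroutine Workers (Worker) are created, one per Cpu.
Each Worker checks every character in a string, counts the A's and T's and sends that sum into one channel, and sends the G and C count into another channel.
SpawnWork closes the work-string channel to control the Workers (they consume strings using range, which exits when the input channel is closed by SpawnWork).
When a Worker has consumed its range (of strings), it sends a quit signal on the quit channel (quit <- true). These "pulses" occur Cpu-count times (Cpu count = goroutine count).
The main (select) loop exits when it has received Cpu-count quit signals.
The main func prints a summary of the occurrences of the characters (A, T's and G, C's).

Simplified code:

1. The "Worker" (goroutines) counting characters in lines: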

func Worker(inCh chan *[]byte, resA chan<- *int, resB chan<- *int, quit chan bool) {
    //for p_ch := range inCh {
    for {
        p_ch, ok := <-inCh // similar to range
        if ok {
            ch := *p_ch
            for i := 0; i < len(ch); i++ {
                if ch[i] == 'A' || ch[i] == 'T' {        // Count A:s and T:s
                    at++
                } else if ch[i] == 'G' || ch[i] == 'C' { // Count G:s and C:s
                    gc++
                }
            }
            resA <- &at  // Send line results on separate channels
            resB <- &gc  // Send line results on separate channels
        } else {
            quit <- true // Indicate that we're all done
            break
        }
    }
}

2. Spawning work (strings) to the workers:

func SpawnWork(inStr chan<- *[]byte, quit chan bool) {
    // Artificial input data
    StringData :=
        "NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN\n" +
        "NTGAGAAATATGCTTTCTACTTTTTTGTTTAATTTGAACTTGAAAACAAAACACACACAA\n" +
        "... etc\n" +
    // ...
    for scanner.Scan() {
        s := scanner.Bytes()
        if len(s) == 0 || s[0] == '>' {
            continue
        } else {
            i++
            inStr <- &s
        }
    }
    close(inStr) // Indicate (to Workers) that there's no more strings coming.
}

3. The main routine:

func main() {
    // Count Cpus, and count down in final select clause
    CpuCnt := runtime.NumCPU() 
    runtime.GOMAXPROCS(CpuCnt)
    // Make channels
    resChA := make(chan *int)
    resChB := make(chan *int)
    quit := make(chan bool)
    inStr := make(chan *[]byte)

    // Set up Workers ( n = Cpu )
    for i := 0; i < CpuCnt; i++ {
        go Worker(inStr, resChA, resChB, quit)
    }
    // Send lines to Workers
    go SpawnWork(inStr, quit)

    // Count the number of "A","T" & "G","C" per line 
    // (comes in here as ints per row, on separate channels (at and gt))
    for {
        select {
        case tmp_at := <-resChA:
            tmp_gc := <-resChB // Ch A and B go in pairs anyway
            A += *tmp_at       // sum of A's and T's
            B += *tmp_gc       // sum of G's and C's
        case <-quit:
            // Each goroutine sends "quit" signals when it's done. Since 
            // the number of goroutines equals the Cpu counter, we count 
            // down each time a goroutine tells us it's done (quit at 0):
            CpuCnt--
            if CpuCnt == 0 { // When all goroutines are done then we're done.
                goto out     
            }
        }
    }
out:
    // Print report to screen
}

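Why does this code count consistently only when it is executed on a single cpu/goroutine? That is, do the channels not synchronize, or does the main loop quit forcefully before all goroutines are done? Scratching my head.

(Again: see/run the full code on the playground: http://play.golang.org/p/PT5jeCKgBv)

// Rolf Lampa

Best answer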

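Here is a working version which consistently produces the same results no matter how many cpus you use.

Here is what I did:

Remove passing of *int - it is very racy to pass a pointer to a counter through a channel!
Remove passing of *[]byte - pointless, as slices are reference types anyway
Copy the slice before putting it in the channel - the slices pointed to the same memory, causing a race
Fix the initialisation of at and gc in Worker - they were in the wrong place - this is the major cause of the difference in results
Use sync.WaitGroup for synchronisation and close() on the channels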
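
The effect of the misplaced counters can be sketched without any goroutines. This is only an illustration under an assumption: that the original at and gc were declared once, outside the per-line loop, so every value the receiver read was a running total rather than a per-line count. The helper names countLine, sumPerLine and sumRunningTotals are hypothetical and do not appear in the original program.

```go
package main

import "fmt"

// countLine returns the A/T and G/C counts for a single line.
// The named results start at zero on every call, so nothing leaks between lines.
func countLine(line []byte) (at, gc int) {
	for _, b := range line {
		switch b {
		case 'A', 'T':
			at++
		case 'G', 'C':
			gc++
		}
	}
	return at, gc
}

// sumPerLine adds up fresh per-line counts (the corrected behaviour).
func sumPerLine(lines [][]byte) (at, gc int) {
	for _, l := range lines {
		a, g := countLine(l)
		at += a
		gc += g
	}
	return at, gc
}

// sumRunningTotals mimics counters that are never reset (assumed original bug):
// the receiver adds a running total for every line and so over-counts.
func sumRunningTotals(lines [][]byte) (at, gc int) {
	runAT, runGC := 0, 0 // declared once, outside the loop
	for _, l := range lines {
		a, g := countLine(l)
		runAT += a
		runGC += g
		at += runAT
		gc += runGC
	}
	return at, gc
}

func main() {
	lines := [][]byte{[]byte("AT"), []byte("AT"), []byte("GC")}
	at, gc := sumPerLine(lines)
	fmt.Println("per-line counters:", at, gc)
	at, gc = sumRunningTotals(lines)
	fmt.Println("running totals:   ", at, gc)
}
```

With several workers, each one would keep its own running totals and the lines would be split between them differently on every run, which is one plausible reason the totals varied.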

I used the -race parameter of go build to find and fix the data races.
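
The race detector is also available through go run -race and go test -race. One kind of race it can point at is the slice aliasing mentioned in the list of fixes: bufio.Scanner's Bytes method returns a slice whose underlying array may be overwritten by a later call to Scan, so a slice handed to another goroutine should be copied first. A minimal sketch of that copy follows; copyLines is a hypothetical helper, not part of the original program.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// copyLines scans input line by line and returns an independent copy of
// every non-empty line, so later Scan calls cannot change earlier results.
func copyLines(input string) [][]byte {
	var lines [][]byte
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		s := scanner.Bytes() // may be overwritten by the next Scan
		if len(s) == 0 {
			continue
		}
		// append to a nil slice allocates fresh memory and copies the bytes.
		lines = append(lines, append([]byte(nil), s...))
	}
	return lines
}

func main() {
	for _, l := range copyLines("ATGC\n\nGGCC\n") {
		fmt.Println(string(l))
	}
}
```

The same append([]byte(nil), s...) idiom is what the working version uses before sending each line on the channel.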

package main

import (
    "bufio"
    "fmt"
    "runtime"
    "strings"
    "sync"
)

func Worker(inCh chan []byte, resA chan<- int, resB chan<- int, wg *sync.WaitGroup) {
    defer wg.Done()
    fmt.Println("Worker started...")
    for ch := range inCh {
        at := 0
        gc := 0
        for i := 0; i < len(ch); i++ {
            if ch[i] == 'A' || ch[i] == 'T' {
                at++
            } else if ch[i] == 'G' || ch[i] == 'C' {
                gc++
            }
        }
        resA <- at
        resB <- gc
    }

}

func SpawnWork(inStr chan<- []byte) {
    fmt.Println("Spawning work:")
    // An artificial input source.
    StringData :=
        "NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN\n" +
            "NTGAGAAATATGCTTTCTACTTTTTTGTTTAATTTGAACTTGAAAACAAAACACACACAA\n" +
            "CTTCCCAATTGGATTAGACTATTAACATTTCAGAAAGGATGTAAGAAAGGACTAGAGAGA\n" +
            "TATACTTAATGTTTTTAGTTTTTTAAACTTTACAAACTTAATACTGTCATTCTGTTGTTC\n" +
            "AGTTAACATCCCTGAATCCTAAATTTCTTCAGATTCTAAAACAAAAAGTTCCAGATGATT\n" +
            "TTATATTACACTATTTACTTAATGGTACTTAAATCCTCATTNNNNNNNNCAGTACGGTTG\n" +
            "TTAAATANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN\n" +
            "NNNNNNNCTTCAGAAATAAGTATACTGCAATCTGATTCCGGGAAATATTTAGGTTCATAA\n"
    // Expand data n times
    tmp := StringData
    for n := 0; n < 1000; n++ {
        StringData = StringData + tmp
    }
    scanner := bufio.NewScanner(strings.NewReader(StringData))
    scanner.Split(bufio.ScanLines)

    var i int
    for scanner.Scan() {
        s := scanner.Bytes()
        if len(s) == 0 || s[0] == '>' {
            continue
        } else {
            i++
            s_copy := append([]byte(nil), s...)
            inStr <- s_copy
        }
    }
    close(inStr)
}

func main() {
    CpuCnt := runtime.NumCPU() // Count down in select clause
    CpuOut := CpuCnt           // Save for print report
    runtime.GOMAXPROCS(CpuCnt)
    fmt.Printf("Processors: %d\n", CpuCnt)

    resChA := make(chan int)
    resChB := make(chan int)
    inStr := make(chan []byte)

    fmt.Println("Spawning workers:")
    var wg sync.WaitGroup
    for i := 0; i < CpuCnt; i++ {
        wg.Add(1)
        go Worker(inStr, resChA, resChB, &wg)
    }
    fmt.Println("Spawning work:")
    go func() {
        SpawnWork(inStr)
        wg.Wait()
        close(resChA)
        close(resChB)
    }()

    A := 0
    B := 0
    LineCnt := 0
    for tmp_at := range resChA {
        tmp_gc := <-resChB // These go together anyway
        A += tmp_at
        B += tmp_gc
        LineCnt++
    }

    if !(A+B > 0) {
        fmt.Println("No A/B was found!")
    } else {
        ABFraction := float32(B) / float32(A+B)
        fmt.Println("\n----------------------------")
        fmt.Printf("Cpu's  : %d\n", CpuOut)
        fmt.Printf("Lines  : %d\n", LineCnt)
        fmt.Printf("A+B    : %d\n", A+B)
        fmt.Printf("A      : %d\n", A)
        fmt.Printf("B      : %d\n", B)
        fmt.Printf("AB frac: %v\n", ABFraction*100)
        fmt.Println("----------------------------")
    }
}
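
As a possible variation on the program above (not something the answer itself proposes), both counts for a line could travel together in one value on a single result channel, which removes the need to receive from two channels in matching pairs. The sketch below keeps the same sync.WaitGroup and close() pattern; the lineCount type and the countAll function are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// lineCount is a hypothetical type holding both counts for one line.
type lineCount struct{ at, gc int }

// countAll distributes lines over the given number of workers and
// returns the summed counts.
func countAll(lines [][]byte, workers int) lineCount {
	in := make(chan []byte)
	out := make(chan lineCount)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for line := range in {
				var c lineCount // fresh counters for every line
				for _, b := range line {
					switch b {
					case 'A', 'T':
						c.at++
					case 'G', 'C':
						c.gc++
					}
				}
				out <- c // sent by value, so nothing is shared
			}
		}()
	}

	// Feed the workers, wait for them, then close the result channel
	// so the range loop below terminates.
	go func() {
		for _, l := range lines {
			in <- l
		}
		close(in)
		wg.Wait()
		close(out)
	}()

	var total lineCount
	for c := range out {
		total.at += c.at
		total.gc += c.gc
	}
	return total
}

func main() {
	lines := [][]byte{[]byte("AATT"), []byte("GGCC"), []byte("ATGCN")}
	fmt.Printf("%+v\n", countAll(lines, 4))
}
```

The input slices here are never modified after they are created, so no copy is needed before sending them; with scanner.Bytes() as the source, the copy from the working version would still apply.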

This post, "concurrency - Different results for N>1 goroutines (on N>1 Cpu:s). Why?", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/17098722/
