c++ - Concurrent threads are slower than a single thread

Reposted by: 行者123 Updated: 2023-11-28 04:42:40

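I have been comparing two ways of finding the maximum value in a matrix (if the maximum occurs more than once, one of the duplicates is chosen at random): single-threaded versus multithreaded. Normally, assuming I coded it correctly, the multithreaded version should be faster. It isn't; it is much slower, so I can only assume I'm doing something wrong. Can someone point out what I'm doing wrong?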

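Note: I know I shouldn't be using rand(), but for this purpose I don't think it causes much of a problem, and I will replace it with mt19937_64 once this works properly.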
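For the mt19937_64 replacement mentioned in the note, a minimal sketch might look like the following. It is not the code from the question: `pickRandomArgMax` is a hypothetical helper, the Q-values are assumed to sit in a `std::vector<double>`, and the generator is passed in by the caller so that each thread can own its own instance.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns the index of a largest element of q.
// If the largest value occurs several times, one of its positions is
// chosen uniformly at random.
std::size_t pickRandomArgMax(const std::vector<double>& q, std::mt19937_64& rng)
{
    assert(!q.empty());
    std::vector<std::size_t> ties;  // indices of all elements equal to the current maximum
    ties.push_back(0);
    for (std::size_t i = 1; i < q.size(); ++i) {
        if (q[i] > q[ties.front()]) {
            // Strictly larger value: discard the old candidates.
            ties.clear();
            ties.push_back(i);
        } else if (q[i] == q[ties.front()]) {
            // Same value as the current maximum: remember it as a candidate.
            ties.push_back(i);
        }
    }
    // The distribution range is inclusive on both ends.
    std::uniform_int_distribution<std::size_t> pick(0, ties.size() - 1);
    return ties[pick(rng)];
}
```

Unlike `std::rand() % n`, `std::uniform_int_distribution` avoids modulo bias, and a generator owned by the caller does not depend on hidden global state shared between threads.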

Thanks in advance!

double* RLPolicy::GetActionWithMaxQ(std::tuple<int, double*, int*, int*, int, double*>* state, long* selectedActionIndex, bool& isActionQZero)
{
    const bool useMultithreading = true;

    double* qIterator = Utilities::DiscretizeStateActionPairToStart(_world->GetCurrentStatePointer(), (long*)&(std::get<0>(*state)));

    // Represents the action-pointer for which Q-values are duplicated
    // Note: A shared_ptr is used instead of a unique_ptr since C++11 won't support unique_ptrs for pointers to pointers **
    // The array form of default_delete is needed because the buffer is allocated with new[].
    static std::shared_ptr<double*> duplicatedQValues(new double*[*_world->GetActionsNumber()], std::default_delete<double*[]>());
    /*[](double** obj) {
    delete[] obj;
    });*/

    static double* const defaultAction = _actionsListing.get();// [0];
    double* actionOut = defaultAction; //default action
    static double** const duplicatedQsDefault = duplicatedQValues.get();

    if (!useMultithreading)
    {    
        const double* const qSectionEnd = qIterator + *_world->GetActionsNumber() - 1;

        double* largestValue = qIterator;
        int currentActionIterator = 0;

        long duplicatedIndex = -1;

        do {
            if (*qIterator > *largestValue)
            {
                largestValue = qIterator;
                actionOut = defaultAction + currentActionIterator;
                *selectedActionIndex = currentActionIterator;
                duplicatedIndex = -1;
            }
            // duplicated value, map it
            else if (*qIterator == *largestValue)
            {
                ++duplicatedIndex;
                *(duplicatedQsDefault + duplicatedIndex) = defaultAction + currentActionIterator;
            }
            ++currentActionIterator;
            ++qIterator;
        } while (qIterator != qSectionEnd);

        // If duped (equal) values are found, select among them randomly with equal probability
        if (duplicatedIndex >= 0)
        {
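            // duplicatedIndex is the last used slot, so there are duplicatedIndex + 1 candidates
            // (a plain % duplicatedIndex divides by zero when duplicatedIndex is 0)
            *selectedActionIndex = (std::rand() % (duplicatedIndex + 1));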
            actionOut = *(duplicatedQsDefault + *selectedActionIndex);
        }

        isActionQZero = *largestValue == 0;

        return actionOut;

    }
    else
    {
        static const long numberOfSections = 6;
        unsigned int actionsPerSection = *_world->GetActionsNumber() / numberOfSections;
        unsigned long currentSectionStart = 0;

        static double* actionsListing = _actionsListing.get();

        long currentFoundResult = FindActionWithMaxQInMatrixSection(qIterator, 0, actionsPerSection, duplicatedQsDefault, actionsListing);

        static std::vector<std::future<long>> maxActions;
        for (int i(0); i < numberOfSections - 1; ++i)
        {
            currentSectionStart += actionsPerSection;
            maxActions.push_back(std::async(&RLPolicy::FindActionWithMaxQInMatrixSection, std::ref(qIterator), currentSectionStart, std::ref(actionsPerSection), std::ref(duplicatedQsDefault), actionsListing));
        }

        long foundActionIndex;

        actionOut = actionsListing + currentFoundResult;

        for (auto &f : maxActions)
        {
            f.wait();

            foundActionIndex = f.get();

            if (actionOut == nullptr)
                actionOut = defaultAction;
            else if (*(actionsListing + foundActionIndex) > *actionOut)
                actionOut = actionsListing + foundActionIndex;
        }

        maxActions.clear();

        return actionOut;
    }
}

/*
    Deploy a thread to find the action with the highest Q-value for the provided Q-Matrix section.

    @return - The index of the action (on _actionListing) which contains the highest Q-value.
*/
long RLPolicy::FindActionWithMaxQInMatrixSection(double* qMatrix, long sectionStart, long sectionLength, double** dupListing, double* actionListing)
{
    double* const matrixSectionStart = qMatrix + sectionStart;
    double* const matrixSectionEnd = matrixSectionStart + sectionLength;
    // Offset by sectionStart so that each section writes to its own slice of dupListing
    double** duplicatedSectionStart = dupListing + sectionStart;

    static double* const defaultAction = actionListing;
    long maxValue = sectionLength;
    long maxActionIndex = 0;
    double* qIterator = matrixSectionStart;
    double* largestValue = matrixSectionStart;

    long currentActionIterator = 0;

    long duplicatedIndex = -1;

    do {
        if (*qIterator > *largestValue)
        {
            largestValue = qIterator;
            maxActionIndex = currentActionIterator;
            duplicatedIndex = -1;
        }
        // duplicated value, map it
        else if (*qIterator == *largestValue)
        {
            ++duplicatedIndex;
            *(duplicatedSectionStart + duplicatedIndex) = defaultAction + currentActionIterator;
        }
        ++currentActionIterator;
        ++qIterator;
    } while (qIterator != matrixSectionEnd);

    // If duped (equal) values are found, select among them randomly with equal probability
    if (duplicatedIndex >= 0)
    {
        // duplicatedIndex + 1 candidates; % duplicatedIndex would divide by zero when it is 0
        maxActionIndex = (std::rand() % (duplicatedIndex + 1));
    }

    return maxActionIndex;
}

Best answer

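A parallel program is not necessarily faster than a serial one. Setting up a parallel algorithm has both fixed and variable time costs, and for small and/or simple problems this parallel overhead can exceed the cost of the entire serial algorithm. Examples of parallel overhead include thread creation and synchronization, extra memory copying, and pressure on the memory bus. With the serial program taking about 2 microseconds and the parallel program about 500 microseconds, your matrix is probably small enough that the work of setting up the parallel algorithm dwarfs the work of solving the matrix problem.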
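One way to see this overhead directly is to time both approaches on a small array. The sketch below is an illustration under stated assumptions, not the code from the question: `serialArgMax`, `parallelArgMax`, `averageMicroseconds` and `reportOverhead` are hypothetical names, the array size of 600 is arbitrary, and `std::launch::async` is passed explicitly so that every task really runs on its own thread (the question's code relies on the default launch policy).

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <iostream>
#include <vector>

// Serial scan: index of the first largest element in q[first, last).
std::size_t serialArgMax(const std::vector<double>& q, std::size_t first, std::size_t last)
{
    return static_cast<std::size_t>(
        std::max_element(q.begin() + first, q.begin() + last) - q.begin());
}

// Parallel scan: the calling thread handles the first chunk, and each
// remaining chunk gets its own std::async task on a new thread.
std::size_t parallelArgMax(const std::vector<double>& q, std::size_t sections)
{
    assert(sections > 0 && q.size() >= sections);
    const std::size_t chunk = q.size() / sections;
    std::vector<std::future<std::size_t>> tasks;
    for (std::size_t s = 1; s < sections; ++s) {
        const std::size_t first = s * chunk;
        // The last chunk also takes the remainder of the division.
        const std::size_t last = (s + 1 == sections) ? q.size() : first + chunk;
        tasks.push_back(std::async(std::launch::async, serialArgMax,
                                   std::cref(q), first, last));
    }
    std::size_t best = serialArgMax(q, 0, chunk);
    for (auto& task : tasks) {
        const std::size_t candidate = task.get();  // blocks until that task is done
        if (q[candidate] > q[best]) best = candidate;
    }
    return best;
}

// Average wall-clock time of one call to work(), in microseconds.
template <typename F>
double averageMicroseconds(F&& work, int repetitions)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i) work();
    const std::chrono::duration<double, std::micro> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repetitions;
}

// Prints both averages for a small array (600 entries is an arbitrary choice).
void reportOverhead()
{
    std::vector<double> q(600, 0.0);
    q[437] = 1.0;
    volatile std::size_t sink = 0;  // keeps the calls from being optimized away
    const double serialUs =
        averageMicroseconds([&] { sink = serialArgMax(q, 0, q.size()); }, 1000);
    const double parallelUs =
        averageMicroseconds([&] { sink = parallelArgMax(q, 6); }, 1000);
    std::cout << "serial: " << serialUs << " us, parallel: " << parallelUs << " us\n";
}
```

Calling `reportOverhead()` from `main` prints the two averages. The exact numbers depend on the machine and compiler flags, so none are claimed here. Increasing the array size shows roughly where the scan itself starts to outweigh the cost of launching and joining the tasks.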

Regarding "c++ - Concurrent threads are slower than a single thread", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49905094/
