c++ - Avoiding recursive template instantiation overflow in parallel recursive asynchronous algorithms

Reposted by: 塔克拉玛干. Updated: 2023-11-02 23:39:45

This problem is easier to explain with a simplified example (my real situation is far from "minimal"). Given a...

template <typename T>
void post_in_thread_pool(T&& f)

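...function template, I want to create a parallel asynchronous algorithm that has a tree-like recursive structure. I will write an example of that structure below, using std::count_if as a placeholder. The strategy I am going to use is as follows: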

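- If the length of the range I am checking is smaller than 64, I fall back to the sequential std::count_if function. (0)
- If it is greater than or equal to 64, I spawn a job in the thread pool that recurses on the left half of the range, and I compute the right half of the range on the current thread. (1)
  - I use an atomic shared int to "wait" for both halves to be computed. (2)
  - I use an atomic shared int to accumulate the partial results. (3)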

Simplified code:

auto async_count_if(auto begin, auto end, auto predicate, auto continuation)
{
    // (0) Base case: small ranges are counted sequentially.
    if(end - begin < 64)
    {
        continuation(std::count_if(begin, end, predicate));
        return;
    }

    // (1) Recursive case:
    auto counter = std::make_shared<std::atomic<int>>(2); // (2)
    auto cleanup = [=, accumulator = std::make_shared<std::atomic<int>>(0) /*(3)*/]
                   (int partial_result)
    {
        *accumulator += partial_result;

        // The half that finishes last reports the combined result.
        if(--*counter == 0)
        {
            continuation(accumulator->load());
        }
    };

    const auto sz = std::distance(begin, end);
    const auto mid = std::next(begin, sz / 2);

    post_in_thread_pool([=]
    {
        async_count_if(begin, mid, predicate, cleanup);
    });

    async_count_if(mid, end, predicate, cleanup);
}

The code can then be used as follows:

std::vector<int> v(512);
std::iota(std::begin(v), std::end(v), 0);

async_count_if(std::begin(v), std::end(v),
/*    predicate */ [](auto x){ return x < 256; },
/* continuation */ [](auto res){ std::cout << res << std::endl; });

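The problem in the code above is auto cleanup. Because every instantiation of auto cleanup is deduced to a unique lambda type, and because cleanup captures continuation by value, the recursion makes the compiler compute an infinitely nested lambda type, which results in the following error: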

fatal error: recursive template instantiation exceeded maximum depth of 1024

wandbox example

Conceptually, you can roughly picture the type being built like this:

cont                                // user-provided continuation
cleanup0<cont>                      // recursive step 0
cleanup1<cleanup0<cont>>            // recursive step 1
cleanup2<cleanup1<cleanup0<cont>>>  // recursive step 2
// ...
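The nesting can be reproduced with explicit types. The following is a minimal sketch, not the code from the question: cleanup is written as a hypothetical struct instead of a lambda, and wrap_n and record_result are names invented for this illustration. Because the depth here is a compile-time constant, the nesting stops; in the question the depth depends on the run-time range size, so the compiler never reaches an end.

```cpp
#include <type_traits>

// Explicit stand-in for the closure type of `cleanup`: it stores the wrapped
// continuation by value, so the wrapped type becomes part of its own type.
template <typename Cont>
struct cleanup
{
    Cont cont;
    void operator()(int partial_result) const { cont(partial_result); }
};

// Wraps a continuation in `Depth` layers of cleanup<>.
template <int Depth>
struct wrap_n
{
    template <typename Cont>
    static auto apply(Cont cont)
    {
        return wrap_n<Depth - 1>::apply(cleanup<Cont>{cont});
    }
};

// Depth 0 ends the recursion. The question's code has no such compile-time end.
template <>
struct wrap_n<0>
{
    template <typename Cont>
    static Cont apply(Cont cont) { return cont; }
};

// Hypothetical user-provided continuation that stores the result it receives.
struct record_result
{
    int* out;
    void operator()(int result) const { *out = result; }
};

// Three recursive steps produce a three-level nested type.
static_assert(std::is_same<decltype(wrap_n<3>::apply(record_result{nullptr})),
                           cleanup<cleanup<cleanup<record_result>>>>::value,
              "each recursive step adds one layer to the type");
```

Each extra level requires a new instantiation, which is why a depth that is only known at run time cannot be expressed this way.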

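(!): Keep in mind that async_count_if is only an example that shows the "tree-like" recursive structure of my real situation. I am aware that an asynchronous count_if can be trivially implemented with a single atomic counter and sz / 64 tasks.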
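For reference, the flat variant mentioned in the note could look roughly like the sketch below. It is an illustration under assumptions: post_in_thread_pool is a blocking stand-in rather than a real pool, the task count is rounded up so that a trailing partial chunk is also counted, and flat_async_count_if and flat_count_below are hypothetical names.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <future>
#include <iterator>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Assumption: stand-in for the thread pool. It runs the job on another thread
// and waits for it, so it is simple and deterministic but not truly parallel.
template <typename F>
void post_in_thread_pool(F&& f)
{
    std::async(std::launch::async, std::forward<F>(f)).wait();
}

template <typename It, typename Pred, typename Cont>
void flat_async_count_if(It begin, It end, Pred predicate, Cont continuation)
{
    constexpr std::ptrdiff_t chunk = 64;
    const std::ptrdiff_t sz = std::distance(begin, end);
    // Round up so a trailing partial chunk gets its own task; at least one task.
    const std::ptrdiff_t tasks = std::max<std::ptrdiff_t>(1, (sz + chunk - 1) / chunk);

    // One shared counter and one shared accumulator for all tasks.
    auto remaining = std::make_shared<std::atomic<std::ptrdiff_t>>(tasks);
    auto total = std::make_shared<std::atomic<std::ptrdiff_t>>(0);

    for(std::ptrdiff_t i = 0; i < tasks; ++i)
    {
        const It first = std::next(begin, i * chunk);
        const It last = std::next(begin, std::min(sz, (i + 1) * chunk));
        post_in_thread_pool([=]
        {
            *total += std::count_if(first, last, predicate);
            // The task that finishes last reports the overall result.
            if(--*remaining == 0) continuation(total->load());
        });
    }
}

// Hypothetical driver: counts the values in [0, size) that are below `limit`.
// Reading `result` right away is only valid because the stand-in pool blocks.
int flat_count_below(int size, int limit)
{
    std::vector<int> v(size);
    std::iota(v.begin(), v.end(), 0);
    int result = -1;
    flat_async_count_if(v.begin(), v.end(),
                        [limit](int x) { return x < limit; },
                        [&result](std::ptrdiff_t res) { result = static_cast<int>(res); });
    return result;
}
```

The continuation type appears exactly once here, so no nested types arise; the price is that the tree-like structure of the real problem is lost.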

I want to avoid the error while minimizing any possible run-time or memory overhead.

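- One possible solution is to use std::function<void(int)> cleanup, which lets the code compile and run correctly, but produces suboptimal assembly and introduces additional dynamic allocations. (wandbox example)
- Another possible solution is to use a std::size_t template parameter plus a specialization to artificially limit the recursion depth of async_count_if. Unfortunately, this bloats the binary size and is quite inelegant.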
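The first workaround can be sketched as follows. This is an illustration under assumptions, not the code behind the wandbox link: post_in_thread_pool is a blocking stand-in for the pool, and continuation_t and count_below are names introduced here. Fixing the continuation parameter to std::function<void(int)> gives every recursion level the same type, so only one instantiation per iterator/predicate pair is needed.

```cpp
#include <algorithm>
#include <atomic>
#include <functional>
#include <future>
#include <iterator>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Assumption: stand-in for the thread pool. It runs the job on another thread
// and waits for it, so it is simple and deterministic but not truly parallel.
template <typename F>
void post_in_thread_pool(F&& f)
{
    std::async(std::launch::async, std::forward<F>(f)).wait();
}

// Type-erased continuation: the same type at every level of the recursion.
using continuation_t = std::function<void(int)>;

template <typename It, typename Pred>
void async_count_if(It begin, It end, Pred predicate, continuation_t continuation)
{
    const auto sz = std::distance(begin, end);

    // (0) Base case.
    if(sz < 64)
    {
        continuation(static_cast<int>(std::count_if(begin, end, predicate)));
        return;
    }

    // (1) Recursive case: `cleanup` is converted to continuation_t, which
    // hides the lambda type at the cost of a possible heap allocation.
    auto counter = std::make_shared<std::atomic<int>>(2);     // (2)
    auto accumulator = std::make_shared<std::atomic<int>>(0); // (3)
    continuation_t cleanup = [=](int partial_result)
    {
        *accumulator += partial_result;
        if(--*counter == 0) continuation(accumulator->load());
    };

    const auto mid = std::next(begin, sz / 2);
    post_in_thread_pool([=] { async_count_if(begin, mid, predicate, cleanup); });
    async_count_if(mid, end, predicate, cleanup);
}

// Hypothetical driver: counts the values in [0, size) that are below `limit`.
// Reading `result` right away is only valid because the stand-in pool blocks.
int count_below(int size, int limit)
{
    std::vector<int> v(size);
    std::iota(v.begin(), v.end(), 0);
    int result = -1;
    async_count_if(v.begin(), v.end(),
                   [limit](int x) { return x < limit; },
                   [&result](int res) { result = res; });
    return result;
}
```

Every internal node of the tree still performs two make_shared allocations plus whatever std::function allocates, which is the overhead the question wants to avoid.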

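What bothers me is that I know the size of the range when I call async_count_if: it is std::distance(begin, end). If I know the size of the range, I can also deduce the number of counters and continuations required: (2^k - 1), where k is the depth of the recursion tree.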

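Therefore, I think there should be a way to precompute a "control structure" in the first call of async_count_if and pass it by reference to the recursive calls. This "control structure" could contain enough space for (2^k - 1) atomic counters and (2^k - 1) cleanup/continuation functions.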
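One way such a control structure might look is sketched below. It is an assumption-laden illustration, not a solution taken from the question or the answer: the names node, control, count_node, internal_levels and count_below are invented, the pool is a blocking stand-in, and the structure is shared through a std::shared_ptr instead of a plain reference so that it would outlive the first call under a real pool. The 2^k - 1 slots are laid out in heap order, and each finished subtree reports to its parent by index, so the user continuation is stored once and no type nests.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <future>
#include <iterator>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Assumption: blocking stand-in for the thread pool (not truly parallel).
template <typename F>
void post_in_thread_pool(F&& f)
{
    std::async(std::launch::async, std::forward<F>(f)).wait();
}

constexpr std::ptrdiff_t threshold = 64;

// One slot per internal node, in heap order: node i has children 2i+1 and 2i+2.
struct node
{
    std::atomic<int> pending{2}; // (2) halves still running
    std::atomic<int> sum{0};     // (3) accumulated partial results
};

template <typename Cont>
struct control
{
    std::vector<node> nodes;
    Cont continuation; // stored once, never wrapped

    control(std::size_t node_count, Cont c)
        : nodes(node_count), continuation(std::move(c)) {}

    // Report that the subtree at heap position `self` produced `result`.
    void complete(std::size_t self, int result)
    {
        while(self != 0)
        {
            node& parent = nodes[(self - 1) / 2];
            parent.sum += result;
            if(--parent.pending != 0) return; // the sibling is still running
            result = parent.sum.load();
            self = (self - 1) / 2;
        }
        continuation(result); // the root finished
    }
};

// k: number of tree levels that contain internal nodes (follows the larger half).
inline std::size_t internal_levels(std::ptrdiff_t size)
{
    std::size_t k = 0;
    while(size >= threshold) { size -= size / 2; ++k; }
    return k;
}

template <typename It, typename Pred, typename Cont>
void count_node(It begin, It end, Pred predicate,
                const std::shared_ptr<control<Cont>>& ctl, std::size_t self)
{
    const auto sz = std::distance(begin, end);
    if(sz < threshold) // (0) base case
    {
        ctl->complete(self, static_cast<int>(std::count_if(begin, end, predicate)));
        return;
    }
    const auto mid = std::next(begin, sz / 2); // (1) recursive case
    post_in_thread_pool([=] { count_node(begin, mid, predicate, ctl, 2 * self + 1); });
    count_node(mid, end, predicate, ctl, 2 * self + 2);
}

template <typename It, typename Pred, typename Cont>
void async_count_if(It begin, It end, Pred predicate, Cont continuation)
{
    const std::size_t k = internal_levels(std::distance(begin, end));
    auto ctl = std::make_shared<control<Cont>>((std::size_t{1} << k) - 1,
                                               std::move(continuation));
    count_node(begin, end, predicate, ctl, 0);
}

// Hypothetical driver; reading `result` at once relies on the blocking stand-in.
int count_below(int size, int limit)
{
    std::vector<int> v(size);
    std::iota(v.begin(), v.end(), 0);
    int result = -1;
    async_count_if(v.begin(), v.end(), [limit](int x) { return x < limit; },
                   [&result](int res) { result = res; });
    return result;
}
```

In this sketch count_node is instantiated once per iterator, predicate and continuation type, and the counters cost one allocation for the whole tree. When the two halves differ in size, some slots in the last level stay unused.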

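Unfortunately, I could not find a clean way to implement this, and I decided to post a question here because this problem should be quite common when developing asynchronous parallel recursive algorithms.

What is an elegant way to deal with this problem without introducing unnecessary overhead?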

Best answer

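I must be missing something very obvious, but why do you need multiple counters and structures? You can precompute the total number of iterations (if you know your base case) and share it, together with the accumulator, across all iterations, a la (I had to modify your simplified code a bit):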

#include <algorithm>
#include <atomic>
#include <memory>
#include <vector>
#include <iostream>
#include <iterator>
#include <numeric>
#include <future>

using namespace std;

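// Stand-in for a real thread pool. The temporary std::future returned by
// std::async blocks in its destructor, so each job finishes before this returns.
template <class T>
void post_in_thread_pool(T&& work)
{
    std::async(std::launch::async, work);
}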

template <class It, class Pred, class Cont>
auto async_count_if(It begin, It end, Pred predicate, Cont continuation)
{
    // (0) Base case:  
    if(end - begin <= 64)
    {
        continuation(std::count_if(begin, end, predicate));
        return;
    }

    const auto sz = std::distance(begin, end);
    const auto mid = std::next(begin, sz / 2);                

    post_in_thread_pool([=]
    {
         async_count_if(begin, mid, predicate, continuation);
    });

    async_count_if(mid, end, predicate, continuation);
}

template <class It, class Pred, class Cont>
auto async_count_if_facade(It begin, It end, Pred predicate, Cont continuation)
{
    // Precompute how many base-case calls will report a partial result.
    const auto sz = std::distance(begin, end);
    auto counter = make_shared<atomic<int>>(sz / 64); // (2) exact when sz is 64 times a power of two; fix this for other sizes
    auto cleanup = [=, accumulator = make_shared<atomic<int>>(0) /*(3)*/]
                   (int partial_result)
    {
        *accumulator += partial_result; 

        if(--*counter == 0)
        {
            continuation(*accumulator);
        }
    };

    return async_count_if(begin, end, predicate, cleanup);
}

int main ()
{
    std::vector<int> v(1024);
    std::iota(std::begin(v), std::end(v), 0);

    async_count_if_facade(std::begin(v), std::end(v), 
    /*    predicate */ [](auto x){ return x > 1000; }, 
    /* continuation */ [](const auto& res){ std::cout << res << std::endl; });
}

A demo
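The answer's code leaves the counter value open for sizes where sz / 64 does not match the recursion. One possible fix, sketched here as an assumption rather than as part of the answer, is to count the base cases with the same splitting rule that async_count_if uses (base case at 64 elements or fewer, left half sz / 2, right half the remainder). The name leaf_count is hypothetical.

```cpp
#include <cstddef>

// Number of base-case calls the answer's async_count_if makes for `size`
// elements: a range of at most 64 elements is one base case, anything
// larger is split into size / 2 and size - size / 2.
constexpr std::ptrdiff_t leaf_count(std::ptrdiff_t size)
{
    return size <= 64 ? 1
                      : leaf_count(size / 2) + leaf_count(size - size / 2);
}

// For 1024 elements this agrees with sz / 64 ...
static_assert(leaf_count(1024) == 16, "16 base cases of 64 elements each");
// ... but for 100 elements there are two base cases, while 100 / 64 == 1.
static_assert(leaf_count(100) == 2, "two base cases of 50 elements each");
```

Under this assumption the facade would initialize its counter with leaf_count(sz) instead of sz / 64, so that the continuation fires exactly once, after the last base case has reported.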

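This article is about c++ - Avoiding recursive template instantiation overflow in parallel recursive asynchronous algorithms. A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41572863/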
