iterator - Reducing memory allocations of generators in Julia

Reposted. Author: 行者123. Updated: 2023-12-04 13:29:14

I am trying to reduce memory allocations in an inner loop of my code. Below is the part that does not work as expected.

using Random 
using StatsBase
using BenchmarkTools
using Distributions

a_dist = Distributions.DiscreteUniform(1, 99)
v_dist = Distributions.DiscreteUniform(1, 2)
population_size = 10000
population = [rand(a_dist, population_size) rand(v_dist, population_size)]


find_all_it3(f::Function, A) = (p[2] for p in eachrow(A) if f(p[1]))

@btime begin
    c_pool = find_all_it3(x -> (x < 5), population)
    c_pool_dict = countmap(c_pool, alg=:dict)
end


@btime begin
    c_pool_indexes = findall(x -> (x < 5), view(population, :, 1))
    c_pool_dict = countmap(population[c_pool_indexes, 2], alg=:dict)
end
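
I expected the generator (find_all_it3) not to need much memory allocation.
But according to the @btime output, there seems to be one allocation per loop iteration.

98.040 μs (10006 allocations: 625.64 KiB)
18.894 μs (18 allocations: 11.95 KiB)

In my scenario, the speed and allocations of findall end up being a problem, so I am trying to find a better alternative based on generators/iterators that allocates less. Is there a way to do this? Are there other options worth considering?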

Best answer

I don't have an explanation, but here are the results of some tests I ran:

  • The best time is obtained with view(population, :, 1) .< 5 (test4).
  • Using broadcast! reduces allocations slightly (test5).
  • The best way to reduce allocations is to write your own loop (test6).

    using BenchmarkTools
    using StatsBase

    population_size = 10000
    population = [rand(1:99, population_size) rand(1:2, population_size)]

    find_all_it(f::Function, A) = (p[2] for p in eachrow(A) if f(p[1]))

    function test1(population)
        c_pool = find_all_it(x -> x < 5, population)
        c_pool_dict = countmap(c_pool, alg=:dict)
    end

    function test3(population)
        c_pool_indexes = findall(x -> x < 5, view(population, :, 1))
        c_pool_dict = countmap(view(population, c_pool_indexes, 2), alg=:dict)
    end

    function test4(population)
        c_pool_indexes = view(population, :, 1) .< 5
        c_pool_dict = countmap(view(population, c_pool_indexes, 2), alg=:dict)
    end

    function test5(c_pool_indexes, population)
        broadcast!(<, c_pool_indexes, view(population, :, 1), 5)
        c_pool_dict = countmap(view(population, c_pool_indexes, 2), alg=:dict)
    end

    function test6(population)
        d = Dict{Int,Int}()
        for i in eachindex(view(population, :, 1))
            if population[i, 1] < 5
                d[population[i, 2]] = 1 + get(d, population[i, 2], 0)
            end
        end
        return d
    end
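
The functions above cover the eachrow-based generator and a hand-written loop. A possible middle ground, not benchmarked in the original answer, is a generator that indexes the matrix directly instead of iterating over eachrow(A). The sketch below is an assumption-laden illustration: find_all_idx, count_values and test7 are hypothetical names, and the counting is done with a plain Dict from Base rather than countmap so the block needs no packages. Whether it allocates less than test1 depends on the Julia version and should be measured.

```julia
# Hypothetical helper (not from the original answer): a lazy generator that
# indexes the matrix directly, so no row view is created per iteration.
find_all_idx(f, A) = (A[i, 2] for i in axes(A, 1) if f(A[i, 1]))

# Count occurrences of each value produced by any iterable, using Base only.
function count_values(itr)
    d = Dict{Int,Int}()
    for v in itr
        d[v] = get(d, v, 0) + 1
    end
    return d
end

# Same filter as the other tests: keep rows whose first column is below 5.
function test7(population)
    return count_values(find_all_idx(x -> x < 5, population))
end

population = [rand(1:99, 10_000) rand(1:2, 10_000)]
counts = test7(population)
println(counts)
```

The result should contain the same counts as test6 for the same input; only the way the rows are traversed differs.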

    julia> @btime test1(population);
    68.200 μs (10004 allocations: 625.59 KiB)

    julia> @btime test3(population);
    14.800 μs (14 allocations: 9.00 KiB)

    julia> @btime test4(population);
    7.250 μs (8 allocations: 9.33 KiB)

    julia> temp = zeros(Bool, population_size);

    julia> @btime test5(temp, population);
    16.599 μs (5 allocations: 3.78 KiB)

    julia> @btime test6(population);
    11.299 μs (4 allocations: 608 bytes)
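
To check allocation figures like these without BenchmarkTools, one option is the @allocated macro from Base. The sketch below repeats the loop from test6 under the hypothetical name count_loop so that it is self-contained. It calls the function once before measuring, so that compilation is not counted. The exact byte count will vary between Julia versions, so none is stated here.

```julia
# Same counting loop as test6 in the answer, renamed so this block stands alone.
function count_loop(population)
    d = Dict{Int,Int}()
    for i in axes(population, 1)
        if population[i, 1] < 5
            key = population[i, 2]
            d[key] = get(d, key, 0) + 1
        end
    end
    return d
end

population = [rand(1:99, 10_000) rand(1:2, 10_000)]

count_loop(population)                     # warm-up call so compilation is not measured
bytes = @allocated count_loop(population)  # bytes allocated by a single call
println("count_loop allocated ", bytes, " bytes")
```

When using @btime on a global variable, BenchmarkTools recommends interpolating it, for example @btime test6($population), so that the cost of accessing a non-constant global is not included in the measurement.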

Regarding "iterator - Reducing memory allocations of generators in Julia", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66102793/
