
ruby - Garbage collector in Ruby 2.2 provokes unexpected CoW


How can I prevent the garbage collector from provoking copy-on-write when I fork my process? I have recently been profiling the behaviour of the garbage collector in Ruby because of memory problems in my programs (I run out of memory on my 60-core, 0.5 TB machine even for moderately sized tasks). For me this really limits the usefulness of Ruby for running programs on multi-core servers. I would like to present my experiments and results here.

The issue arises when the garbage collector runs during the fork. I investigated three cases that illustrate it.

Case 1: We allocate a lot of objects (strings no longer than 20 bytes) in memory using an array. The strings are created from a random number with string formatting. When the process forks and we force the GC to run in the child, all the shared memory becomes private, duplicating the initial memory.

Case 2: We allocate a lot of objects (strings) in memory using an array, but the strings are created with rand.to_s, i.e. we drop the formatting compared with the previous case. We end up using less memory, presumably because less garbage is produced. When the process forks and we force the GC to run in the child, only part of the memory becomes private. We still duplicate some of the initial memory, but to a smaller extent.

Case 3: We allocate fewer but bigger objects, so that the total amount of allocated memory is the same as in the previous cases. When the process forks and we force the GC to run in the child, all the memory stays shared, i.e. nothing is duplicated.

I paste below the Ruby code used for these experiments. To switch between the cases you only need to change the 'option' value in the memory_object function. The code was tested with Ruby 2.2.2, 2.2.1, 2.1.3, 2.1.5 and 1.9.3 on an Ubuntu 14.04 machine.

Sample output for case 1:

ruby version 2.2.2 
proces pid log priv_dirty shared_dirty
Parent 3897 post alloc 38 0
Parent 3897 4 fork 0 37
Child 3937 4 initial 0 37
Child 3937 8 empty GC 35 5

The exact same code written in Python shows CoW working perfectly in all cases.

Sample output for case 1:

python version 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2]
proces pid log priv_dirty shared_dirty
Parent 4308 post alloc 35 0
Parent 4308 4 fork 0 35
Child 4309 4 initial 0 35
Child 4309 10 empty GC 1 34

Ruby code

$start_time=Time.new

# Monitor use of Resident and Virtual memory.
class Memory

  shared_dirty = '.+?Shared_Dirty:\s+(\d+)'
  priv_dirty = '.+?Private_Dirty:\s+(\d+)'
  MEM_REGEXP = /#{shared_dirty}#{priv_dirty}/m

  # get memory usage
  def self.get_memory_map( pids)
    memory_map = {}
    memory_map[ :pids_found] = {}
    memory_map[ :shared_dirty] = 0
    memory_map[ :priv_dirty] = 0

    pids.each do |pid|
      begin
        lines = nil
        lines = File.read( "/proc/#{pid}/smaps")
      rescue
        lines = nil
      end
      if lines
        lines.scan(MEM_REGEXP) do |shared_dirty, priv_dirty|
          memory_map[ :pids_found][pid] = true
          memory_map[ :shared_dirty] += shared_dirty.to_i
          memory_map[ :priv_dirty] += priv_dirty.to_i
        end
      end
    end
    memory_map[ :pids_found] = memory_map[ :pids_found].keys
    return memory_map
  end

  # get the processes and get the value of the memory usage
  def self.memory_usage( )
    pids = [ $$]
    result = self.get_memory_map( pids)

    result[ :pids] = pids
    return result
  end

  # print the values of the private and shared memories
  def self.log( process_name='', log_tag="")
    if process_name == "header"
      puts " %-6s %5s %-12s %10s %10s\n" % ["proces", "pid", "log", "priv_dirty", "shared_dirty"]
    else
      time = Time.new - $start_time
      mem = Memory.memory_usage( )
      puts " %-6s %5d %-12s %10d %10d\n" % [process_name, $$, log_tag, mem[:priv_dirty]/1000, mem[:shared_dirty]/1000]
    end
  end
end

# function to delay the processes a bit
def time_step( n)
  while Time.new - $start_time < n
    sleep( 0.01)
  end
end

# create an object of specified size. The option argument can be changed from 0 to 2 to visualize the behavior of the GC in various cases
#
# case 0 (default) : we make a huge array of small objects by formatting a string
# case 1 : we make a huge array of small objects without formatting a string (we use the to_s function)
# case 2 : we make a smaller array of big objects
def memory_object( size, option=1)
  result = []
  count = size/20

  if option > 3 or option < 1
    count.times do
      result << "%20.18f" % rand
    end
  elsif option == 1
    count.times do
      result << rand.to_s
    end
  elsif option == 2
    count = count/10
    count.times do
      result << ("%20.18f" % rand)*30
    end
  end

  return result
end

##### main #####

puts "ruby version #{RUBY_VERSION}"

GC.disable

# print the column headers and first line
Memory.log( "header")

# Allocation of memory
big_memory = memory_object( 1000 * 1000 * 10)

Memory.log( "Parent", "post alloc")

lab_time = Time.new - $start_time
if lab_time < 3.9
  lab_time = 0
end

# start the forking
pid = fork do
  time = 4
  time_step( time + lab_time)
  Memory.log( "Child", "#{time} initial")

  # force GC when nothing happened
  GC.enable; GC.start; GC.disable

  time = 8
  time_step( time + lab_time)
  Memory.log( "Child", "#{time} empty GC")

  sleep( 1)
  STDOUT.flush
  exit!
end

time = 4
time_step( time + lab_time)
Memory.log( "Parent", "#{time} fork")

# wait for the child to finish
Process.wait( pid)
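
(The answer below invokes the script with a command-line argument, e.g. RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 1, while the script as posted hard-codes the option. A small hypothetical tweak, not part of the original script, would let the argument choose the case:)

# Hypothetical change to the allocation line above: take the case option
# from ARGV so invocations like `ruby gc-test.rb 1` select the pattern,
# falling back to the formatting case (0) when no argument is given.
option = (ARGV[0] || 0).to_i
big_memory = memory_object( 1000 * 1000 * 10, option)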

Python code

import re
import time
import os
import random
import sys
import gc

start_time=time.time()

# Monitor use of Resident and Virtual memory.
class Memory:

    def __init__(self):
        self.shared_dirty = '.+?Shared_Dirty:\s+(\d+)'
        self.priv_dirty = '.+?Private_Dirty:\s+(\d+)'
        self.MEM_REGEXP = re.compile("{shared_dirty}{priv_dirty}".format(shared_dirty=self.shared_dirty, priv_dirty=self.priv_dirty), re.DOTALL)

    # get memory usage
    def get_memory_map(self, pids):
        memory_map = {}
        memory_map[ "pids_found" ] = {}
        memory_map[ "shared_dirty" ] = 0
        memory_map[ "priv_dirty" ] = 0

        for pid in pids:
            try:
                lines = None

                with open( "/proc/{pid}/smaps".format(pid=pid), "r" ) as infile:
                    lines = infile.read()
            except:
                lines = None

            if lines:
                for shared_dirty, priv_dirty in re.findall( self.MEM_REGEXP, lines ):
                    memory_map[ "pids_found" ][pid] = True
                    memory_map[ "shared_dirty" ] += int( shared_dirty )
                    memory_map[ "priv_dirty" ] += int( priv_dirty )

        memory_map[ "pids_found" ] = memory_map[ "pids_found" ].keys()
        return memory_map

    # get the processes and get the value of the memory usage
    def memory_usage( self):
        pids = [ os.getpid() ]
        result = self.get_memory_map( pids)

        result[ "pids" ] = pids

        return result

    # print the values of the private and shared memories
    def log( self, process_name='', log_tag=""):
        if process_name == "header":
            print " %-6s %5s %-12s %10s %10s" % ("proces", "pid", "log", "priv_dirty", "shared_dirty")
        else:
            global start_time
            Time = time.time() - start_time
            mem = self.memory_usage( )
            print " %-6s %5d %-12s %10d %10d" % (process_name, os.getpid(), log_tag, mem["priv_dirty"]/1000, mem["shared_dirty"]/1000)

# function to delay the processes a bit
def time_step( n):
    global start_time
    while (time.time() - start_time) < n:
        time.sleep( 0.01)

# create an object of specified size. The option argument can be changed from 0 to 2 to visualize the behavior of the GC in various cases
#
# case 0 (default) : we make a huge array of small objects by formatting a string
# case 1 : we make a huge array of small objects without formatting a string (we use the to_s function)
# case 2 : we make a smaller array of big objects
def memory_object( size, option=2):
    count = size/20

    if option > 3 or option < 1:
        result = [ "%20.18f"% random.random() for i in xrange(count) ]

    elif option == 1:
        result = [ str( random.random() ) for i in xrange(count) ]

    elif option == 2:
        count = count/10
        result = [ ("%20.18f"% random.random())*30 for i in xrange(count) ]

    return result

##### main #####

print "python version {version}".format(version=sys.version)

memory = Memory()

gc.disable()

# print the column headers and first line
memory.log( "header") # Print the headers of the columns

# Allocation of memory
big_memory = memory_object( 1000 * 1000 * 10) # Allocate memory

memory.log( "Parent", "post alloc")

lab_time = time.time() - start_time
if lab_time < 3.9:
    lab_time = 0

# start the forking
pid = os.fork() # fork the process
if pid == 0:
    Time = 4
    time_step( Time + lab_time)
    memory.log( "Child", "{time} initial".format(time=Time))

    # force GC when nothing happened
    gc.enable(); gc.collect(); gc.disable();

    Time = 10
    time_step( Time + lab_time)
    memory.log( "Child", "{time} empty GC".format(time=Time))

    time.sleep( 1)

    sys.exit(0)

Time = 4
time_step( Time + lab_time)
memory.log( "Parent", "{time} fork".format(time=Time))

# Wait for child process to finish
os.waitpid( pid, 0)

Edit

Indeed, calling the GC a few times before forking the process fixes the problem, which surprised me. I also ran the code with Ruby 2.0.0 and the problem did not appear at all, so it must be related to the generational GC you mention. However, if I call the memory_object function without assigning its output to any variable (I am only creating garbage), the memory is duplicated anyway. The amount of memory that gets copied depends on how much garbage I create: the more garbage, the more memory becomes private.

Is there any way I can prevent this?
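
For clarity, a sketch of what that modification might look like inside the fork block (an assumption; the edit above only describes it, and the sizes match the headings below):

pid = fork do
  time_step( 4 + lab_time)
  Memory.log( "Child", "4 initial")

  # create garbage only: the returned array is not assigned to anything
  memory_object( 1000 * 1000)

  # force GC over the newly created garbage
  GC.enable; GC.start; GC.disable

  time_step( 8 + lab_time)
  Memory.log( "Child", "8 empty GC")
  exit!
end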

Here are the results.

Running the GC in 2.0.0:

ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 3664 post alloc 67 0
Parent 3664 4 fork 1 69
Child 3700 4 initial 1 69
Child 3700 8 empty GC 6 65

Calling memory_object( 1000*1000) in the child process:

ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 3703 post alloc 67 0
Parent 3703 4 fork 1 70
Child 3739 4 initial 1 70
Child 3739 8 empty GC 15 56

Calling memory_object( 1000*1000*10):

ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 3743 post alloc 67 0
Parent 3743 4 fork 1 69
Child 3779 4 initial 1 69
Child 3779 8 empty GC 89 5

Best answer

UPD2

Suddenly figured out why all the memory becomes private when you format the strings: you generate garbage during formatting while the GC is disabled, then enable the GC, and the generated data is left with holes from the released objects. Then you fork, new garbage starts to occupy those holes, and the more garbage there is, the more private pages you get.

So I added a cleanup function to run the GC every 2000 cycles (just enabling lazy GC didn't help):

count.times do |i|
  cleanup(i)
  result << "%20.18f" % rand
end

#......snip........#

def cleanup(i)
  if ((i%2000).zero?)
    GC.enable; GC.start; GC.disable
  end
end

##### main #####
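
Assembled for context (assuming the rest of memory_object stays as in the question's script), the case-0 branch with the periodic cleanup looks roughly like this:

def cleanup(i)
  if ((i%2000).zero?)
    GC.enable; GC.start; GC.disable   # reclaim formatting garbage as it is produced
  end
end

def memory_object( size, option=0)
  result = []
  count = size/20
  count.times do |i|
    cleanup(i)
    result << "%20.18f" % rand
  end
  return result
end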

Which results in (generating memory_object( 1000 * 1000 * 10) after the fork):

RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 0
ruby version 2.2.0
proces pid log priv_dirty shared_dirty
Parent 2501 post alloc 35 0
Parent 2501 4 fork 0 35
Child 2503 4 initial 0 35
Child 2503 8 empty GC 28 22

Yes, it affects performance, but only before the fork, i.e. in your case it increases the loading time.


UPD1

Just found the criteria by which Ruby 2.2 sets the old-object bit: it is 3 GC runs. So if you add the following before the fork:

GC.enable; 3.times {GC.start}; GC.disable
# start the forking

you get (the option on the command line is 1):

$ RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 1
ruby version 2.2.0
proces pid log priv_dirty shared_dirty
Parent 2368 post alloc 31 0
Parent 2368 4 fork 1 34
Child 2370 4 initial 1 34
Child 2370 8 empty GC 2 32

But this needs further testing of how these objects behave on future GCs; at least after 100 GCs :old_objects remains the same, so I suppose it should be fine.

GC.stat is documented here.
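
A minimal sketch (assuming plain MRI 2.2, where GC.stat exposes the :old_objects key) of the kind of check meant above:

# After the three "promoting" GC runs, verify that the promoted objects
# stay in the old generation across many further GC cycles.
GC.enable; 3.times { GC.start }; GC.disable

before = GC.stat[:old_objects]
GC.enable; 100.times { GC.start }; GC.disable
puts "old_objects: #{before} -> #{GC.stat[:old_objects]}"

If the count stays roughly constant, the objects allocated before the fork are not being demoted or collected by later GC runs.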


By the way, there is also the option RGENGC_OLD_NEWOBJ_CHECK to create old objects from the start. I doubt it is a good idea in general, but it may be useful for a specific case.

First answer

My claim in the comment above was wrong; in fact the bitmap tables are the savior.

(option = 1)

ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 14807 post alloc 27 0
Parent 14807 4 fork 0 27
Child 14809 4 initial 0 27
Child 14809 8 empty GC 6 25 # << almost everything stays shared <<

Also manually tested Ruby Enterprise Edition; it was only half as bad as the worst case.

ruby version 1.8.7
proces pid log priv_dirty shared_dirty
Parent 15064 post alloc 86 0
Parent 15064 4 fork 2 84
Child 15065 4 initial 2 84
Child 15065 8 empty GC 40 46

(I made the script run strictly 1 GC by increasing RUBY_GC_HEAP_INIT_SLOTS to 600k)

About ruby - Garbage collector in Ruby 2.2 provokes unexpected CoW: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29900458/
