
c++ - Understanding Linux virtual memory: valgrind's massif output shows major differences with and without --pages-as-heap

Reposted. Author: IT王子. Updated: 2023-10-29 00:01:32

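I have read the documentation for this option, but the difference is really huge! With it enabled, a simple program (see below) is reported as using about 7 GB of memory; with it disabled, the reported usage is about 160 KB.
top also shows about 7 GB, which more or less confirms the result with --pages-as-heap=yes.

(I have a theory, but I don't believe it can explain such a huge difference, so I'm asking for help.)

What especially bothers me is that most of the reported memory usage is attributed to std::string, while "what?" is never printed (meaning the actual capacity is pretty small).

I do need to use --pages-as-heap=yes while profiling my application; I just wonder how to avoid the "false positives".

The code snippet: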

#include <iostream>
#include <thread>
#include <vector>
#include <chrono>
#include <string>
void run()
{
    while (true)
    {
        std::string s;
        s += "aaaaa";
        s += "aaaaaaaaaaaaaaa";
        s += "bbbbbbbbbb";
        s += "cccccccccccccccccccccccccccccccccccccccccccccccccc";
        if (s.capacity() > 1024) std::cout << "what?" << std::endl;  // never printed

        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}

int main()
{
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < 192; ++i) workers.push_back(std::thread(&run));

    workers.back().join();  // run() never returns, so this blocks forever
}

Compiled with: g++ --std=c++11 -fno-inline -g3 -pthread

With --pages-as-heap=yes:
100.00% (7,257,714,688B) (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
->99.75% (7,239,757,824B) 0x54E0679: mmap (mmap.c:34)
| ->53.63% (3,892,314,112B) 0x545C3CF: new_heap (arena.c:438)
| | ->53.63% (3,892,314,112B) 0x545CC1F: arena_get2.part.3 (arena.c:646)
| | ->53.63% (3,892,314,112B) 0x5463248: malloc (malloc.c:2911)
| | ->53.63% (3,892,314,112B) 0x4CB7E76: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->53.63% (3,892,314,112B) 0x4CF8E37: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->53.63% (3,892,314,112B) 0x4CF9C69: std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->53.63% (3,892,314,112B) 0x4CF9D22: std::string::reserve(unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->53.63% (3,892,314,112B) 0x4CF9FB1: std::string::append(char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->53.63% (3,892,314,112B) 0x401252: run() (test.cpp:11)
| | ->53.63% (3,892,314,112B) 0x403929: void std::_Bind_simple<void (*())()>::_M_invoke<>(std::_Index_tuple<>) (functional:1700)
| | ->53.63% (3,892,314,112B) 0x403864: std::_Bind_simple<void (*())()>::operator()() (functional:1688)
| | ->53.63% (3,892,314,112B) 0x4037D2: std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run() (thread:115)
| | ->53.63% (3,892,314,112B) 0x4CE2C7E: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->53.63% (3,892,314,112B) 0x51C96B8: start_thread (pthread_create.c:333)
| | ->53.63% (3,892,314,112B) 0x54E63DB: clone (clone.S:109)
| |
| ->35.14% (2,550,136,832B) 0x545C35B: new_heap (arena.c:427)
| | ->35.14% (2,550,136,832B) 0x545CC1F: arena_get2.part.3 (arena.c:646)
| | ->35.14% (2,550,136,832B) 0x5463248: malloc (malloc.c:2911)
| | ->35.14% (2,550,136,832B) 0x4CB7E76: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->35.14% (2,550,136,832B) 0x4CF8E37: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->35.14% (2,550,136,832B) 0x4CF9C69: std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->35.14% (2,550,136,832B) 0x4CF9D22: std::string::reserve(unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->35.14% (2,550,136,832B) 0x4CF9FB1: std::string::append(char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->35.14% (2,550,136,832B) 0x401252: run() (test.cpp:11)
| | ->35.14% (2,550,136,832B) 0x403929: void std::_Bind_simple<void (*())()>::_M_invoke<>(std::_Index_tuple<>) (functional:1700)
| | ->35.14% (2,550,136,832B) 0x403864: std::_Bind_simple<void (*())()>::operator()() (functional:1688)
| | ->35.14% (2,550,136,832B) 0x4037D2: std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run() (thread:115)
| | ->35.14% (2,550,136,832B) 0x4CE2C7E: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->35.14% (2,550,136,832B) 0x51C96B8: start_thread (pthread_create.c:333)
| | ->35.14% (2,550,136,832B) 0x54E63DB: clone (clone.S:109)
| |
| ->10.99% (797,306,880B) 0x51CA1D4: pthread_create@@GLIBC_2.2.5 (allocatestack.c:513)
| ->10.99% (797,306,880B) 0x4CE2DC1: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)()) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->10.99% (797,306,880B) 0x4CE2ECB: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->10.99% (797,306,880B) 0x401BEA: std::thread::thread<void (*)()>(void (*&&)()) (thread:138)
| ->10.99% (797,306,880B) 0x401353: main (test.cpp:24)
|
->00.25% (17,956,864B) in 1+ places, all below ms_print's threshold (01.00%)

And with --pages-as-heap=no:
96.38% (159,289B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->43.99% (72,704B) 0x4EBAEFE: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->43.99% (72,704B) 0x40106B8: call_init.part.0 (dl-init.c:72)
| ->43.99% (72,704B) 0x40107C9: _dl_init (dl-init.c:30)
| ->43.99% (72,704B) 0x4000C68: ??? (in /lib/x86_64-linux-gnu/ld-2.23.so)
|
->33.46% (55,296B) 0x40138A3: _dl_allocate_tls (dl-tls.c:322)
| ->33.46% (55,296B) 0x53D126D: pthread_create@@GLIBC_2.2.5 (allocatestack.c:588)
| ->33.46% (55,296B) 0x4EE9DC1: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)()) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->33.46% (55,296B) 0x4EE9ECB: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->33.46% (55,296B) 0x401BEA: std::thread::thread<void (*)()>(void (*&&)()) (thread:138)
| ->33.46% (55,296B) 0x401353: main (test.cpp:24)
|
->12.12% (20,025B) 0x4EFFE37: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->12.12% (20,025B) 0x4F00C69: std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->12.12% (20,025B) 0x4F00D22: std::string::reserve(unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->12.12% (20,025B) 0x4F00FB1: std::string::append(char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| ->12.07% (19,950B) 0x401285: run() (test.cpp:14)
| | ->12.07% (19,950B) 0x403929: void std::_Bind_simple<void (*())()>::_M_invoke<>(std::_Index_tuple<>) (functional:1700)
| | ->12.07% (19,950B) 0x403864: std::_Bind_simple<void (*())()>::operator()() (functional:1688)
| | ->12.07% (19,950B) 0x4037D2: std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run() (thread:115)
| | ->12.07% (19,950B) 0x4EE9C7E: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
| | ->12.07% (19,950B) 0x53D06B8: start_thread (pthread_create.c:333)
| | ->12.07% (19,950B) 0x56ED3DB: clone (clone.S:109)
| |
| ->00.05% (75B) in 1+ places, all below ms_print's threshold (01.00%)
|
->05.58% (9,216B) 0x40315B: __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) (new_allocator.h:104)
| ->05.58% (9,216B) 0x402FC2: std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, (__gnu_cxx::_Lock_policy)2> > >::allocate(std::allocator<std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, (__gnu_cxx::_Lock_policy)2> >&, unsigned long) (alloc_traits.h:488)
| ->05.58% (9,216B) 0x402D4B: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, std::_Bind_simple<void (*())()> >(std::_Sp_make_shared_tag, std::thread::_Impl<std::_Bind_simple<void (*())()> >*, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > > const&, std::_Bind_simple<void (*())()>&&) (shared_ptr_base.h:616)
| ->05.58% (9,216B) 0x402BDE: std::__shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*())()> >, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, std::_Bind_simple<void (*())()> >(std::_Sp_make_shared_tag, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > > const&, std::_Bind_simple<void (*())()>&&) (shared_ptr_base.h:1090)
| ->05.58% (9,216B) 0x402A76: std::shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*())()> > >::shared_ptr<std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, std::_Bind_simple<void (*())()> >(std::_Sp_make_shared_tag, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > > const&, std::_Bind_simple<void (*())()>&&) (shared_ptr.h:316)
| ->05.58% (9,216B) 0x402771: std::shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*())()> > > std::allocate_shared<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, std::_Bind_simple<void (*())()> >(std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > > const&, std::_Bind_simple<void (*())()>&&) (shared_ptr.h:594)
| ->05.58% (9,216B) 0x402325: std::shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*())()> > > std::make_shared<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::_Bind_simple<void (*())()> >(std::_Bind_simple<void (*())()>&&) (shared_ptr.h:610)
| ->05.58% (9,216B) 0x401F9C: std::shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*())()> > > std::thread::_M_make_routine<std::_Bind_simple<void (*())()> >(std::_Bind_simple<void (*())()>&&) (thread:196)
| ->05.58% (9,216B) 0x401BC4: std::thread::thread<void (*)()>(void (*&&)()) (thread:138)
| ->05.58% (9,216B) 0x401353: main (test.cpp:24)
|
->01.24% (2,048B) 0x402C9A: __gnu_cxx::new_allocator<std::thread>::allocate(unsigned long, void const*) (new_allocator.h:104)
->01.24% (2,048B) 0x402AF5: std::allocator_traits<std::allocator<std::thread> >::allocate(std::allocator<std::thread>&, unsigned long) (alloc_traits.h:488)
->01.24% (2,048B) 0x402928: std::_Vector_base<std::thread, std::allocator<std::thread> >::_M_allocate(unsigned long) (stl_vector.h:170)
->01.24% (2,048B) 0x40244E: void std::vector<std::thread, std::allocator<std::thread> >::_M_emplace_back_aux<std::thread>(std::thread&&) (vector.tcc:412)
->01.24% (2,048B) 0x40206D: void std::vector<std::thread, std::allocator<std::thread> >::emplace_back<std::thread>(std::thread&&) (vector.tcc:101)
->01.24% (2,048B) 0x401C82: std::vector<std::thread, std::allocator<std::thread> >::push_back(std::thread&&) (stl_vector.h:932)
->01.24% (2,048B) 0x401366: main (test.cpp:24)

Please ignore the sloppy handling of the threads; this is just a very short example.

Update

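It appears that this is not related to std::string at all. As @Lawrence suggested, it can be reproduced by simply allocating a single int on the heap (with new). I believe @Lawrence is very close to the real answer here, so I'm quoting his comments (easier for future readers):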

Lawrence:

@KirilKirov The string allocation is not actually taking that much space... Each thread gets it's initial stack and then heap access maps some large amount of space (around 70m) that gets inaccurately reflected. You can measure it by just declaring 1 string and then having a spin loop... the same virtual memory usage is shown – Lawrence Sep 28 at 14:51



Me:

@Lawrence - you're damn right! OK, so, you're saying (and it appears to be like this), that on each thread, on the first heap allocation, the memory manager (or the OS, or whatever) dedicates huge chunk of memory for the threads' heap needs? And this chunk will be reused later (or shrinked, if necessary)? – Kiril Kirov Sep 28 at 15:45



Lawrence:

@KirilKirov something of that nature... exact allocations probably depends on malloc implementation and whatnot – Lawrence 2 days ago
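
The reduced reproduction discussed in these comments could look roughly like the sketch below. It is not code from the original thread: the function name `allocate_one_int_per_thread` and the `hold` parameter are made up for illustration. Each thread performs exactly one tiny heap allocation and then stays alive for a while so that the process can be inspected with top, pmap or massif.

```cpp
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: spawns `thread_count` threads, each of which makes a
// single 4-byte heap allocation and then sleeps for `hold`.
// Returns the number of threads that stored an allocation.
std::size_t allocate_one_int_per_thread(unsigned thread_count,
                                        std::chrono::milliseconds hold)
{
    // One slot per thread; publishing the pointer here keeps the compiler
    // from optimizing the allocation away.
    std::vector<int*> slots(thread_count, nullptr);
    std::vector<std::thread> workers;
    workers.reserve(thread_count);

    for (unsigned i = 0; i < thread_count; ++i) {
        workers.emplace_back([&slots, hold, i] {
            slots[i] = new int(static_cast<int>(i));  // first allocation on this thread
            std::this_thread::sleep_for(hold);        // window for top / pmap
        });
    }
    for (std::thread& worker : workers) worker.join();

    std::size_t allocated = 0;
    for (int* slot : slots) {
        if (slot != nullptr) ++allocated;
        delete slot;
    }
    return allocated;
}
```

Calling it with 192 threads and a hold time of a minute or so should leave enough time to compare the virtual size of the process against the few hundred bytes actually requested.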

Best answer

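massif with --pages-as-heap=yes and the top column you are observing both measure the virtual memory used by a process. This virtual memory includes all the space mmap'd in the implementation of malloc and during the creation of threads. For example, the default stack size for a thread will be 8192k, which is reflected in the creation of each thread and contributes to the virtual memory footprint.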
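
To see the gap between virtual and resident memory from inside a process, one option on Linux is to read the `VmSize` and `VmRSS` lines of `/proc/self/status`. The sketch below is an illustration rather than part of the original answer; the helper names `status_value_kb` and `self_status_kb` are hypothetical.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Scans "Key:   <number> kB" lines (the /proc/<pid>/status layout) and
// returns the number for `key`, or -1 if the key is missing or malformed.
long status_value_kb(std::istream& in, const std::string& key)
{
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) != 0) continue;
        std::istringstream fields(line.substr(prefix.size()));
        long kb = -1;
        return (fields >> kb) ? kb : -1;
    }
    return -1;
}

// Convenience wrapper for the current process; returns -1 where
// /proc/self/status does not exist (for example, outside Linux).
long self_status_kb(const std::string& key)
{
    std::ifstream status("/proc/self/status");
    return status_value_kb(status, key);
}
```

`self_status_kb("VmSize")` corresponds to the VIRT column in top and `self_status_kb("VmRSS")` to RES. For the program in the question, the first would be expected to be in the gigabytes while the second stays small.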

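The specific allocation scheme depends on the implementation, but it seems that the first heap allocation on a new thread will mmap roughly 65 megabytes of space. This can be seen by looking at the pmap output for the process.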
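
The totals in the question's massif output are consistent with this explanation. The arithmetic below is a rough check, not something stated in the original: the counts of 95 thread stacks and 96 malloc heaps are back-solved from the reported byte totals, and the 4 KiB guard page per stack and the 64 MiB reservation per heap are assumptions about this particular glibc build.

```cpp
#include <cstdint>

constexpr std::uint64_t KiB = 1024;
constexpr std::uint64_t MiB = 1024 * KiB;

// Address space taken by thread stacks: each stack plus its guard region.
constexpr std::uint64_t stack_bytes(std::uint64_t threads,
                                    std::uint64_t stack_size,
                                    std::uint64_t guard_size)
{
    return threads * (stack_size + guard_size);
}

// Address space reserved by per-thread malloc heaps.
constexpr std::uint64_t heap_bytes(std::uint64_t heaps,
                                   std::uint64_t reservation)
{
    return heaps * reservation;
}

// pthread_create entry in the massif output: 797,306,880 B.
static_assert(stack_bytes(95, 8192 * KiB, 4 * KiB) == 797306880ULL,
              "stack total should match the massif output");

// The two new_heap entries: 3,892,314,112 B + 2,550,136,832 B.
static_assert(heap_bytes(96, 64 * MiB) == 3892314112ULL + 2550136832ULL,
              "heap total should match the massif output");

// Together they give the 7,239,757,824 B attributed to mmap.
static_assert(stack_bytes(95, 8192 * KiB, 4 * KiB) + heap_bytes(96, 64 * MiB)
                  == 7239757824ULL,
              "sum should match the mmap line");
```

Under these assumptions the roughly 7 GB figure is address space reserved for stacks and heaps, and the strings themselves play no visible part in it.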

An excerpt from a program very similar to the example:

75170:   ./a.out
0000000000400000 24K r-x-- a.out
0000000000605000 4K r---- a.out
0000000000606000 4K rw--- a.out
0000000001b6a000 200K rw--- [ anon ]
00007f669dfa4000 4K ----- [ anon ]
00007f669dfa5000 8192K rw--- [ anon ]
00007f669e7a5000 4K ----- [ anon ]
00007f669e7a6000 8192K rw--- [ anon ]
00007f669efa6000 4K ----- [ anon ]
00007f669efa7000 8192K rw--- [ anon ]
...
00007f66cb800000 8192K rw--- [ anon ]
00007f66cc000000 132K rw--- [ anon ]
00007f66cc021000 65404K ----- [ anon ]
00007f66d0000000 132K rw--- [ anon ]
00007f66d0021000 65404K ----- [ anon ]
00007f66d4000000 132K rw--- [ anon ]
00007f66d4021000 65404K ----- [ anon ]
...
00007f6880586000 8192K rw--- [ anon ]
00007f6880d86000 1056K r-x-- libm-2.23.so
00007f6880e8e000 2044K ----- libm-2.23.so
...
00007f6881c08000 4K r---- libpthread-2.23.so
00007f6881c09000 4K rw--- libpthread-2.23.so
00007f6881c0a000 16K rw--- [ anon ]
00007f6881c0e000 152K r-x-- ld-2.23.so
00007f6881e09000 24K rw--- [ anon ]
00007f6881e33000 4K r---- ld-2.23.so
00007f6881e34000 4K rw--- ld-2.23.so
00007f6881e35000 4K rw--- [ anon ]
00007ffe9d75b000 132K rw--- [ stack ]
00007ffe9d7f8000 12K r---- [ anon ]
00007ffe9d7fb000 8K r-x-- [ anon ]
ffffffffff600000 4K r-x-- [ anon ]
total 7815008K
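
An excerpt like the one above can be summarized mechanically. The following sketch (the `PmapSummary` type and `summarize_pmap` function are made-up names, and it only assumes the four-column `address sizeK perms name` layout shown here) splits the mapped sizes into regions with at least one permission bit and regions marked `-----`, which only reserve address space.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

struct PmapSummary {
    std::uint64_t accessible_kib = 0;  // mappings with at least one of r/w/x
    std::uint64_t reserved_kib = 0;    // "-----" mappings: address space only
};

// Sums the size column of pmap-style lines. Lines that do not look like
// "address <digits>K perms ..." (header, "...", "total") are skipped.
PmapSummary summarize_pmap(std::istream& in)
{
    PmapSummary summary;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string address, size, perms;
        if (!(fields >> address >> size >> perms)) continue;
        if (size.size() < 2 || size.back() != 'K') continue;

        std::uint64_t kib = 0;
        bool digits_only = true;
        for (std::size_t i = 0; i + 1 < size.size(); ++i) {
            if (size[i] < '0' || size[i] > '9') { digits_only = false; break; }
            kib = kib * 10 + static_cast<std::uint64_t>(size[i] - '0');
        }
        if (!digits_only) continue;

        if (perms.find_first_of("rwx") == std::string::npos)
            summary.reserved_kib += kib;
        else
            summary.accessible_kib += kib;
    }
    return summary;
}
```

For each pair such as `132K rw---` followed by `65404K -----`, the two sizes add up to 65536K (64 MiB), of which only the first 132K is accessible at that point.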

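It seems that malloc becomes more conservative as you approach some per-process virtual memory threshold. Also, my comment about libraries being mapped separately was misguided (they should be shared per process).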
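
One way to test whether per-thread malloc arenas account for the reserved space is to cap their number and measure again. The sketch below is not from the original answer and is specific to glibc: it relies on `mallopt` with the `M_ARENA_MAX` parameter from `<malloc.h>`, and the wrapper names `arena_tuning_available` and `limit_malloc_arenas` are hypothetical. On other C libraries it compiles to a stub that reports failure.

```cpp
#include <cstdlib>  // pulls in the libc feature macros such as __GLIBC__
#if defined(__GLIBC__)
#include <malloc.h>  // glibc-specific: mallopt, M_ARENA_MAX
#endif

// True when this build can use glibc's arena limit.
constexpr bool arena_tuning_available()
{
#if defined(__GLIBC__) && defined(M_ARENA_MAX)
    return true;
#else
    return false;
#endif
}

// Asks glibc malloc to use at most `max_arenas` arenas. Should be called
// early, before the worker threads are started. Returns false if the
// request was rejected or the tuning knob is not available.
bool limit_malloc_arenas(int max_arenas)
{
#if defined(__GLIBC__) && defined(M_ARENA_MAX)
    return max_arenas > 0 && mallopt(M_ARENA_MAX, max_arenas) == 1;
#else
    (void)max_arenas;
    return false;
#endif
}
```

If the explanation above holds, calling `limit_malloc_arenas(1)` at the start of `main` should make most of the 64 MiB regions disappear from the pmap output, at the cost of more lock contention between threads. glibc also reads a `MALLOC_ARENA_MAX` environment variable for the same purpose, which avoids changing the program.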

Regarding c++ - Understanding Linux virtual memory: valgrind's massif output shows major differences with and without --pages-as-heap, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52532893/
