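Memory Debugging

Goals

  • Capture the call stacks of a process's memory allocations and render each stack's share of memory as a flame graph;
  • Obtain the true in-use memory figure, i.e. excluding memory cached by tcmalloc/ptmalloc;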

How It Works

Replace glibc's ptmalloc with Google's tcmalloc, which places instrumentation hooks in its allocation APIs.
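To make the idea of a hook in the allocation API concrete, here is a minimal sketch. It is not how tcmalloc is implemented: tcmalloc records real stack traces, while this sketch uses a caller-supplied string tag, and `AllocTracker`, `Alloc`, `Free`, and `InUseBytes` are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical bookkeeping record: size and call-site tag of one live block.
struct LiveBlock {
  std::size_t size;
  std::string site;
};

// Wraps malloc/free and keeps a per-site total of bytes still in use.
class AllocTracker {
 public:
  void* Alloc(std::size_t size, const std::string& site) {
    void* p = std::malloc(size);
    if (p != nullptr) {
      std::lock_guard<std::mutex> lock(mu_);
      live_[p] = LiveBlock{size, site};
      in_use_by_site_[site] += size;
    }
    return p;
  }

  void Free(void* p) {
    if (p == nullptr) return;
    {
      std::lock_guard<std::mutex> lock(mu_);
      auto it = live_.find(p);
      if (it != live_.end()) {
        in_use_by_site_[it->second.site] -= it->second.size;
        live_.erase(it);
      }
    }
    std::free(p);
  }

  // Bytes allocated from `site` that have not been freed yet.
  std::size_t InUseBytes(const std::string& site) const {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = in_use_by_site_.find(site);
    return it == in_use_by_site_.end() ? 0 : it->second;
  }

 private:
  mutable std::mutex mu_;
  std::unordered_map<void*, LiveBlock> live_;
  std::map<std::string, std::size_t> in_use_by_site_;
};
```

A heap profile is essentially this per-site table, keyed by full call stack instead of a tag, written to disk at intervals.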

Practice

Dependencies

  • Flame graph: https://github.com/brendangregg/FlameGraph.git
  • gperftools: https://github.com/gperftools/gperftools.git. In this article gperftools is installed as follows:

# gperftools install path
gperf_install_base_path="/var/.gperftools/release"

# Build and install
cd gperftools && ./autogen.sh && ./configure --prefix="${gperf_install_base_path}/" && make all -j2 && sudo make install

Test Code

Save the following code as t_gperf_tools.cc

#include <malloc.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <iostream>
#include <map>
#include <memory>
#include <thread>
#include <vector>

// Set to 0 to build without gperftools.
#define GPERFTOOLS_EN (1)
#if GPERFTOOLS_EN == 1
#include "gperftools/heap-profiler.h"
#include "gperftools/malloc_extension.h"
#endif

// Deliberately leaks one 8 MiB block per second through the C allocation API.
void MallocTestC() {
  const size_t kCount = 1024 * 1024;
  uint64_t *a;
  while (1) {
    a = (uint64_t *)calloc(kCount, sizeof(uint64_t));
    sleep(1);
    printf("Leak size %zu MB for %p\n",
           kCount * sizeof(uint64_t) / (1024 * 1024), (void *)a);
  }
}

// Grows a vector of maps, then drops everything once it holds 40 entries.
void MallocTestCPP() {
  static std::vector<std::map<uint64_t, std::shared_ptr<std::vector<uint64_t>>>>
      map_vec;
  while (1) {
    if (map_vec.size() >= 40) {
      malloc_stats();  // first call: before the containers are released
      map_vec.clear();
      map_vec.shrink_to_fit();
      // MallocExtension::instance()->ReleaseFreeMemory();
      malloc_stats();  // second call: after clear() and shrink_to_fit()
    }

    std::map<uint64_t, std::shared_ptr<std::vector<uint64_t>>> map;
    for (size_t i = 0U; i < 200; i++) {
      std::shared_ptr<std::vector<uint64_t>> vec =
          std::make_shared<std::vector<uint64_t>>();
      //   vec->reserve(1024U);
      vec->resize(1024U);
      map.emplace(i, vec);
    }
    map_vec.emplace_back(map);
    usleep(200000);
  }
}

int main(int argc, char **argv) {
  (void)argc;
  (void)argv;

#if GPERFTOOLS_EN == 1
  HeapProfilerStart("/tmp/t_gperf_tools_O0");
#endif

  std::thread thr{MallocTestCPP};
  MallocTestC();  // loops forever; stop the program with Ctrl-C
  thr.join();

#if GPERFTOOLS_EN == 1
  HeapProfilerStop();
#endif
  return 0;
}

 

Memory Flame Graph

To get the most complete call stacks, compile with -O0.

# Compile
g++ t_gperf_tools.cc -O0 -g -o t_gperf_tools -lpthread -ltcmalloc -L/var/.gperftools/release/lib -I/var/.gperftools/release/include

# Run, dumping a HeapProfiler file every 1 second
LD_LIBRARY_PATH=/var/.gperftools/release/lib HEAP_PROFILE_TIME_INTERVAL="1" ./t_gperf_tools

Once the program is running, the log shows the HeapProfiler files being written:

Starting tracking the heap
Dumping heap profile to /tmp/t_gperf_tools_O0.0001.heap (1667187833 sec since the last dump)
Dumping heap profile to /tmp/t_gperf_tools_O0.0002.heap (1 sec since the last dump)
Leak size 8 MB for 0x562deff26000
Dumping heap profile to /tmp/t_gperf_tools_O0.0003.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df0f1c000
Dumping heap profile to /tmp/t_gperf_tools_O0.0004.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df1f0a000
Dumping heap profile to /tmp/t_gperf_tools_O0.0005.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df2ef4000
Dumping heap profile to /tmp/t_gperf_tools_O0.0006.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df3d50000
Dumping heap profile to /tmp/t_gperf_tools_O0.0007.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df4d3c000
Dumping heap profile to /tmp/t_gperf_tools_O0.0008.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df5d26000
Dumping heap profile to /tmp/t_gperf_tools_O0.0009.heap (1 sec since the last dump)
Leak size 8 MB for 0x562df6b80000
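The dump names in the log above follow the pattern prefix, dot, four-digit sequence number, then ".heap". When scripting around the profiler, for example to pick the Nth dump to analyze, a small helper can rebuild the path. `HeapDumpPath` is a hypothetical name, and the pattern is inferred only from the log shown here.

```cpp
#include <cstdio>
#include <string>

// Builds "<prefix>.<NNNN>.heap", matching the file names seen in the log.
std::string HeapDumpPath(const std::string& prefix, int sequence) {
  char suffix[32];
  std::snprintf(suffix, sizeof(suffix), ".%04d.heap", sequence);
  return prefix + suffix;
}
```

For instance, `HeapDumpPath("/tmp/t_gperf_tools_O0", 9)` yields the name of the last dump in the log.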

Generating the Flame Graph

# Convert a HeapProfiler file into collapsed stacks
/var/.gperftools/release/bin/pprof --collapsed ./t_gperf_tools /tmp/t_gperf_tools_O0.0009.heap > gperf.stacks

# Generate the flame graph
cat gperf.stacks | /home/user/disk/prjs/perf/mdc_perf/perf/FlameGraph/flamegraph.pl --color=mem --title="malloc() Flame Graph" --countname="calls" > gperf.svg

# Open the flame graph
google-chrome gperf.svg

Read alongside the code, the resulting flame graph shows clearly where the memory is being consumed.
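The collapsed-stack file consumed by flamegraph.pl holds one stack per line: frames separated by semicolons, outermost caller first, followed by a space and a numeric count. The sketch below parses one such line, which can help when filtering or sanity-checking gperf.stacks before rendering. `FoldedStack` and `ParseFoldedLine` are hypothetical names, and the sample line in the usage note is made up for illustration rather than taken from a real profile.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

struct FoldedStack {
  std::vector<std::string> frames;  // outermost caller first
  std::uint64_t count = 0;          // value attributed to this stack
};

// Parses "frameA;frameB;frameC 123". Returns false on a malformed line.
bool ParseFoldedLine(const std::string& line, FoldedStack* out) {
  // The count follows the last space; frame names may contain spaces.
  const std::size_t space = line.rfind(' ');
  if (space == std::string::npos || space == 0 || space + 1 >= line.size()) {
    return false;
  }
  const std::string count_text = line.substr(space + 1);
  if (count_text.find_first_not_of("0123456789") != std::string::npos) {
    return false;
  }
  out->count = std::stoull(count_text);

  out->frames.clear();
  std::stringstream stack(line.substr(0, space));
  std::string frame;
  while (std::getline(stack, frame, ';')) {
    out->frames.push_back(frame);
  }
  return !out->frames.empty();
}
```

Given a line such as `main;MallocTestC;calloc 8388608`, the parser returns three frames and a count of 8388608.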

 

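Actual In-Use Memory of a Process

malloc_stats

When an application requests memory, the usual path goes through glibc and then into the kernel via a syscall. To cut down on the syscalls needed to allocate and free memory, glibc's memory manager keeps a layer of cached memory.

Both glibc's default ptmalloc and Google's tcmalloc provide the "malloc_stats" API, which reports the process's total memory footprint and the amount actually in use. Subtracting the second from the first gives the amount of memory the allocator is holding as cache for this process.

Taking tcmalloc as the example, compare the output of the two malloc_stats() calls in MallocTestCPP, one before map_vec.clear() and one after shrink_to_fit(). Bytes in use drop from 135.4 MiB to 72.1 MiB, but the page heap freelist grows from 0.4 MiB to 58.0 MiB and "Actual memory used" stays at 138.6 MiB. So even after shrink_to_fit, the memory the application freed has only gone back to the page heap freelist, not to the OS.

# Output of the first malloc_stats() call (before map_vec.clear())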
------------------------------------------------
MALLOC:      142024744 (  135.4 MiB) Bytes in use by application
MALLOC: +       442368 (    0.4 MiB) Bytes in page heap freelist
MALLOC: +       120760 (    0.1 MiB) Bytes in central cache freelist
MALLOC: +            0 (    0.0 MiB) Bytes in transfer cache freelist
MALLOC: +        18464 (    0.0 MiB) Bytes in thread cache freelists
MALLOC: +      2752512 (    2.6 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =    145358848 (  138.6 MiB) Actual memory used (physical + swap)
MALLOC: +            0 (    0.0 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    145358848 (  138.6 MiB) Virtual address space used
MALLOC:
MALLOC:           4139              Spans in use
MALLOC:              3              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

# Output of the second malloc_stats() call (after shrink_to_fit())
------------------------------------------------
MALLOC:       75589672 (   72.1 MiB) Bytes in use by application
MALLOC: +     60833792 (   58.0 MiB) Bytes in page heap freelist
MALLOC: +      1231224 (    1.2 MiB) Bytes in central cache freelist
MALLOC: +      1277952 (    1.2 MiB) Bytes in transfer cache freelist
MALLOC: +      3673696 (    3.5 MiB) Bytes in thread cache freelists
MALLOC: +      2752512 (    2.6 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =    145358848 (  138.6 MiB) Actual memory used (physical + swap)
MALLOC: +            0 (    0.0 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    145358848 (  138.6 MiB) Virtual address space used
MALLOC:
MALLOC:            596              Spans in use
MALLOC:              3              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
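The arithmetic behind the two dumps can be checked directly. The sketch below copies the byte counts from the output above and sums the freelist rows; the struct and function names (`TcmallocStats`, `CachedBytes`, `ActualMemoryUsed`) are hypothetical, chosen only for this illustration.

```cpp
#include <cstdint>

// One malloc_stats() report, in bytes, using the rows shown above.
struct TcmallocStats {
  std::uint64_t in_use_by_app;
  std::uint64_t page_heap_free;
  std::uint64_t central_cache_free;
  std::uint64_t transfer_cache_free;
  std::uint64_t thread_cache_free;
  std::uint64_t metadata;
};

// Memory held by tcmalloc's freelists: freed by the app, not returned to the OS.
constexpr std::uint64_t CachedBytes(const TcmallocStats& s) {
  return s.page_heap_free + s.central_cache_free + s.transfer_cache_free +
         s.thread_cache_free;
}

// Reproduces the "Actual memory used (physical + swap)" row.
constexpr std::uint64_t ActualMemoryUsed(const TcmallocStats& s) {
  return s.in_use_by_app + CachedBytes(s) + s.metadata;
}

// Values copied from the first and second dump respectively.
constexpr TcmallocStats kBeforeClear{142024744, 442368, 120760, 0, 18464, 2752512};
constexpr TcmallocStats kAfterClear{75589672, 60833792, 1231224, 1277952, 3673696, 2752512};
```

Both reports sum to the same 145358848 bytes of actual memory, while the cached portion grows from 581592 bytes (about 0.6 MiB) to 67016664 bytes (about 63.9 MiB). That matches the drop in "Bytes in use by application" of 66435072 bytes.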

MallocExtension::instance()->ReleaseFreeMemory()

When an application wants to hand the allocator's cached memory back to the kernel manually, tcmalloc provides the ReleaseFreeMemory API.
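A minimal sketch of calling it follows, using the same GPERFTOOLS_EN switch as the test program. Here the switch defaults to 0, so the snippet also compiles without gperftools; build with -DGPERFTOOLS_EN=1 and link -ltcmalloc to enable the real calls. `PageHeapFreeBytes` and `ReleaseAllocatorCache` are hypothetical helper names. The "tcmalloc.pageheap_free_bytes" property is read through gperftools' `MallocExtension::GetNumericProperty`.

```cpp
#include <cstddef>
#include <cstdio>

#ifndef GPERFTOOLS_EN
#define GPERFTOOLS_EN 0  // assumption: disabled unless set on the command line
#endif

#if GPERFTOOLS_EN == 1
#include "gperftools/malloc_extension.h"
#endif

// Bytes sitting in tcmalloc's page heap freelist; 0 when gperftools is disabled.
std::size_t PageHeapFreeBytes() {
#if GPERFTOOLS_EN == 1
  std::size_t value = 0;
  if (MallocExtension::instance()->GetNumericProperty(
          "tcmalloc.pageheap_free_bytes", &value)) {
    return value;
  }
#endif
  return 0;
}

// Asks tcmalloc to return freelist memory to the OS and prints the effect.
// Returns false when built without gperftools.
bool ReleaseAllocatorCache() {
#if GPERFTOOLS_EN == 1
  const std::size_t before = PageHeapFreeBytes();
  MallocExtension::instance()->ReleaseFreeMemory();
  const std::size_t after = PageHeapFreeBytes();
  std::printf("page heap freelist: %zu -> %zu bytes\n", before, after);
  return true;
#else
  return false;
#endif
}
```

In the test program, the natural place for such a call is right after map_vec.shrink_to_fit(), where the commented-out ReleaseFreeMemory() line already sits.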

CPU Performance Debugging

Original article: http://www.cnblogs.com/zengjianrong/p/16843885.html
