Trace Malloc Tools
Chris Waterson <waterson@netscape.com>
November 27, 2000

This is a short primer on how to use the `trace malloc' tools
contained in this directory.


WHAT IS TRACE MALLOC?
=====================

Trace malloc is an optional facility that is built into XPCOM. It
uses `weak linking' to intercept all calls to malloc(), calloc(),
realloc() and free(). It does two things:

1. Writes information about allocations to a filehandle that you
specify. As each call to malloc() et al. is made, a record is logged
to the filehandle.

2. Maintains a table of all `live objects' -- that is, objects that
have been allocated by malloc(), calloc() or realloc(), but have not
yet been free()'d. The contents of this table can be dumped by making
a `secret' call to JavaScript.
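
To make the `live object' table concrete, here is a minimal Python
model of the bookkeeping described above. It is only a sketch: the
real table is maintained in C inside XPCOM, and the event tuples
(including the (old, new) address pair used for realloc) are a
made-up stand-in for the intercepted calls.

```python
# Illustrative model only; the real bookkeeping lives in C inside XPCOM.
# The event tuples are a hypothetical stand-in for intercepted calls.

def track_live_objects(events):
    """Replay (op, address, size) events and return the live-object table."""
    live = {}  # address -> size of every block that has not been freed
    for op, address, size in events:
        if op in ("malloc", "calloc"):
            live[address] = size
        elif op == "realloc":
            # Hypothetical convention: address is an (old, new) pair.
            old, new = address
            live.pop(old, None)
            live[new] = size
        elif op == "free":
            live.pop(address, None)
    return live


events = [
    ("malloc", 0x1000, 16),
    ("calloc", 0x2000, 64),
    ("realloc", (0x1000, 0x3000), 32),
    ("free", 0x2000, 0),
]
live = track_live_objects(events)
for address, size in live.items():
    print(hex(address), size)
```

Whatever is left in the table when you ask for a dump is, by
definition, a live object.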


MAKING A TRACE MALLOC BUILD
===========================

As of this writing, trace malloc only works on Linux, but work is
underway to port it to Windows.

On Linux, start with a clean tree, and configure your build with the
following flags:

  --enable-trace-malloc
  --enable-cpp-rtti

Be sure that `--enable-boehm' is *not* set. I don't think that the
values for `--enable-debug' and `--enable-optimize' matter, but I've
typically had debug on and optimize off.


COLLECTING LIVE OBJECT DATA
===========================

To collect `live object' data from `mozilla' using a build that has
trace malloc enabled:

1. Run `mozilla' as follows:

  % mozilla --trace-malloc /dev/null

2. Do whatever operations in mozilla you'd like to test.

3. Open the `live-bloat.html' file contained in this directory.

4. Press the button that says `Dump to /tmp/dump.log'.

An enormous file (typically 300MB) called `dump.log' will be dropped
in your `/tmp' directory.

To collect `live object' data from `gtkEmbed' using a build that has
trace malloc enabled:

1. Run `gtkEmbed' as follows:

  % gtkEmbed --trace-malloc /dev/null

2. Do whatever operations in gtkEmbed you'd like to test.

3. Press the `Dump Memory' button at the bottom of gtkEmbed.

The enormous file will be dropped in the current directory, and is
called `allocations.log'.


About Live Object Logs
----------------------

A typical entry from the `live object' dump file will look like:

  Address    Type       Size
  |          |          |
  v          v          v
  0x40008080 <nsFooBar> 16
          0x00000001  <- Fields
          0x40008084
          0x80004001
          0x00000036
  __builtin_new[./libxpcom.so +0x10E9DC]  <- Stack at allocation time
  nsFooBar::CreateFooBar(nsFooBar **)[./libfoobar.so +0x408C]
  main+C7E5AFB5[(null) +0xC7E5AFB5]

One such entry is printed for each live object.
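
If you want to poke at a dump without the Perl tools, the entry layout
above is easy to parse. The sketch below is an illustration, not a
port of any script in this directory. It assumes that the header line
has exactly the `address <type> size' shape shown in the sample, that
field lines are bare hex words, that every other non-blank line is a
stack frame, and that the `<-' annotations do not appear in a real
dump.

```python
import re

# Assumed header shape, taken from the sample entry: "0xADDR <Type> SIZE".
HEADER = re.compile(r"^(0x[0-9A-Fa-f]+)\s+<(.+)>\s+(\d+)$")
# Assumed field shape: a single bare hex word on its own line.
FIELD = re.compile(r"^0x[0-9A-Fa-f]+$")


def parse_entries(lines):
    """Yield one dict per entry: address, type, size, fields, stack."""
    entry = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            if entry is not None:
                yield entry
            entry = {
                "address": int(header.group(1), 16),
                "type": header.group(2),
                "size": int(header.group(3)),
                "fields": [],
                "stack": [],
            }
        elif entry is None:
            continue  # ignore anything before the first header
        elif FIELD.match(line):
            entry["fields"].append(int(line, 16))
        else:
            entry["stack"].append(line)
    if entry is not None:
        yield entry


sample = """\
0x40008080 <nsFooBar> 16
        0x00000001
        0x40008084
        0x80004001
        0x00000036
__builtin_new[./libxpcom.so +0x10E9DC]
nsFooBar::CreateFooBar(nsFooBar **)[./libfoobar.so +0x408C]
main+C7E5AFB5[(null) +0xC7E5AFB5]
"""
entries = list(parse_entries(sample.splitlines()))
```

Because parse_entries() is a generator, you can feed it an open file
and walk a 300MB dump one entry at a time instead of loading it all
into memory.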


TOOLS TO PARSE LIVE OBJECT LOGS
===============================

This directory is meant to house the tools that you can use to parse
live-object logs.

Object Histograms - histogram.pl
--------------------------------

This program parses a `live object' dump and produces a histogram of
the objects, sorted from objects that take the most memory to objects
that take the least. The output of this program is rather spartan: on
each line, it prints the object type, the number of objects of that
type, and the total number of bytes that the objects consume.
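
The aggregation itself is simple. Here is a rough Python equivalent of
the idea, not a transcription of histogram.pl: it takes (type, size)
pairs, which I'm assuming you have already pulled out of the dump, and
the sample data is made up.

```python
from collections import defaultdict


def histogram(objects):
    """Return [(type, count, bytes)] sorted by total bytes, largest first."""
    totals = defaultdict(lambda: [0, 0])  # type -> [count, bytes]
    for type_name, size in objects:
        totals[type_name][0] += 1
        totals[type_name][1] += size
    return sorted(
        ((name, count, nbytes) for name, (count, nbytes) in totals.items()),
        key=lambda row: row[2],
        reverse=True,
    )


# Made-up (type, size) pairs, purely for illustration.
objects = [("nsFooBar", 16), ("void*", 64), ("nsFooBar", 16), ("void*", 8)]
rows = histogram(objects)
for name, count, nbytes in rows:
    print(name, count, nbytes)
```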

There are two simple programs to `pretty print' the output from
histogram.pl:

1. histogram-pretty.sh takes a single histogram and produces a table
of objects.

  Type          Count    Bytes %Total
  TOTAL         67348  4458127 100.00
  nsImageGTK       76   679092  15.23
  void*          8956   563572  12.64
  ...
  PRLock          732    61488   1.38
  OTHER         24419   940235  21.09

2. histogram-diff.sh takes two histograms and computes the difference
between them.

                ---- Base ----    ---- Incr ----    ----- Difference ----
  Type          Count    Bytes    Count    Bytes    Count    Bytes %Total
  TOTAL         40241  1940945    73545  5315142    33304  3374197 100.00
  nsImageGTK       16   106824      151   832816      135   725992  21.52
  PresShell        16    51088      198   340706      182   289618   8.58
  ...
  OTHER         27334  1147033    38623  1493385    11289   346352  10.26

Both of these scripts accept a `-c' parameter that specifies how many
rows you'd like to see (by default, twenty). Any rows past the first
`n' rows are lumped into a single `OTHER' row. This allows you to keep
your reports short n' sweet.
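
To show how the `TOTAL', `OTHER' and `%Total' columns fit together,
here is a small Python sketch of the row-limiting step. It is my own
approximation of what the pretty printer does, not the shell script
itself; the function name summarize() and its `limit' argument (which
plays the role of `-c') are invented for the example.

```python
def summarize(rows, limit=20):
    """Keep the `limit` largest rows and fold the rest into OTHER.

    rows: (type, count, bytes) tuples sorted by bytes, largest first.
    Returns (type, count, bytes, percent_of_total_bytes) tuples, with a
    TOTAL row first and, if anything was folded, an OTHER row last.
    """
    total_count = sum(count for _, count, _ in rows)
    total_bytes = sum(nbytes for _, _, nbytes in rows)
    kept, rest = rows[:limit], rows[limit:]
    table = [("TOTAL", total_count, total_bytes)] + list(kept)
    if rest:
        other_count = sum(count for _, count, _ in rest)
        other_bytes = sum(nbytes for _, _, nbytes in rest)
        table.append(("OTHER", other_count, other_bytes))
    return [
        (name, count, nbytes,
         100.0 * nbytes / total_bytes if total_bytes else 0.0)
        for name, count, nbytes in table
    ]


# Made-up histogram rows, already sorted by bytes.
rows = [("A", 1, 60), ("B", 2, 30), ("C", 3, 10)]
table = summarize(rows, limit=1)
for name, count, nbytes, percent in table:
    print(f"{name:<12}{count:>7}{nbytes:>9}{percent:>7.2f}")
```

The same folding applies to the diff report; there the percentages are
taken against the total difference in bytes rather than the total
bytes.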

Stack-based Type Inference - types.dat
--------------------------------------

Trace malloc uses `speculative RTTI' to determine the types of objects
as it dumps them. Unfortunately, RTTI can only deduce the type name
for C++ objects with a virtual destructor.

This leaves:

  . C++ objects without a virtual destructor,
  . array-allocated C++ objects, and
  . objects allocated with the C runtime functions (malloc
    and friends)

out in the cold. Trace malloc reports objects allocated this way as
having type `void*'.

The good news is that you can almost always determine the object's
type by looking at the stack trace that's taken at the time the object
is allocated.

The file `types.dat' consists of rules to classify objects based on
stack trace.
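
As an illustration of the idea (and only that), here is a Python
sketch of stack-based classification. The rule representation is
hypothetical and is NOT the real `types.dat' syntax, which this primer
does not describe; the type names and most of the frames are invented
too. Each rule lists substrings that must show up in the stack, in
order, starting from the innermost frame.

```python
# Hypothetical rule representation; this is NOT the real types.dat syntax.
# Each rule names a type and lists substrings that must appear in the
# stack trace, in order from the innermost frame outward.
RULES = [
    ("nsFooBar-buffer", ["PR_Malloc", "nsFooBar::GrowBuffer"]),
    ("nsFooBar-array", ["nsFooBar::CreateArray"]),
]


def matches(patterns, stack):
    """True if each pattern is found in some frame, preserving order."""
    frames = iter(stack)  # shared iterator, so later patterns search deeper
    return all(any(p in frame for frame in frames) for p in patterns)


def classify(reported_type, stack, rules=RULES):
    """Return a refined type for `void*' objects; leave others untouched."""
    if reported_type != "void*":
        return reported_type
    for type_name, patterns in rules:
        if matches(patterns, stack):
            return type_name
    return reported_type


# Invented stack trace in the style of the dump format.
stack = [
    "PR_Malloc[./libnspr4.so +0x1234]",
    "nsFooBar::GrowBuffer(unsigned int)[./libfoobar.so +0x40F0]",
    "main+C7E5AFB5[(null) +0xC7E5AFB5]",
]
print(classify("void*", stack))
```

Objects whose type RTTI already worked out are passed through
unchanged; only the `void*' ones are looked up in the rules.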


Uncategorized Objects - uncategorized.pl
----------------------------------------

Categorizing objects in `types.dat' is sweaty work, and the
`uncategorized.pl' script is a tool that makes it a bit
easier. Specifically, it reads a `live object' dump file and sorts the
stack traces. Stack traces that account for the most uncategorized
objects are placed first.

Using this tool, you can add the `most effective' rules to
`types.dat': rules that account for most of the uncategorized data.
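
The ranking step can be sketched in a few lines of Python. This is a
guess at the approach, not a port of `uncategorized.pl': it assumes
each entry is a dict with `type' and `stack' keys (a made-up
structure), and that `uncategorized' means the type is still `void*'.

```python
from collections import Counter


def rank_uncategorized(entries):
    """Return [(stack, count)] for `void*' entries, most common first."""
    counts = Counter(
        tuple(entry["stack"]) for entry in entries if entry["type"] == "void*"
    )
    return counts.most_common()


# Made-up entries, purely for illustration.
entries = [
    {"type": "void*", "stack": ["malloc", "nsFooBar::GrowBuffer", "main"]},
    {"type": "void*", "stack": ["malloc", "nsFooBar::GrowBuffer", "main"]},
    {"type": "void*", "stack": ["malloc", "main"]},
    {"type": "nsFooBar", "stack": ["__builtin_new", "main"]},
]
ranking = rank_uncategorized(entries)
for stack, count in ranking:
    print(count, " <- ".join(stack))
```

The stack at the top of the list is the one where a new rule would
categorize the most objects.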