--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/trace-malloc/README Wed Dec 31 06:09:35 2014 +0100
@@ -0,0 +1,177 @@
                           Trace Malloc Tools
                 Chris Waterson <waterson@netscape.com>
                           November 27, 2000

This is a short primer on how to use the `trace malloc' tools
contained in this directory.


WHAT IS TRACE MALLOC?
=====================

Trace malloc is an optional facility that is built into XPCOM. It
uses `weak linking' to intercept all calls to malloc(), calloc(),
realloc() and free(). It does two things:

1. Writes information about allocations to a filehandle that you
specify. As each call to malloc(), et al. is made, a record is logged
to the filehandle.

2. Maintains a table of all `live objects' -- that is, objects that
have been allocated by malloc(), calloc() or realloc(), but have not
yet been free()'d. The contents of this table can be dumped by making
a `secret' call to JavaScript.


MAKING A TRACE MALLOC BUILD
===========================

As of this writing, trace malloc only works on Linux, but work is
underway to port it to Windows.

On Linux, start with a clean tree, and configure your build with the
following flags:

  --enable-trace-malloc
  --enable-cpp-rtti

Be sure that `--enable-boehm' is *not* set. I don't think that the
values for `--enable-debug' and `--enable-optimize' matter, but I've
typically had debug on and optimize off.


COLLECTING LIVE OBJECT DATA
===========================

To collect `live object' data from `mozilla' using a build that has
trace malloc enabled:

  1. Run `mozilla' as follows:

     % mozilla --trace-malloc /dev/null

  2. Do whatever operations in mozilla you'd like to test.

  3. Open the `live-bloat.html' file contained in this directory.

  4. Press the button that says `Dump to /tmp/dump.log'.

An enormous file (typically 300MB) called `dump.log' will be dropped
in your `/tmp' directory.

To collect `live object' data from `gtkEmbed' using a build that has
trace malloc enabled:

  1. Run `gtkEmbed' as follows:

     % gtkEmbed --trace-malloc /dev/null

  2. Do whatever operations in gtkEmbed you'd like to test.

  3. Press the `Dump Memory' button at the bottom of gtkEmbed.

The enormous file will be dropped in the current directory, and is
called `allocations.log'.


About Live Object Logs
----------------------

A typical entry from the `live object' dump file will look like:

    Address     Type     Size
       |          |        |
       v          v        v
  0x40008080 <nsFooBar> 16
          0x00000001  <- Fields
          0x40008084
          0x80004001
          0x00000036
  __builtin_new[./libxpcom.so +0x10E9DC]  <- Stack at allocation time
  nsFooBar::CreateFooBar(nsFooBar **)[./libfoobar.so +0x408C]
  main+C7E5AFB5[(null) +0xC7E5AFB5]

One such entry is printed for each live object.


TOOLS TO PARSE LIVE OBJECT LOGS
===============================

This directory is meant to house the tools that you can use to parse
live-object logs.

Object Histograms - histogram.pl
--------------------------------

This program parses a `live object' dump and produces a histogram of
the objects, sorted from objects that take the most memory to objects
that take the least. The output of this program is rather spartan: on
each line, it prints the object type, the number of objects of that
type, and the total number of bytes that the objects consume.
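
The histogram step described above can be sketched in a few lines of
Python. This is not histogram.pl itself, only an illustration of the
idea. It assumes that every entry starts with a header line of the
form `ADDRESS <TYPE> SIZE', as in the sample entry shown earlier, and
that field words and stack frames never have that shape. The names
`HEADER', `histogram' and `SAMPLE' are made up for this example.

```python
import re
from collections import defaultdict

# Assumed header shape: address, <type>, size, e.g. "0x40008080 <nsFooBar> 16".
HEADER = re.compile(r"^\s*0x[0-9A-Fa-f]+\s+<(.*)>\s+(\d+)\s*$")


def histogram(lines):
    """Return (type, count, bytes) rows, largest total bytes first."""
    totals = defaultdict(lambda: [0, 0])  # type -> [count, bytes]
    for line in lines:
        match = HEADER.match(line)
        if match is None:
            continue  # field word or stack frame; not needed here
        entry = totals[match.group(1)]
        entry[0] += 1
        entry[1] += int(match.group(2))
    rows = [(name, count, size) for name, (count, size) in totals.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Hand-written sample in the assumed format (not real dump output).
SAMPLE = """\
0x40008080 <nsFooBar> 16
        0x00000001
        0x40008084
__builtin_new[./libxpcom.so +0x10E9DC]
0x40009000 <void*> 64
        0x00000000
malloc[./libxpcom.so +0x10E000]
0x4000a000 <nsFooBar> 16
        0x00000002
__builtin_new[./libxpcom.so +0x10E9DC]
"""

if __name__ == "__main__":
    for name, count, size in histogram(SAMPLE.splitlines()):
        print(name, count, size)
```

Because the function takes any iterable of lines, an open file object
can be passed in directly, so a 300MB dump does not have to be read
into memory at once.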

There are two simple programs to `pretty print' the output from
histogram.pl:

  1. histogram-pretty.sh takes a single histogram and produces a table
     of objects.

       Type                    Count    Bytes %Total
       TOTAL                   67348  4458127 100.00
       nsImageGTK                 76   679092  15.23
       void*                    8956   563572  12.64
       ...
       PRLock                    732    61488   1.38
       OTHER                   24419   940235  21.09

  2. histogram-diff.sh takes two histograms and computes the difference
     between them.

                     ---- Base ----   ---- Incr ----   ----- Difference ----
       Type          Count    Bytes   Count    Bytes   Count    Bytes %Total
       TOTAL         40241  1940945   73545  5315142   33304  3374197 100.00
       nsImageGTK       16   106824     151   832816     135   725992  21.52
       PresShell        16    51088     198   340706     182   289618   8.58
       ...
       OTHER         27334  1147033   38623  1493385   11289   346352  10.26

Both of these scripts accept a `-c' parameter that specifies how many
rows you'd like to see (by default, twenty). Any rows past the first
`n' rows are lumped into a single `OTHER' row. This allows you to keep
your reports short n' sweet.

Stack-based Type Inference - types.dat
--------------------------------------

Trace malloc uses `speculative RTTI' to determine the types of objects
as it dumps them. Unfortunately, RTTI can only deduce the type name
for C++ objects with a virtual destructor.

This leaves:

  . C++ objects without a virtual destructor,
  . array-allocated C++ objects, and
  . objects allocated with the C runtime functions (malloc
    and friends)

out in the cold. Trace malloc reports objects allocated this way as
having type `void*'.

The good news is that you can almost always determine the object's
type by looking at the stack trace that's taken at the time the object
is allocated.
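
To make the idea of inferring a type from the allocation stack
concrete, here is a minimal Python sketch. Everything in it is an
assumption made for illustration: the rule representation (a type name
plus the names of the innermost frames that must match), the rule
contents, and the helper names `frame_name' and `infer_type'. It does
not reproduce the syntax of `types.dat' or the logic of the real
tools.

```python
# Hypothetical rules: (type name, names expected for the innermost frames).
# These are invented for the example and are NOT the types.dat syntax.
RULES = [
    ("nsFooBarBuffer", ["malloc", "nsFooBar::GrowBuffer"]),
    ("nsFooBarTable", ["malloc", "nsFooBar::BuildTable"]),
]


def frame_name(frame):
    """Drop the trailing "[library +0xOFFSET]" part of a stack frame."""
    return frame.rsplit("[", 1)[0].strip()


def infer_type(reported_type, stack, rules=RULES):
    """Return a better name for a void* object if a rule matches its stack."""
    if reported_type != "void*":
        return reported_type  # RTTI already produced a type name
    names = [frame_name(frame) for frame in stack]
    for type_name, prefix in rules:
        if len(names) >= len(prefix) and all(
            name.startswith(want) for name, want in zip(names, prefix)
        ):
            return type_name
    return reported_type  # still uncategorized


if __name__ == "__main__":
    stack = [
        "malloc[./libc.so +0x1000]",
        "nsFooBar::GrowBuffer(unsigned int)[./libfoobar.so +0x2000]",
        "main+C7E5AFB5[(null) +0xC7E5AFB5]",
    ]
    print(infer_type("void*", stack))
```

Matching on a prefix of the frame name lets one rule cover a function
regardless of how its argument list is printed; a real rule set may
well need something stricter.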

The file `types.dat' consists of rules to classify objects based on
stack trace.


Uncategorized Objects - uncategorized.pl
----------------------------------------

Categorizing objects in `types.dat' is sweaty work, and the
`uncategorized.pl' script is a tool that makes it a bit
easier. Specifically, it reads a `live object' dump file and sorts the
stack traces. Stack traces that account for the most uncategorized
objects are placed first.

Using this tool, you can add the `most effective' rules to
`types.dat': rules that account for most of the uncategorized data.
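
The ranking described above can be sketched as follows. This is not
uncategorized.pl, only an illustration in Python. It assumes the dump
has already been parsed into (type, stack) pairs, that an
uncategorized object is one still reported as `void*', and that
stacks are ranked by object count rather than by bytes. The function
name `rank_uncategorized' is hypothetical.

```python
from collections import Counter


def rank_uncategorized(entries):
    """Rank allocation stacks by how many void* objects they account for.

    entries: iterable of (type_name, stack) pairs, where stack is a tuple
    of frame strings. Returns (stack, count) pairs, most common first.
    """
    counts = Counter(
        stack for type_name, stack in entries if type_name == "void*"
    )
    return counts.most_common()


if __name__ == "__main__":
    # Hand-written entries; frame names are invented for the example.
    entries = [
        ("void*", ("malloc", "nsFooBar::GrowBuffer")),
        ("nsFooBar", ("__builtin_new", "nsFooBar::CreateFooBar")),
        ("void*", ("malloc", "nsFooBar::BuildTable")),
        ("void*", ("malloc", "nsFooBar::GrowBuffer")),
    ]
    for stack, count in rank_uncategorized(entries):
        print(count, " <- ".join(stack))
```

The stack at the top of the output is the one for which a new rule
would categorize the most objects.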