diff -r 000000000000 -r 6474c204b198 tools/jprof/README.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/jprof/README.html Wed Dec 31 06:09:35 2014 +0100 @@ -0,0 +1,330 @@ + + + +
+Recent (4/2011) updates Randell Jesup (see bugzilla for contact info) +
+jprof-log
and jprof-map
files. By collecting a large
+number of these call stacks, it is possible to deduce where mozilla is spending
+its time.
+
+Configure your mozilla with jprof support by adding
+--enable-jprof
to your configure options (eg adding
+ac_add_options --enable-jprof
to your .mozconfig
) and
+making sure that you do not have the
+--enable-strip
configure option set -- jprof needs symbols to
+operate. On many architectures with GCC, you'll need to add
+--enable-optimize="-O3 -fno-omit-frame-pointer"
or the
+equivalent to ensure frame pointer generation in the compiler you're using.
Finally, build mozilla with your new configuration. Now you can run jprof.
+ +jprof [-v] [-t] [-e exclude] [-i include] [-s stackdepth] [--last] [--all] [--start n [--end m]] [--output-dir dir] prog log [log2 ...]+Options: +
kill -PROF
, or with kill -ALRM
if
+ JP_REALTIME is used, or with kill -POLL
(also known as kill -IO
) if JP_RTC_HZ is used).
+
+ freq
is the frequency
+ at which the timer should fire, measured in Hz. It must be a power of 2.
+ The maximal frequency allowed by the kernel can be changed by writing to
+ /proc/sys/dev/rtc/max-user-freq
; the maximum value it can be
+ set to is 8192. Note that /dev/rtc
will need to be readable
+ by the Firefox process; making that file world-readable is a simple way to
+ accomplish that.
+
+ + + SIGUSR2 will cause the circular buffer to be cleared. + +
+A build with jprof enabled adds four functions to the Window object:
+JProfStartProfiling()
and JProfStopProfiling()
: When used with JP_DEFER, these
+allow one to start and stop the timer just around whatever critical section is
+being profiled.
+JProfClearCircular()
and JProfSaveCircular()
:
+These clear the circular buffer and save the buffer (without stopping), respectively.
+ setenv JPROF_FLAGS "JP_START JP_FIRST=3 JP_PERIOD=0.025"+ +
+ setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.001"+ +
+ setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.010 JP_REALTIME"+ +
+ setenv JPROF_FLAGS "JP_DEFER JP_RTC_HZ=8192"+ +
+ setenv JPROF_FLAGS "JP_DEFER JP_CIRCULAR=1048576 JP_PERIOD=0.001"+ +
jprof can be paused at any time by sending a SIGUSR1 to mozilla (kill
+-USR1
). This will cause the timer signals to stop and jprof-map to be
+written, but it will not close jprof-log. Combining SIGUSR1 with the JP_DEFER
+option allows profiling of one sequence of actions by starting the timer right
+before starting the actions and stopping the timer right afterward.
+
+
After a SIGUSR1, sending another timer signal (SIGPROF, SIGALRM, or SIGPOLL (aka SIGIO), +depending on the mode) can be used to continue writing data to the same +output. + +
SIGUSR2 will cause the circular buffer to be cleared, if it's in use. +This is useful right before running a test when you're using a large, +continuous circular buffer, or programmatically at the start of an action +which might take too long (JProfClearCircular()). + +
jprof-log
and jprof-map
files, we
+can use the jprof executable is used to turn them into readable output. To do
+this jprof needs the name of the mozilla binary and the log file. It deduces
+the name of the map file:
+
++ ./jprof /home/user/mozilla/objdir/dist/bin/firefox ./jprof-log > tmp.html ++ +This will generate the file
tmp.html
which you should view in a
+web browser.
+
++ ./jprof --output-dir=/tmp /home/user/mozilla/objdir/dist/bin/firefox ./jprof-log* ++ +This will generate a set of files in /tmp for each process. + + +
+ +This shows that of the 151603 times the timer fired, 1777 (1.2% of the total) were inside nsStyleContext::GetStyleData() and 1552 (1.0% of the total) were in the nsCOMPtr_base destructor. + ++Total hit count: 151603 +Count %Total Function Name + +8806 5.8 __libc_poll +2254 1.5 __i686.get_pc_thunk.bx +2053 1.4 _int_malloc +1777 1.2 nsStyleContext::GetStyleData(nsStyleStructID) +1600 1.1 __libc_malloc +1552 1.0 nsCOMPtr_base::~nsCOMPtr_base() +
+In general, the functions with the highest count are the functions which +are taking the most time. + +
+The function names are linked to the entry for that function in the +hierarchical profile, which is described in the next section. + +
+ +The information this block tells us is: + ++ index Count Hits Function Name + 545 (46.4%) nsBlockFrame::ReflowInlineFrames(nsBlockReflowState&, nsLineList_iterator, int*) + 100 (8.5%) nsBlockFrame::ReflowDirtyLines(nsBlockReflowState&) + 72870 4 (0.3%) 645 (54.9%) nsBlockFrame::DoReflowInlineFrames(nsBlockReflowState&, nsLineLayout&, nsLineList_iterator, nsFlowAreaRect&, int&, nsFloatManager::SavedState*, int*, LineReflowStatus*, int) + 545 (46.4%) nsBlockFrame::ReflowInlineFrame(nsBlockReflowState&, nsLineLayout&, nsLineList_iterator, nsIFrame*, LineReflowStatus*) + 83 (7.1%) nsBlockFrame::PlaceLine(nsBlockReflowState&, nsLineLayout&, nsLineList_iterator, nsFloatManager::SavedState*, nsRect&, int&, int*) + 9 (0.8%) nsLineLayout::BeginLineReflow(int, int, int, int, int, int) + 1 (0.1%) nsTextFrame::GetType() const + 1 (0.1%) nsLineLayout::RelativePositionFrames(nsOverflowAreas&) + 1 (0.1%) __i686.get_pc_thunk.bx + 1 (0.1%) PL_ArenaAllocate +
nsBlockFrame::DoReflowInlineFrames
+nsBlockFrame::DoReflowInlineFrames
. Of these:
+nsBlockFrame::ReflowInlineFrame
+ nsBlockFrame::PlaceLine
+ nsLineLayout::BeginLineReflow
+ nsTextFrame::GetType
+ nsLineLayout::RelativePositionFrames
+ __i686.get_pc_thunk.bx
+ PL_ArenaAllocate
+nsBlockFrame::DoReflowInlineFrames
:
+nsBlockFrame::ReflowInlineFrames
+ nsBlockFrame::ReflowDirtyLines
+This block corresponds to the function nsBlockFrame::DoReflowInlineFrames
, which is
+therefore bolded and not a link. The name of this function is preceded by
+five numbers which have the following meaning. The number on the left (72870)
+is the index number, and is not important. The next number (4) and the
+percentage following (0.3%) are the number
+of times this function was interrupted by the timer and the percentage of
+the total hits that is. The last number pair ("645 (54.9%)")
+are the number of times this function was in the call stack when the timer went
+off. That is, the timer went off while we were in code that was ultimately
+called from nsBlockFrame::DoReflowInlineFrames
.
+
For our example we can see that our function was in the call stack for +645 interrupt ticks, but we were only the function that was running when +the interrupt arrived 4 times. +
+The functions listed above the line for nsBlockFrame::DoReflowInlineFrames
are its
+callers. The numbers to the left of these function names are the numbers of
+times these functions were in the call stack as callers of
+nsBlockFrame::DoReflowInlineFrames
. In our example, we were called 545 times by
+nsBlockFrame::ReflowInlineFrames
and 100 times by
+nsBlockFrame::ReflowDirtyLines
.
+
+The functions listed below the line for nsBlockFrame::DoReflowInlineFrames
are its
+callees. The numbers to the left of the function names are the numbers of
+times these functions were in the callstack as callees of
+nsBlockFrame::DoReflowInlineFrames
and the corresponding percentages. In our example, of the 645 profiler hits under nsBlockFrame::DoReflowInlineFrames
545 were under nsBlockFrame::ReflowInlineFrame
, 83 were under nsBlockFrame::PlaceLine
, and so forth.
+ +NOTE: If there are loops of execution or recursion, the numbers will +not add up and percentages can exceed 100%. If a function directly calls +itself "(self)" will be appended to the line, but indirect recursion will +not be marked. + +
+If you get an error:
Inconsistency detected by ld.so: dl-open.c: 260: dl_open_worker: Assertion `_dl_debug_initialize (0, args->nsid)->r_state == RT_CONSISTENT' failed!
+
that means you've hit a timing hole in the version of glibc you're +running. See Redhat bug 4578. + + + +