==============
Crash Reporter
==============

Overview
========

The **crash reporter** is a subsystem to record and manage application
crash data.

While the subsystem is known as *crash reporter*, it helps to think of
it more as a *process dump manager*. This is because the heart of this
subsystem is really managing process dump files, and these files are
created not only from process crashes but also from hangs and other
exceptional events.

The crash reporter subsystem is composed of a number of pieces working
together.

Breakpad
   Breakpad is a library and set of tools to make collecting process
   information (notably dumps from crashes) easy. Breakpad is a
   third-party project (originally developed by Google) that is imported
   into the tree.

Dump files
   Breakpad produces files called *dump files* that hold process data
   (stacks, heap data, etc).

Crash Reporter Client
   The crash reporter client is a standalone executable that is launched
   to handle dump files. This application optionally submits crashes to
   Mozilla (or the configured server).

How Main-Process Crash Handling Works
=====================================

The crash handler is hooked up very early in the Gecko process lifetime.
It all starts in ``XREMain::XRE_mainInit()`` from ``nsAppRunner.cpp``.
Assuming crash reporting is enabled, this startup function registers an
exception handler for the process and tells the crash reporter subsystem
about basic metadata such as the application name and version.

The registration of the crash reporter exception handler doubles as
initialization of the crash reporter itself. This happens in
``CrashReporter::SetExceptionHandler()`` from ``nsExceptionHandler.cpp``.
The crash reporter figures out what application to use for reporting
dumped crashes and where to store these dump files on disk. The Breakpad
exception handler (really just a mechanism for dumping process state) is
initialized as part of this function. The Breakpad exception handler is
a ``google_breakpad::ExceptionHandler`` instance and it's stored as
``gExceptionHandler``.

As the application runs, various other systems may write *annotations*
or *notes* to the crash reporter to indicate the state of the application,
help with possible reasons for a current or future crash, etc. These are
written via ``CrashReporter::AnnotateCrashReport()`` and
``CrashReporter::AppendAppNotesToCrashReport()`` from
``nsExceptionHandler.h``.

For well-behaved applications, this is all that happens. However, if a
crash or similar exceptional event occurs (such as a hang), we need to
write a crash report.

When an event worthy of writing a dump occurs, the Breakpad exception
handler is invoked and Breakpad writes the dump. When Breakpad has
finished, it calls back into ``CrashReporter::MinidumpCallback()`` from
``nsExceptionHandler.cpp`` to tell the crash reporter about what was
written.

``MinidumpCallback()`` performs a number of actions once a dump has been
written. It writes a file with the time of the crash so other systems can
easily determine the time of the last crash.
It supplements the dump
file with an *extra* file containing Mozilla-specific metadata. This data
includes the annotations set via ``CrashReporter::AnnotateCrashReport()``
as well as time since last crash, whether garbage collection was active at
the time of the crash, memory statistics, etc.

If the *crash reporter client* is enabled, ``MinidumpCallback()`` invokes
it. It simply tries to create a new *crash reporter client* process (e.g.
*crashreporter.exe*) with the path to the written minidump file as an
argument.

The *crash reporter client* performs a number of roles. There's a lot going
on, so you may want to look at ``main()`` in ``crashreporter.cpp``. First,
it verifies the dump data is sane. If it isn't (e.g. required metadata is
missing), the dump data is ignored. If the dump data looks sane, it
is moved into the *pending* directory for the configured data directory
(defined via the ``MOZ_CRASHREPORTER_DATA_DIRECTORY`` environment variable
or from the UI). Once this is done, the main crash reporter UI is displayed
via ``UIShowCrashUI()``. The crash reporter UI is platform specific: there
are separate versions for Windows, OS X, and various \*NIX presentation
flavors (such as GTK). The gist is that a dialog is displayed to the user
and the user has the opportunity to submit this dump data to a remote
server.

If a dump is submitted via the crash reporter, the raw dump files are
removed from the *pending* directory and a file containing the
crash ID from the remote server for the submitted dump is created in the
*submitted* directory.

If the user chooses not to submit a dump in the crash reporter UI, the dump
files are deleted.
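
The client-side file handling described above can be summarized with a
small Python sketch. This is an illustration only, not the code in
``crashreporter.cpp``: the function name ``handle_dump``, the ``submit``
callback (standing in for the UI dialog plus upload), the sanity check, and
the ``<crash id>.txt`` file name are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def handle_dump(dump, extra, data_dir, submit):
    """Sketch of the pending/submitted bookkeeping (hypothetical helper).

    `submit` is a stand-in for showing the UI and uploading: it returns
    the crash ID from the server, or None if the user declines.
    """
    # Assumed sanity check: both the dump and its metadata must exist.
    if not (dump.is_file() and extra.is_file()):
        return "ignored"

    pending = data_dir / "pending"
    pending.mkdir(parents=True, exist_ok=True)

    # Move the dump data into the pending directory before asking the user.
    pending_dump = Path(shutil.move(str(dump), str(pending / dump.name)))
    pending_extra = Path(shutil.move(str(extra), str(pending / extra.name)))

    crash_id = submit(pending_dump, pending_extra)

    # Either way, the raw files leave the pending directory.
    pending_dump.unlink()
    pending_extra.unlink()
    if crash_id is None:
        return "deleted"

    # Record the server-assigned crash ID (file name is an assumption).
    submitted = data_dir / "submitted"
    submitted.mkdir(parents=True, exist_ok=True)
    (submitted / f"{crash_id}.txt").write_text(crash_id + "\n")
    return "submitted"
```

Calling ``handle_dump`` with a ``submit`` callback that returns ``None``
models the user declining to submit, in which case only the deletion step
runs.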

And that's pretty much what happens when a crash/dump is written!

Plugin and Child Process Crashes
================================

Crashes in plugin and child processes are also managed by the crash
reporting subsystem.

Child process crashes are handled by the ``mozilla::dom::CrashReporterParent``
class defined in ``dom/ipc``. When a child process crashes, the toplevel IPDL
actor should check for it by calling ``TakeMinidump`` in its ``ActorDestroy``
method: see ``mozilla::plugins::PluginModuleParent::ActorDestroy`` and
``mozilla::plugins::PluginModuleParent::ProcessFirstMinidump``. That method
is responsible for calling
``mozilla::dom::CrashReporterParent::GenerateCrashReportForMinidump`` with
appropriate crash annotations specific to the crash. All child-process
crashes are annotated with a ``ProcessType`` annotation, such as "content" or
"plugin".

Submission of child process crashes is handled by application code. This
code prompts the user to submit crashes in context-appropriate UI and then
submits the crashes using ``CrashSubmit.jsm``.

Flash Process Crashes
=====================

On Windows Vista+, the Adobe Flash plugin creates two extra processes in its
Firefox plugin to implement OS-level sandboxing. In order to catch crashes in
these processes, Firefox injects a crash report handler into the process using
the code at ``InjectCrashReporter.cpp``. When these crashes occur, the
``ProcessType=plugin`` annotation is present, and an additional annotation
``FlashProcessDump`` has the value "Sandbox" or "Broker".
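
As an illustration of how these annotations might be consumed by tooling
that reads crash metadata, the following Python sketch classifies a report
from its ``ProcessType`` and ``FlashProcessDump`` values. The function name
``describe_process`` is hypothetical, and treating a missing ``ProcessType``
as a main-process crash is an assumption of this example rather than
something stated above.

```python
from typing import Mapping


def describe_process(annotations: Mapping[str, str]) -> str:
    """Return a short label for the process a crash report came from."""
    process_type = annotations.get("ProcessType")
    if process_type is None:
        # Assumption: only child-process crashes carry ProcessType.
        return "main process"

    flash_role = annotations.get("FlashProcessDump")
    if process_type == "plugin" and flash_role in ("Sandbox", "Broker"):
        # Flash helper processes are reported as plugin crashes with an
        # extra annotation naming the helper.
        return f"Flash {flash_role.lower()} process"

    return f"{process_type} process"
```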

Plugin Hangs
============

Plugin hangs are handled as crash reports. If a plugin doesn't respond to an
IPC message after 60 seconds, the plugin IPC code will take minidumps of all
of the processes involved and then kill the plugin.

In this case, there will be only one .ini file with the crash report metadata,
but there will be multiple dump files: at least one for the browser process and
one for the plugin process, and perhaps also additional dumps for the Flash
sandbox and broker processes. All of these files are submitted together as a
unit. Before submission, the filenames of the files are linked:

- **uuid.ini** - *annotations, includes an additional_minidumps field*
- **uuid.dmp** - *plugin process dump file*
- **uuid-.dmp** - *other process dump file as listed in additional_minidumps*

Browser Hangs
=============

There is a feature of Firefox that will crash Firefox if it stops processing
messages after a certain period of time. This feature doesn't work well and is
disabled by default. See ``xpcom/threads/HangMonitor.cpp``. Hang crashes
are annotated with ``Hang=1``.

about:crashes
=============

If the crash reporter subsystem is enabled, the *about:crashes*
page will be registered with the application. This page provides
information about previous and submitted crashes.

It is also possible to submit crashes from *about:crashes*.