Post-Mortem Debugging
When kadb is running and the system panics, control is passed to the debugger so that you can investigate the source of the problem. However, kadb is not always the best tool for problem analysis; frequently it is easier to use ':c' to continue execution and allow the system to save a crash dump. When the system reboots, you can perform post-mortem analysis on the saved crash dump. This process is analogous to debugging an application crash from a process core file.
Post-mortem analysis offers several advantages to driver developers: it allows more than one developer to examine a problem in parallel; it allows developers to retrieve information about a problem that occurred in production at a customer site, where interactive debugging is not acceptable; and it is required for certain types of advanced kernel analysis, such as checking for kernel memory leaks.
Getting Started With the Modular Debugger
The modular debugger, mdb, provides sophisticated debugging support for analyzing kernel problems. This section provides an overview of mdb features. For a more complete discussion of mdb, refer to the Solaris Modular Debugger Guide.
mdb command syntax is compatible with the kadb syntax, and mdb can execute all of the kadb (and legacy adb) macros. These macros are stored in /usr/lib/adb and /usr/platform/`uname -i`/lib/adb for 32-bit kernels, and in /usr/lib/adb/sparcv9 and /usr/platform/`uname -i`/lib/adb/sparcv9 for 64-bit kernels.
In addition to macro files, mdb supports debugger commands (or dcmds). These dcmds can be dynamically loaded at runtime from a set of debugger modules. mdb provides a first-class programming API for implementing debugger modules so that driver developers can implement their own custom debugging support. mdb also provides a host of usability features, such as command line editing, command history, an output pager, and online help.
mdb provides a rich set of modules and dcmds for debugging the Solaris kernel and associated modules and device drivers. These facilities enable you to:
formulate complex debugging queries
locate all the memory allocated by a particular thread
print a visual picture of a kernel STREAM
determine what type of structure a particular address refers to
locate leaked memory blocks in the kernel
analyze memory to locate stack traces
Note - In earlier versions of the Solaris operating environment, adb(1) was the recommended tool for post-mortem analysis. In the Solaris 9 operating environment, mdb(1) is the recommended tool for post-mortem analysis. It provides an upward-compatible syntax and feature set that surpass the set of commands available from the legacy crash(1M) utility, which has been removed from Solaris 9.
To get started, type mdb and supply it with a system crash dump:
    % cd /var/crash/testsystem
    % ls
    bounds unix.0 vmcore.0
    % mdb unix.0 vmcore.0
    Loading modules: [ unix krtld genunix ufs_log ip usba s1394 cpc nfs ]
    > ::status
    debugging crash dump vmcore.0 (64-bit) from testsystem
    operating system: 5.9 Generic (sun4u)
    panic message: zero
    dump content: kernel pages only
When mdb responds with the '>' prompt, it is ready for commands. To examine the running kernel on a live system, type:
    # mdb -k
    Loading modules: [ unix krtld genunix ufs_log ip usba s1394 ptm cpc ipc nfs ]
    > ::status
    debugging live kernel (64-bit) on testsystem
    operating system: 5.9 Generic (sun4u)
Important mdb Commands
This section provides a tutorial for some of the mdb debugger commands most applicable to driver authors. Note that the information presented here is dependent on the type of system used. A Sun Blade 100 workstation running the 64-bit kernel was used to produce these examples.
The Solaris Modular Debugger Guide provides details about each debugger command discussed here, as well as more information about all aspects of mdb. Online help is available from within mdb using the ::help built-in command.
Displaying Data Structures with mdb
mdb provides a powerful facility for displaying kernel data structures, so that earlier kadb(1) and mdb(1) debugger macros are no longer needed. Starting in Solaris 9, the operating system kernel maintains a highly compressed database of data structure type information in nonpageable system memory. This means that when a crash occurs, this type information is saved as part of the crash dump.
Here is an example of using the kernel type information to display all of the fields of a scsi_pkt structure:
Example 18-5 Displaying Kernel Information with mdb
Each data structure member is presented along with its type. Nested structures are expanded for easy viewing. ::print can also decode arrays and unions.
It is frequently helpful to discover the size of a particular kernel data structure; doing so is simple, using the ::sizeof dcmd:
    > ::sizeof 'struct scsi_pkt'
    sizeof (struct scsi_pkt) = 0x58
You can also locate the offset of a field within a data structure:
    > ::offsetof 'struct scsi_pkt' pkt_state
    offsetof (pkt_state) = 0x48
The -a option of ::print displays the address of each member of a data structure. If no address is specified to ::print, the output begins at address 0, so the address shown for each field is its offset within the structure:
Example 18-6 mdb: Viewing Data Members
The ::print, ::sizeof, and ::offsetof facilities make it faster to debug problems that arise when your driver interacts with the Solaris kernel.
Caution - This facility provides access to "raw" kernel data structures. You may examine any structure whether it appears as part of the DDI or not; therefore, refrain from relying on any data structure that is not explicitly part of the DDI.
Note - These dcmds may only be used with objects that contain compressed symbolic debugging information designed for use with mdb. This information is currently only available for certain Solaris kernel modules. The SUNWzlib (32-bit) or SUNWzlibx (64-bit) decompression software must be installed in order to process the symbolic debugging information.