Another useful macro is thread. Given a thread ID, this macro prints the corresponding thread structure. This can be used to look at a certain thread found with the threadlist macro, to look at the owner of a mutex, or to look at the current thread, as shown here:
Example 18-4 The thread Macro
Note - No type information is maintained by kadb, so using a macro on an inappropriate address results in garbage output.
Macros do not necessarily output all the fields of the structures, nor is the output necessarily in the order given in the structure definition. Occasionally, memory needs to be dumped for certain structures and then matched with the structure definition in the kernel header files.
Caution - Drivers should never reference system header files or structures not listed in man pages section 9S: DDI and DKI Data Structures. However, examining non-DDI-compliant structures (such as thread structures) can be useful in debugging drivers.
kadb Output Pager
Some kadb commands (such as $<threadlist) produce a large amount of output, which can scroll off the screen quickly. kadb provides a simple output pager to remedy this problem. The pager command is lines::more, where lines is the number of lines to print before pausing console output. The pager counts output lines, not screen rows, so a line that wraps because it is wider than the terminal still counts as a single line. For example:
kadb[0]: 0t10::more
kadb[0]: $<threadlist

                ============== thread_id        10408000
p0+0x4c0:       process args    sched
t0+0x128:       lwp             procp           wchan
                10429ed0        104393e8        0
t0+0x38:        pc              sp
                sched+0x4e4     104071f1
?(10408000,10414c00,2,104393e8,10439308,0)
_start(10007588,104292e0,104292e0,104292e0,1043b8b0,10429360) + 200

                ============== thread_id        2a10001fd40
p0+0x4c0:       process args    sched
--More-- <SPACE>
...
Pressing the space bar at the --More-- prompt displays the next page of output, using the line count given to ::more (10 in this example; the 0t prefix marks the number as decimal). Pressing Return prints only the next line of output. To abort the output and return to the kadb prompt, type Ctrl-C. To disable the pager, enter 0::more at the kadb prompt.
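Because the pager counts output lines rather than screen rows, a page can occupy more rows than the count given to ::more. The following C fragment is only a rough sketch of that arithmetic, not kadb code. The function names are hypothetical, and it assumes a terminal that wraps at exactly the given width with no tab expansion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Rows that one output line occupies on a terminal `width` columns wide.
 * An empty line still takes one row. Hypothetical helper, not part of kadb.
 */
size_t
screen_rows(size_t line_length, size_t width)
{
	assert(width > 0);
	if (line_length == 0)
		return (1);
	return ((line_length + width - 1) / width);	/* ceiling division */
}

/*
 * Total rows occupied by one pager page of `count` output lines.
 * The pager would count this page as `count` lines regardless of wrapping.
 */
size_t
page_rows(const char *const lines[], size_t count, size_t width)
{
	size_t rows = 0;
	size_t i;

	for (i = 0; i < count; i++)
		rows += screen_rows(strlen(lines[i]), width);
	return (rows);
}
```

Under these assumptions, a page of 10 lines in which two lines are 100 characters long fills 12 rows of an 80-column terminal, so a count slightly smaller than the screen height leaves room for wrapped lines.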
Example: kadb on a Deadlocked Thread
This example shows how kadb can be used to track down a driver bug. It is taken from the development of the ramdisk sample driver, which exports physical memory as a virtual disk. In this case, the dd(1M) command hangs while trying to copy data onto the device and cannot be aborted. Although a crash dump could be forced, kadb(1M) is used here for illustrative purposes. After logging in to the system remotely, ps is used to confirm that the system is still running and that only the dd(1M) command is hung.
At this point, the system is rebooted with kadb, which can now be entered by typing STOP-A on the system console. After the rest of the kernel has loaded, moddebug is patched to see if loading is the problem:
stopped at:
edd000d8:       ta      %icc,%g0 + 125
kadb[0]: moddebug/X
moddebug:
moddebug:       0
kadb[0]: moddebug/W 0x80000000
moddebug:       0x0     =       0x80000000
kadb[0]: :c
The driver is loaded explicitly with modload(1M) to separate module loading from the first real access to the device:
# modload /home/driver/drv/ramdisk
It loads without errors, so loading is not the problem. The condition is recreated with dd(1M):
# dd if=/dev/zero of=/devices/pseudo/ramdisk@0:c,raw
dd(1M) hangs. At this point, kadb(1M) is entered and the stack examined:
stopped at:
edd000d8:       ta      %icc,%g0 + 125
kadb[0]: $c
intr_vector() + 7dcfc0d8
debug_enter(0,0,10431e50,10,1,b0) + 78
zsa_xsint(80,7044a06c,44,7044a000,ff0113,0) + 278
zs_high_intr(7044a000,1,1,1042f78c,10424680,100949d0) + 20c
sbus_intr_wrapper(704dfad4,0,702bd048,7029cec0,630,10260250) + 30
current_thread(4001fe60,1041a550,10424698,10424698,10150f08,0) + 180
idle(1040b6c0,0,0,1041a550,704d6a98,0) + 54
thread_start(0,0,0,0,0,0) + 4
The presence of idle on the current thread stack indicates that this thread is not the cause of the deadlock. To determine the deadlocked thread, the entire thread list is checked:
kadb[0]: $<threadlist
...
                ============== thread_id        70cef120
70c8b1c0:       process args    dd if=/dev/zero of=/devices/pseudo/ramdisk@0:c,raw
70cef1c8:       lwp             procp           wchan
                70fa9080        70c8aec0        70691fc8
70cef144:       pc              sp
                sema_p+0x290    40313a78
?(70691fc8,10424680,1,1042b99c,10460f8c,70691fc8)
biowait(70691f60,1041a6c4,70691f60,70c385d0,40313bcc,705c73a0) + 8c
default_physio(1042e8fc,200,129,100,70eb5b54,705c73a0) + 3bc
write(2002,70aac1d0,70f9f9ac,200,4,200) + 23c
...
Of all the threads, only one has an entry that references the ramdisk driver: the process running dd(1M), which is blocked in biowait(9F). The first parameter of biowait(9F) is a pointer to a buf(9S) structure. The next step is to examine this structure:
kadb[0]: 70691f60$<buf
70691f60:       flags           forw            back
                204129          0               0
70691f6c:       av_forw         av_back         bcount
                0               0               512
70691fa0:       bufsize         error           edev
                0               0               1180000
70691f7c:       un.b_addr       _b_blkno        resid
                710e8000        0               0
70691f94:       proc            iodone          vp
                70c8aec0        0               0
70691f98:       pages
                0
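Two details in this output can be checked by hand. The C fragment below is a small sketch of that arithmetic. It assumes that the flags value is printed in hexadecimal, and that B_BUSY and B_DONE have the values 0x0001 and 0x0002 as in the Solaris <sys/buf.h> header (verify against the header on your release). The SKETCH_ names and helper functions are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed values of two b_flags bits; check <sys/buf.h> on your release. */
#define	SKETCH_B_BUSY	0x0001u
#define	SKETCH_B_DONE	0x0002u

/* Values copied from the kadb session (assumed to be hexadecimal). */
#define	BUF_ADDR	0x70691f60u	/* first argument of biowait() */
#define	BUF_FLAGS	0x204129u	/* flags field of the buf */
#define	THR_WCHAN	0x70691fc8u	/* wchan of the blocked dd thread */

/* Nonzero if the "I/O finished" bit is set in flags. */
int
buf_is_done(uint32_t flags)
{
	return ((flags & SKETCH_B_DONE) != 0);
}

/* Nonzero if the buffer is marked in use. */
int
buf_is_busy(uint32_t flags)
{
	return ((flags & SKETCH_B_BUSY) != 0);
}

/* Distance from the start of the buf to the address the thread sleeps on. */
uint32_t
wchan_offset(uint32_t wchan, uint32_t buf_addr)
{
	assert(wchan >= buf_addr);
	return (wchan - buf_addr);
}

/* Print the three derived facts for the values above. */
void
print_buf_summary(void)
{
	printf("busy=%d done=%d wchan offset=0x%x\n",
	    buf_is_busy(BUF_FLAGS), buf_is_done(BUF_FLAGS),
	    (unsigned)wchan_offset(THR_WCHAN, BUF_ADDR));
}
```

Under these assumptions the buffer is busy but not marked done, and the wait channel lies 0x68 bytes past the start of the buf. That is consistent with the thread sleeping on a semaphore embedded in the buf itself, waiting for a completion that has not been signalled.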
The resid field is 0, which indicates that all of the data was transferred, yet physio(9F) is still blocked in biowait(9F). The physio(9F) reference page in the Solaris 9 Reference Manual Collection points out that biodone(9F) should be called to unblock biowait(9F). This is the problem: rd_strategy() did not call biodone(9F). Adding a call to biodone(9F) before rd_strategy() returns fixes the problem.
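The effect of the missing call can be modelled outside the kernel. The following C fragment is a userland sketch only, not the ramdisk driver source. struct model_buf, model_biodone(), model_biowait_would_block(), and both rd_strategy_ variants are hypothetical stand-ins, and the model handles writes only. A real driver would use struct buf and biodone(9F) from the DDI headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userland model: every name below is a hypothetical stand-in. */
#define	MODEL_B_DONE	0x0002u		/* assumed to mirror B_DONE */
#define	MODEL_B_ERROR	0x0004u		/* assumed to mirror B_ERROR */
#define	RD_BLKSIZE	512
#define	RD_NBLOCKS	8

struct model_buf {
	unsigned	b_flags;	/* status bits */
	size_t		b_bcount;	/* bytes requested */
	size_t		b_resid;	/* bytes not transferred */
	long		b_blkno;	/* starting block on the device */
	char		*b_addr;	/* caller's data */
};

char rd_memory[RD_NBLOCKS * RD_BLKSIZE];	/* the "disk" */

/* Model of biodone(9F): mark the request complete. */
void
model_biodone(struct model_buf *bp)
{
	bp->b_flags |= MODEL_B_DONE;
}

/* Model of the biowait(9F) condition: blocks until the request is done. */
int
model_biowait_would_block(const struct model_buf *bp)
{
	return ((bp->b_flags & MODEL_B_DONE) == 0);
}

/* Copy the caller's data into the ramdisk, or flag an out-of-range request. */
static void
rd_copy(struct model_buf *bp)
{
	assert(bp != NULL && bp->b_addr != NULL);
	if (bp->b_blkno < 0 || bp->b_blkno >= RD_NBLOCKS ||
	    bp->b_bcount > (size_t)(RD_NBLOCKS - bp->b_blkno) * RD_BLKSIZE) {
		bp->b_flags |= MODEL_B_ERROR;
		bp->b_resid = bp->b_bcount;
		return;
	}
	memcpy(rd_memory + (size_t)bp->b_blkno * RD_BLKSIZE,
	    bp->b_addr, bp->b_bcount);
	bp->b_resid = 0;
}

/* The bug: the transfer completes, but completion is never signalled. */
int
rd_strategy_buggy(struct model_buf *bp)
{
	rd_copy(bp);
	return (0);
}

/* The fix: signal completion on every path before returning. */
int
rd_strategy_fixed(struct model_buf *bp)
{
	rd_copy(bp);
	model_biodone(bp);
	return (0);
}
```

With the buggy variant, the request ends with b_resid equal to 0 while the waiter would still block, which matches the state seen in kadb. The fixed variant signals completion on the error path as well, so a rejected request does not leave the caller hung.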