Sun Microsystems, Inc.

7.  Compiling and Debugging: Debugging a Multithreaded Program: Common Oversights

  • Creating a hidden gap in synchronization protection. This is caused when a code segment protected by a synchronization mechanism contains a call to a function that frees and then reacquires the synchronization mechanism before it returns to the caller. The result is that it appears to the caller that the global data has been protected when it actually has not.

  • Mixing UNIX signals with threads. It is better to use the sigwait(2) model for handling asynchronous signals.

  • Using setjmp(3C) and longjmp(3C), and then long-jumping away without releasing the mutex locks.

  • Failing to reevaluate the conditions after returning from a call to *_cond_wait() or *_cond_timedwait().

  • Forgetting that threads are created PTHREAD_CREATE_JOINABLE by default and must be reclaimed with pthread_join(3THR). Note that pthread_exit(3THR) does not free the exiting thread's storage space.

  • Making deeply nested, recursive calls or using large automatic arrays. Either can cause problems because multithreaded programs have a more limited stack size than single-threaded programs.

  • Specifying an inadequate stack size, or using nondefault stacks.
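
Three of the oversights listed above (not reevaluating the condition after a condition wait, not joining a joinable thread, and leaving the stack size to chance) can be illustrated together. The following is a minimal sketch, not code from this guide: the names run_wait_and_join(), consumer(), ready, and payload are hypothetical, and the 1-Mbyte stack size is an arbitrary assumption chosen only to show the call.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state for this sketch; not taken from the guide. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static int ready = 0;    /* the condition that waiters must test */
static int payload = 0;  /* data protected by lock */

static void *consumer(void *arg)
{
    int *out = arg;

    pthread_mutex_lock(&lock);
    /* Reevaluate the condition after every return from the wait:
     * a wakeup does not guarantee that the condition is true. */
    while (!ready)
        pthread_cond_wait(&ready_cv, &lock);
    *out = payload;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts one consumer, hands it a value, and reclaims the thread.
 * Returns the value the consumer observed. */
int run_wait_and_join(int value)
{
    pthread_attr_t attr;
    pthread_t tid;
    int seen = -1;
    int rc;

    /* Reset the condition so the function can be called repeatedly. */
    pthread_mutex_lock(&lock);
    ready = 0;
    pthread_mutex_unlock(&lock);

    rc = pthread_attr_init(&attr);
    assert(rc == 0);
    /* Assumption: 1 Mbyte is enough for this worker. Size the stack
     * from the deepest call chain and largest automatic arrays. */
    rc = pthread_attr_setstacksize(&attr, 1024 * 1024);
    assert(rc == 0);

    rc = pthread_create(&tid, &attr, consumer, &seen);
    assert(rc == 0);
    pthread_attr_destroy(&attr);

    pthread_mutex_lock(&lock);
    payload = value;
    ready = 1;                        /* change the condition under the lock */
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&lock);

    /* The thread is joinable by default; joining reclaims its storage. */
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return seen;
}
```

Calling run_wait_and_join(42) from main() should return 42 whether the consumer starts waiting before or after the signal is sent, because the condition is tested in a loop while the mutex is held.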

Note that multithreaded programs (especially those containing bugs) often behave differently in two successive runs, even with identical inputs, because of differences in the thread scheduling order.

In general, multithreading bugs are statistical rather than deterministic. Tracing is usually a more effective method of finding order-of-execution problems than breakpoint-based debugging.
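
To make the idea of tracing concrete, the following is a minimal sketch of an in-memory trace log that records the order in which threads reach chosen points in the code. It is an illustration only and is not how the TNF utilities work: trace(), run_traced_workers(), trace_is_ordered(), and the buffer size are all hypothetical names and values.

```c
#include <pthread.h>
#include <stddef.h>

#define TRACE_CAP 1024   /* assumed capacity for this sketch */
#define EVENTS_PER_THREAD 100

struct trace_event {
    unsigned long seq;   /* global order in which the event was logged */
    int thread_tag;      /* caller-supplied label for the thread */
    int code;            /* caller-supplied event code */
};

static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER;
static struct trace_event trace_buf[TRACE_CAP];
static size_t trace_len = 0;
static unsigned long trace_seq = 0;

/* Appends one event; silently drops events once the buffer is full. */
static void trace(int thread_tag, int code)
{
    pthread_mutex_lock(&trace_lock);
    if (trace_len < TRACE_CAP) {
        trace_buf[trace_len].seq = trace_seq++;
        trace_buf[trace_len].thread_tag = thread_tag;
        trace_buf[trace_len].code = code;
        trace_len++;
    }
    pthread_mutex_unlock(&trace_lock);
}

static void *worker(void *arg)
{
    int tag = *(int *)arg;

    for (int i = 0; i < EVENTS_PER_THREAD; i++)
        trace(tag, i);   /* in real code, placed at points of interest */
    return NULL;
}

/* Runs two traced workers and returns the number of events logged. */
size_t run_traced_workers(void)
{
    pthread_t tids[2];
    int tags[2] = { 1, 2 };

    trace_len = 0;       /* no other thread is running at this point */
    trace_seq = 0;
    for (int i = 0; i < 2; i++)
        if (pthread_create(&tids[i], NULL, worker, &tags[i]) != 0)
            return 0;
    for (int i = 0; i < 2; i++)
        pthread_join(tids[i], NULL);
    return trace_len;
}

/* Returns 1 if the log is in sequence order and each thread's own
 * events appear in the order that thread issued them. */
int trace_is_ordered(void)
{
    int last_code[3] = { -1, -1, -1 };   /* indexed by thread_tag */

    for (size_t i = 0; i < trace_len; i++) {
        const struct trace_event *e = &trace_buf[i];
        if (e->seq != i || e->code <= last_code[e->thread_tag])
            return 0;
        last_code[e->thread_tag] = e->code;
    }
    return 1;
}
```

The interleaving of the two thread tags in trace_buf can differ from run to run, which is exactly the information needed to diagnose an ordering problem after the fact. The mutex in trace() itself perturbs scheduling, so a log of this kind is a diagnostic aid rather than an exact record of the untraced program.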

Tracing and Debugging With the TNF Utilities

Use the TNF utilities (included as part of the Solaris system) to trace, debug, and gather performance analysis information from your applications and libraries. The TNF utilities integrate trace information from the kernel and from multiple user processes and threads, and so are especially useful for multithreaded code.

See the TNF utilities chapter in the Programming Utilities Guide for detailed information on using prex(1), tnfdump(1), and other TNF utilities.

Using truss(1)

See truss(1) for information on tracing system calls, signals, and user-level function calls.

Using mdb(1)

The following mdb commands can be used to access the LWPs of a multithreaded program.

Table 7-3 MT mdb Commands

pid:A     Attaches to process # pid. This stops the process and all its LWPs.
:R        Detaches from process. This resumes the process and all its LWPs.
$L        Lists all active LWPs in the (stopped) process.
n:l       Switches focus to LWP # n.
$l        Shows the LWP currently focused.
num:i     Ignores signal number num.

The following commands for setting conditional breakpoints are often useful.

Table 7-4 Setting mdb Breakpoints

[label],[count]:b [expression]    Breakpoint is detected when expression equals zero.
foo,ffff:b <g7-0xabcdef           Stop at foo when g7 = the hex value 0xABCDEF.

Using dbx

With the dbx utility you can debug and execute source programs written in C++, ANSI C, and FORTRAN. dbx accepts the same commands as the Debugger, but uses a standard terminal (TTY) interface. Both dbx and the Debugger support debugging multithreaded programs. For a full overview of dbx and Debugger features, see the dbx(1) reference manual page and the Using Sun Workshop user's guide.

All the dbx options listed in Table 7-5 can support multithreaded applications.

Table 7-5 dbx Options for MT Programs

Option                        Meaning
cont at line [sig signo id]   Continues execution at line with signal signo. The id, if present, specifies which thread or LWP to continue. The default value is all.
lwp                           Displays the current LWP. With lwpid, switches to the given LWP.
lwps                          Lists all LWPs in the current process.
next ... tid                  Steps the given thread. When a function call is skipped, all LWPs are implicitly resumed for the duration of that function call. Nonactive threads cannot be stepped.
next ... lid                  Steps the given LWP. Does not implicitly resume all LWPs when skipping a function.
step ... tid                  Steps the given thread. When a function call is skipped, all LWPs are implicitly resumed for the duration of that function call. Nonactive threads cannot be stepped.
step ... lid                  Steps the given LWP. Does not implicitly resume all LWPs when skipping a function.
stepi ... lid                 Steps the given LWP by one machine instruction.
stepi ... tid                 Steps the LWP on which the given thread is active by one machine instruction.
thread                        Displays the current thread. With tid, switches to thread tid. In all the following variations, an omitted tid implies the current thread.
thread -info [ tid ]          Prints everything known about the given thread.
thread -locks [ tid ]         Prints all locks held by the given thread.
thread -suspend [ tid ]       Puts the given thread into suspended state.
thread -continue [ tid ]      Unsuspends the given thread.
thread -hide [ tid ]          Hides the given (or current) thread. It will not appear in the generic threads listing.
thread -unhide [ tid ]        Unhides the given (or current) thread.
thread -unhideall             Unhides all threads.
threads                       Prints the list of all known threads.
threads -all                  Prints threads that are not usually printed (zombies).
threads -mode all|filter      Controls whether threads prints all threads or filters them by default.
threads -mode auto|manual     Enables automatic updating of the thread listing.
threads -mode                 Echoes the current modes. Any of the previous forms can be followed by a thread or LWP ID to get the traceback for the specified entity.
