Managing System Crash Information (Tasks)
This chapter describes how to manage system crash information in the Solaris environment.
For information on the procedures associated with managing system crash information, see "Managing System Crash Information (Task Map)".
Managing System Crash Information (Task Map)
The following task map identifies the procedures needed to manage system crash information.
Task | Description | For Instructions |
---|---|---|
1. Display the current crash dump configuration | Display the current crash dump configuration by using the dumpadm command. | |
2. Modify the crash dump configuration | Use the dumpadm command to specify the type of data to dump, whether or not the system will use a dedicated dump device, the directory for saving crash dump files, and the amount of space that must remain available after crash dump files are written. | |
3. Examine a crash dump file | Use the mdb command to view crash dump files. | |
4. (Optional) Recover from a full crash dump directory | The system crashes but there is no room in the savecore directory, and you want to save some critical system crash dump information. | "How to Recover From a Full Crash Dump Directory (Optional)" |
5. (Optional) Disable or enable the saving of crash dump files | Use the dumpadm command to disable or enable the saving of crash dump files. Saving crash dump files is enabled by default. | |
System Crashes (Overview)
System crashes can occur due to hardware malfunctions, I/O problems, and software errors. If the system crashes, it will display an error message on the console, and then write a copy of its physical memory to the dump device. The system will then reboot automatically. When the system reboots, the savecore command is executed to retrieve the data from the dump device and write the saved crash dump to your savecore directory. The saved crash dump files provide invaluable information to your support provider to aid in diagnosing the problem.
System Crash Dump Files
The savecore command runs automatically after a system crash to retrieve the crash dump information from the dump device and writes a pair of files called unix.X and vmcore.X, where X identifies the dump sequence number. Together, these files represent the saved system crash dump information.
Crash dump files are sometimes confused with core files, which are images of user applications that are written when the application terminates abnormally.
Crash dump files are saved in a predetermined directory, which, by default, is /var/crash/hostname. In previous Solaris releases, crash dump files were overwritten when a system rebooted unless you manually enabled the system to save the images of physical memory in a crash dump file. Now the saving of crash dump files is enabled by default.
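Because each crash produces a new unix.X and vmcore.X pair, it is often useful to find the most recent complete pair in the savecore directory. The following is a minimal sketch, assuming a POSIX-compatible shell. The function name latest_dump_seq is hypothetical and is not part of Solaris.

```shell
# Print the highest sequence number X for which both unix.X and vmcore.X
# exist in the directory given as $1. Prints nothing if no pair is found.
latest_dump_seq() {
    dir=$1
    latest=
    for f in "$dir"/vmcore.*; do
        [ -f "$f" ] || continue           # the glob matched nothing
        seq=${f##*.}                      # text after the last dot
        case $seq in
            ''|*[!0-9]*) continue ;;      # skip non-numeric suffixes
        esac
        [ -f "$dir/unix.$seq" ] || continue   # require the matching unix.X
        if [ -z "$latest" ] || [ "$seq" -gt "$latest" ]; then
            latest=$seq
        fi
    done
    if [ -n "$latest" ]; then
        echo "$latest"
    fi
}
```

For example, latest_dump_seq /var/crash/hostname prints the sequence number of the newest saved pair, where hostname stands for the name of your system.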
System crash information is managed with the dumpadm command. For more information, see "The dumpadm Command".
Saving Crash Dumps
You can examine the control structures, active tables, memory images of a live or crashed system kernel, and other information about the operation of the kernel by using the mdb utility. Using mdb to its full potential requires a detailed knowledge of the kernel, and is beyond the scope of this manual. For information on using this utility, see mdb(1).
Additionally, you can send crash dumps saved by savecore to a customer service representative, who can analyze them to determine why the system is crashing.
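As a starting point for examining a saved dump, the following sketch pipes two mdb dcmds, ::status and ::msgbuf, into mdb for a given unix.X and vmcore.X pair. The function name crash_summary and its arguments are hypothetical. The sketch assumes a POSIX-compatible shell and that the dump files are readable by the current user.

```shell
# Print a short summary of a saved crash dump.
#   $1  savecore directory
#   $2  dump sequence number X (defaults to 0)
crash_summary() {
    dir=$1
    seq=${2:-0}
    # ::status describes the dump; ::msgbuf prints the kernel message buffer.
    printf '%s\n' '::status' '::msgbuf' |
        mdb "$dir/unix.$seq" "$dir/vmcore.$seq"
}
```

For example, crash_summary /var/crash/hostname 0 summarizes the first saved dump. For interactive analysis, run mdb directly on the same pair of files.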
The dumpadm Command
Use the dumpadm command to manage system crash dump information in the Solaris environment.
The dumpadm command enables you to configure crash dumps of the operating system. The dumpadm configuration parameters include the dump content, dump device, and the directory in which crash dump files are saved.
Dump data is stored in compressed format on the dump device. Kernel crash dump images can be as big as 4 Gbytes or more. Compressing the data means faster dumping and less disk space needed for the dump device.
The savecore command runs in the background when a dedicated dump device, rather than the swap area, is part of the dump configuration. A booting system therefore does not wait for the savecore command to complete before going to the next step. On large memory systems, the system can be available before savecore completes.
System crash dump files, generated by the savecore command, are saved by default.
The savecore -L command is a new feature that enables you to get a crash dump of the live running Solaris operating environment. This command is intended for troubleshooting a running system by taking a snapshot of memory during some bad state, such as a transient performance problem or service outage. If the system is up and you can still run some commands, you can execute the savecore -L command to save a snapshot of the system to the dump device, and then immediately write out the crash dump files to your savecore directory. Because the system is still running, you can use the savecore -L command only if you have configured a dedicated dump device.
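The dedicated dump device requirement can be checked before a live dump is attempted. The following is a minimal sketch, assuming a POSIX-compatible shell and relying on dumpadm marking a swap dump device with "(swap)" in its output. The function name live_dump is hypothetical.

```shell
# Take a live crash dump, but only if the dump device is not a swap area.
live_dump() {
    # dumpadm marks a dump device that is also a swap area with "(swap)".
    if dumpadm | grep 'Dump device:' | grep '(swap)' >/dev/null; then
        echo "Dump device is a swap area; configure a dedicated dump device first." >&2
        return 1
    fi
    savecore -L
}
```

Run live_dump as superuser. The function returns a nonzero status without calling savecore when the dump device is a swap area.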
The following table describes dumpadm's configuration parameters.
Dump Parameter | Description |
---|---|
dump device | The device that stores dump data temporarily as the system crashes. When the dump device is not the swap area, savecore runs in the background, which speeds up the boot process. |
savecore directory | The directory that stores system crash dump files. |
dump content | Type of data, kernel memory or all of memory, to dump. |
minimum free space | Minimum amount of free space required in the savecore directory after saving crash dump files. If no minimum free space has been configured, the default is one Mbyte. |
For more information, see dumpadm(1M).
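Each parameter in the table corresponds to a dumpadm option: -c for dump content, -d for the dump device, -s for the savecore directory, and -m for minimum free space. The -n and -y options disable and enable the saving of crash dump files. The following sketch wraps these options in small functions. The function names are hypothetical, and the sketch assumes a POSIX-compatible shell and superuser privileges.

```shell
# Apply a dump configuration in one dumpadm call.
#   $1  dump content: kernel or all
#   $2  dump device: a device path, or the keyword swap
#   $3  savecore directory
#   $4  minimum free space, for example 10% or 50m
set_dump_config() {
    dumpadm -c "$1" -d "$2" -s "$3" -m "$4"
}

# Disable or re-enable the saving of crash dump files at reboot.
disable_savecore() { dumpadm -n; }
enable_savecore()  { dumpadm -y; }
```

For example, set_dump_config all /dev/dsk/c0t1d0s1 /var/crash/hostname 10% dumps all of memory to a dedicated device and keeps 10 percent of the savecore file system free. The device path shown is only an illustration; substitute a slice that exists on your system.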
The dump configuration parameters managed by the dumpadm command are stored in the /etc/dumpadm.conf file.
Note - Do not edit the /etc/dumpadm.conf file manually. Manual edits could result in an inconsistent system dump configuration.
How the dumpadm Command Works
During system startup, the dumpadm command is invoked by the /etc/init.d/savecore script to configure crash dump parameters based on information in the /etc/dumpadm.conf file.
Specifically, it initializes the dump device and the dump content through the /dev/dump interface.
After the dump configuration is complete, the savecore script looks for the location of the crash dump file directory by parsing the content of the /etc/dumpadm.conf file. Then, the savecore command is invoked to check for crash dumps. The savecore command also checks the content of the minfree file in the crash dump directory.
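The lookup described above can be approximated with a read-only sketch. It assumes that the savecore directory is recorded in /etc/dumpadm.conf as a line of the form DUMPADM_SAVDIR=path; verify the variable name on your system before relying on it. The function names are hypothetical, and the sketch only reads the file and never modifies it.

```shell
# Print the savecore directory recorded in a dumpadm.conf-style file.
#   $1  path to the file (defaults to /etc/dumpadm.conf)
# Assumption: the directory is stored as DUMPADM_SAVDIR=<path>.
savecore_dir() {
    sed -n 's/^DUMPADM_SAVDIR=//p' "${1:-/etc/dumpadm.conf}"
}

# Print the contents of the minfree file in that directory, if it exists.
show_minfree() {
    dir=$(savecore_dir "$1")
    if [ -n "$dir" ] && [ -f "$dir/minfree" ]; then
        cat "$dir/minfree"
    fi
}
```

Calling savecore_dir with no argument reads /etc/dumpadm.conf. To change any of these values, use the dumpadm command rather than editing the file.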
Dump Devices and Volume Managers
For accessibility and performance reasons, do not configure a dedicated dump device that is under the control of a volume management product such as Solaris Volume Manager. You can keep your swap areas under the control of Solaris Volume Manager, and this is a recommended practice, but keep your dump device separate.
Managing System Crash Dump Information
Keep the following key points in mind when you are working with system crash information:
You must be superuser to access and manage system crash information.
Do not disable the option of saving system crash dumps. System crash dump files provide an invaluable way to determine what is causing the system to crash.
Do not remove important system crash information until it has been sent to your customer service representative.
How to Display the Current Crash Dump Configuration
Display the current crash dump configuration.
    # dumpadm
          Dump content: kernel pages
           Dump device: /dev/dsk/c0t3d0s1 (swap)
    Savecore directory: /var/pluto
      Savecore enabled: yes
The above example output means:
The dump content is kernel memory pages.
Kernel memory will be dumped on a swap device, /dev/dsk/c0t3d0s1. You can identify all your swap areas with the swap -l command.
System crash dump files will be written in the /var/pluto directory.
Saving crash dump files is enabled.
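When a script needs a single value from this output, the label to the left of each colon can serve as a key. The following is a minimal sketch, assuming the label and value layout shown in the example output and a POSIX-compatible shell. The function name dumpadm_field is hypothetical.

```shell
# Read dumpadm output on standard input and print the value of one field.
#   $1  the label to the left of the colon, for example "Dump device"
dumpadm_field() {
    # Strip optional leading spaces, the label, the colon, and any spaces
    # that follow; print only the lines where the label matched.
    sed -n "s/^ *$1: *//p"
}
```

For example, dumpadm | dumpadm_field "Savecore directory" prints only the configured savecore directory.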