SRDB ID | Synopsis | Date |
48138 | Sun Fire[TM] 15K: Identifying and recovering from a domain hang | 29 Oct 2002 |
Status | Issued |
Description |
How to identify and recover from a hung domain.
SOLUTION SUMMARY:
Several checks help determine whether a domain is hung. If the domain does not respond to the following commands issued from the System Controller, that is a good indication that it is hung.

    sc0:sms-svc:1> ping domain-a
    sc0:sms-svc:2> telnet domain-a
    sc0:sms-svc:3> console -d a

Once you have established that the domain is likely hung, use the following steps, in order, to return it to a 'Running Solaris' state.

1. From the System Controller, connect to the domain with the console command.

    sc0:sms-svc:1> console -d a

Even though the console shows no response or activity, you can still send a break sequence (~#), which drops the OS to the OK> prompt, effectively a Stop-A. At the OK> prompt, the sync command tries to generate a system dump file and then reboots the domain.

2. If a break sequence at the console is insufficient to regain control of the domain, try the reset command from the System Controller. This is hard on the OS and will more than likely require an fsck before the domain can boot into multiuser mode.

    sc0:sms-svc:1> reset -d a

If that still doesn't work, try:

    sc0:sms-svc:2> reset -d a -x

It may take several seconds for the OK> prompt to appear after issuing this command. Once at the OBP, be sure to bring the domain back up with the sync command so that a core file might be generated.

3. Finally, try the keyswitch.

    sc0:sms-svc:1> setkeyswitch -d a off
    sc0:sms-svc:2> setkeyswitch -d a on

Note that this method prevents a core file from being generated.
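The escalation order described above can be summarized as a small lookup, which may be handy when writing a local runbook. The following Python sketch is illustrative only: it does not run any SMS command, the names `Step`, `looks_hung`, and `recovery_ladder` are hypothetical, and treating "all three checks failed" as the hang criterion is an assumption drawn from the wording of this article. The command strings themselves are copied from the steps above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """One rung of the recovery ladder (hypothetical helper type)."""
    commands: Tuple[str, ...]   # typed at the System Controller prompt
    follow_up: str              # what the article says to do next
    rules_out_dump: bool        # True only where the article says no core file


def looks_hung(ping_ok: bool, telnet_ok: bool, console_ok: bool) -> bool:
    """Assumption: the domain is 'likely hung' when none of the checks respond."""
    return not (ping_ok or telnet_ok or console_ok)


def recovery_ladder(domain: str) -> List[Step]:
    """Return the article's recovery steps for a domain tag, gentlest first."""
    return [
        Step((f"console -d {domain}",),
             "send break (~#), then run sync at the OK> prompt", False),
        Step((f"reset -d {domain}",),
             "expect an fsck on the way to multiuser mode", False),
        Step((f"reset -d {domain} -x",),
             "wait for the OK> prompt, then run sync", False),
        Step((f"setkeyswitch -d {domain} off", f"setkeyswitch -d {domain} on"),
             "last resort", True),
    ]


if __name__ == "__main__":
    if looks_hung(ping_ok=False, telnet_ok=False, console_ok=False):
        for number, step in enumerate(recovery_ladder("a"), start=1):
            flag = " (no core file)" if step.rules_out_dump else ""
            print(f"{number}. {'; '.join(step.commands)}: {step.follow_up}{flag}")
```

Keeping the keyswitch cycle as the final entry reflects the article's point that it is the only method stated to prevent a core file, so it should be tried only after the console break and both reset forms have failed.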
SUBMITTER: Ryan Crapo
APPLIES TO: Hardware/Sun Fire/15000
ATTACHMENTS: