SRDB ID | Synopsis | Date |
48194 | Sun Fire[TM] 12K/15K: Dstop: Address parity error | 31 Oct 2002 |
Status | Issued |
Description |
- Problem Statement: Dstop: Address parity error
- Symptoms: 'wfail' output reports something similar to the following:

  01 redxl> dumpf load dsmd.dstop.020501.2215.20
  02 Created Wed May 1 22:15:21 2002
  03 By hpost v. 1.2 Generic 112488-03 Feb 15 2002 13:40:50 executing as pid=7003
  04 On ssc name = rasputin-sc0.SD_RASCAL.West.Sun.COM
  05 Domain = 0=A Platform = rasputin
  06 Boards in dump: master SC CPs/CSBs[1:0]: 3
  07 EXB[17:0]: 12100
  08 Slot0[17:0]: 12100
  09 Slot1[17:0]: 12100
  10 -D option, -d
  11 "DSMD DomainStop Dump"
  12 0 errors occurred while creating this dump.
  13 redxl> wfail
  14 SDI EX08/S0 Master_Stop_Status0[31:0] = 1004004F
  15 MStop0[3:0]: All SDI logic is DStopped + Recordstopped.
  16 SDI EX08/S0 Dstop0[31:0] = 10019000
  17 Dstop0[16]: D DARB texp requests all Dstop (M)
  18 Dstop0[28]: D 1E Slot0 asserted Error, enabled to cause Dstop (M)
  19 EPLD SB08 Err1_Dom0: Mask= 00 Err= 01 1stErr= 01
  20 Err1[0]: 1E+ Error reported by AR
  21 AR SB8 PortErr [8][18:0] = 18001 (Expander Board)
  22 P8Err[ 0,16]: 1E+ Address parity error
  23 FAIL Slot SB8: Dstop detected by AR.
  24 Primary service FRU is Slot SB8.
  25 Secondary service FRU is EXB EX8.
  26 SDI EX13/S0: All SDI is DStopped and RStopped, requested by DARB.
  27 SDI EX16/S0: All SDI is DStopped and RStopped, requested by DARB.
  28 DARB C0: enabled ports (expanders) [17:0]: 16100
  29 DARB C0: other darb req Dstop+Rstop for exps[17:0]: 00100
  30 DARB C1: enabled ports (expanders) [17:0]: 16100
  31 DARB C1: other darb req Dstop+Rstop for exps[17:0]: 00100
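The hexadecimal masks in the dump (for example EXB[17:0] and the DARB "exps[17:0]" fields) can be expanded into bit positions to see which expanders are involved. The following is a minimal Python sketch, not part of any Sun tool. The helper name set_bits is hypothetical, and it assumes that bit n of an 18-bit mask corresponds to expander n. The source does not state that mapping, but it is consistent with EX08, EX13 and EX16 appearing in the wfail output.

```python
from typing import List


def set_bits(mask_hex: str, width: int = 18) -> List[int]:
    """Return the positions of the bits set in a hex mask, lowest first.

    Assumption: bit n of the mask stands for expander n.
    """
    value = int(mask_hex, 16)
    return [bit for bit in range(width) if (value >> bit) & 1]


# Masks copied from the dump above.
print(set_bits("12100"))  # EXB[17:0], dump line 07
print(set_bits("16100"))  # DARB enabled ports, dump lines 28 and 30
print(set_bits("00100"))  # DARB Dstop+Rstop requests, dump lines 29 and 31
```

Under that assumption, the last mask decodes to expander 8 only, which agrees with EX8 being the expander where the Dstop originated.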
SOLUTION SUMMARY:
- Troubleshooting: The dump header tells us that this Dstop was generated by dsmd (lines 11,12) while a domain was active. This is also evident from the dumpf file name: dsmd.dstop files are created by dsmd as part of an ASR.

  Walking the error chain:
  - The SDI on EX8 calls for Dstop as directed by its Slot 0 board, SB8 (line 18)
  - The EPLD on SB8 indicates the AR asserted error (line 20)
  - The AR has logged an address parity error (line 22)
  - SB8 is FAILed from the configuration, but SB8 and EX8 are listed as FRUs (lines 23-25)

  Upon closer examination of the AR messaging (lines 21-22), we see that port 8 in the AR is reporting the error. Port 8 routes to the expander (as indicated by wfail on line 21), specifically the AXQ. Therefore, the AR on SB8 detected a parity error on an incoming transmission from EX8. The pathway crosses an interconnect, so a single FRU cannot be identified.

  In the general case, an address parity error may be isolated to just a Slot 0 or Slot 1 board, if the AR port detecting the error is within the board (CPU or IO controller).
- Resolution: Repair/replace SB8. If errors persist, replace EX8. For the general case, replace the Slot 0/1 board. If the error persists, replace the expander.
- Summary of part numbers and patch IDs:
  http://infoserver.central.sun.com/data/syshbk/Devices/System_Board/SYSBD_SunFire_USIIICu.html
  http://infoserver.central.sun.com/data/syshbk/Devices/I_O/IO_SunFire_15K_hsPCI_IO_Board.html
  Expander 501-5179
- References and bug IDs: SunSolve Article 48122
- Additional background information:
- Meta-Data/Problem categorization:
  Product/Platform: SF12K/SF15K
  Category:
- Keywords: 15K, 12K, SF15K, SF12K, starcat, dstop, Address parity error
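The replacement order given under Resolution follows the "Primary service FRU" and "Secondary service FRU" lines that wfail prints. As an illustration only, the Python sketch below pulls those lines out of saved wfail text and returns the FRUs in service order. The function name service_order and the regular expression are hypothetical; they are based solely on the sample output in this article, and real wfail output may be worded differently.

```python
import re

# Matches e.g. "24 Primary service FRU is Slot SB8." (leading line number optional).
FRU_LINE = re.compile(
    r"^\s*(?:\d+\s+)?(Primary|Secondary) service FRU is (.+?)\.\s*$"
)


def service_order(wfail_text):
    """Return the FRUs named by wfail, primary first, then secondary."""
    rank = {"Primary": 0, "Secondary": 1}
    found = []
    for line in wfail_text.splitlines():
        match = FRU_LINE.match(line)
        if match:
            found.append((rank[match.group(1)], match.group(2)))
    return [fru for _, fru in sorted(found)]


# Lines 23-25 of the dump in this article.
sample = """\
23 FAIL Slot SB8: Dstop detected by AR.
24 Primary service FRU is Slot SB8.
25 Secondary service FRU is EXB EX8.
"""
print(service_order(sample))  # ['Slot SB8', 'EXB EX8']
```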
INTERNAL SUMMARY:
SUBMITTER: Scott Davenport
APPLIES TO: Hardware/Sun Fire /15000, Hardware/Sun Fire /12000
ATTACHMENTS: