SRDB ID   Synopsis   Date
48485   Sun Fire[TM] 12K/15K: Dstop: Port 8 prerequest not equal predicted value   1 Nov 2002

Status Issued

Description
- Problem Statement:
    
        Dstop: Port 8 prerequest not equal predicted value
        Dstop: Port 8 command valid not equal predicted value


- Symptoms:

       'wfail' output reports something similar to the following:

           01  redxl> dumpf load dsmd.dstop.020510.1632.03
           02  Created Fri May 10 16:32:05 2002
           03  By hpost v. 1.2 Generic 112488-04 Mar 18 2002 14:43:00  executing as pid=5022
           04  On ssc name =  rasputin-sc0.SD_RASCAL.West.Sun.COM
           05  Domain =  0=A    Platform = rasputin
           06  Boards in dump: master SC    CPs/CSBs[1:0]: 3
           07            EXB[17:0]: 12100
           08          Slot0[17:0]: 12100
           09          Slot1[17:0]: 12100
           10  -D option, -d
           11  "DSMD DomainStop Dump"
           12  0 errors occurred while creating this dump.
           13  redxl> wfail
           14  SDI EX08/S0  Master_Stop_Status0[31:0] = 800400CF
           15          MStop0[3:0]: All SDI logic is DStopped + Recordstopped.
           16  SDI EX08/S0  Dstop0[31:0] = 3001B000
           17          Dstop0[16]: D    DARB texp requests all Dstop (M)
           18          Dstop0[28]: D 1E Slot0 asserted Error, enabled to cause Dstop (M)
           19          Dstop0[29]: D 1E Slot1 asserted Error, enabled to cause Dstop (M)
           20  EPLD SB08  Err1_Dom0: Mask= 00  Err= 01  1stErr= 01
           21          Err1[0]:  1E+ Error reported by AR
           22  AR SB8  L2_Check_Err[28:0] = 18189818
           23          L2CErr[ 3,19]:   1E+ Port 8 prerequest not equal predicted value
           24          L2CErr[ 4,20]:   1E+ Port 9 prerequest not equal predicted value
           25          L2CErr[11,27]:   1E+ Port 8 command valid not equal predicted value
           26          L2CErr[12,28]:   1E+ Port 9 command valid not equal predicted value
           27  FAIL Slot SB8:  Dstop detected by AR.
           28  Primary service FRU is Slot SB8.
           29  Secondary service FRU is EXB EX8.
           30  EPLD IO08  Err1_Dom1: Mask= B0  Err= 01  1stErr= 01
           31          Err1[0]:  1E+ Error reported by AR
           32  AR IO8  L2_Check_Err[28:0] = 01808180
           33          L2CErr[ 7,23]:   1E+ Port 8 incoming not equal predicted value
           34          L2CErr[ 8,24]:   1E+ Port 9 incoming not equal predicted value
           35  FAIL Slot IO8:  Dstop detected by AR.
           36  Primary service FRU is Slot IO8.
           37  Secondary service FRU is EXB EX8.
           38  SDI EX13/S0: All SDI is DStopped and RStopped,         requested by DARB.
           39  SDI EX16/S0: All SDI is DStopped and RStopped,         requested by DARB.
           40  DARB C0: enabled ports (expanders)          [17:0]: 16100
           41  DARB C0: other darb req Dstop+Rstop for exps[17:0]: 00100
           42  DARB C1: enabled ports (expanders)          [17:0]: 16100
           43  DARB C1: other darb req Dstop+Rstop for exps[17:0]: 00100
      

SOLUTION SUMMARY:
- Troubleshooting:

        The dump header shows that dsmd generated this Dstop dump (lines 10,11)
        while a domain was active. The dump file name confirms this:
        dsmd.dstop files are created by dsmd as part of an ASR. Walking the
        error chain:

         - The SDI on EX8 calls for Dstop as directed by both its Slot 0 and Slot
           1 boards (lines 18,19).
         - The EPLD on SB8 indicates that the AR reported an error (line 21).
         - The AR on SB8 reports multiple errors (lines 23-26).
         - SB8 is FAILed from the configuration; SB8 and EX8 are named as primary
           and secondary FRUs (lines 27-29).
         - The EPLD on IO8 indicates that the AR reported an error (line 31).
         - The AR on IO8 reports multiple errors (lines 33-34).
         - IO8 is FAILed from the configuration; IO8 and EX8 are named as primary
           and secondary FRUs (lines 35-37).
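
        The FAIL and service FRU lines walked through above can be collected
        mechanically. The following is a minimal sketch, not a Sun tool: the
        function name parse_failures and the regular expressions are
        assumptions based only on the line formats visible in this dump, and
        the article's 01-43 line-number prefixes are assumed to have been
        stripped from the input.

```python
import re

# Excerpt of the 'wfail' output (dump lines 27-29 and 35-37),
# with the article's line-number prefixes removed.
WFAIL_EXCERPT = """\
FAIL Slot SB8:  Dstop detected by AR.
Primary service FRU is Slot SB8.
Secondary service FRU is EXB EX8.
FAIL Slot IO8:  Dstop detected by AR.
Primary service FRU is Slot IO8.
Secondary service FRU is EXB EX8.
"""

# Patterns are assumptions derived from the lines shown in this one dump.
FAIL_RE = re.compile(r"^FAIL Slot (\w+):\s+(.*)$")
FRU_RE = re.compile(r"^(Primary|Secondary) service FRU is (?:Slot|EXB) (\w+)\.$")


def parse_failures(text):
    """Return one dict per FAIL line, with the FRUs named on the lines after it."""
    failures = []
    for line in text.splitlines():
        line = line.strip()
        match = FAIL_RE.match(line)
        if match:
            failures.append({"slot": match.group(1), "reason": match.group(2)})
            continue
        match = FRU_RE.match(line)
        if match and failures:
            # Attach the FRU to the most recent FAIL entry.
            failures[-1][match.group(1).lower()] = match.group(2)
    return failures


if __name__ == "__main__":
    for failure in parse_failures(WFAIL_EXCERPT):
        print(failure)
```

        For this dump the sketch yields two entries, SB8 and IO8, each naming
        EX8 as the secondary FRU.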

        Ports 8 and 9 of the AR are wired to the AXQ. By design, the AR must
        accurately predict when the AXQ will transmit a packet of information.
        When the prediction does not match, a Dstop results. Refer to Article
        80018 for more details. Because the communication lines between the AR
        and the AXQ cross an interconnect, both the slot board and the expander
        are suspects.
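
        The L2_Check_Err values on dump lines 22 and 32 can be cross-checked
        against the bit pairs that wfail decodes on lines 23-26 and 33-34.
        The sketch below is illustrative only: the helper names are
        hypothetical, and the bit-pair labels are copied from this dump
        rather than from a register specification. Bit 15 is set in both
        values but is not decoded in the output shown, so its meaning is left
        open here.

```python
def set_bits(value, width=29):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(width) if (value >> bit) & 1]


def asserted_pairs(value, pairs):
    """Return the labels of the pairs whose two bits are both set in value."""
    bits = set(set_bits(value))
    return [label for (low, high), label in pairs.items()
            if low in bits and high in bits]


# Register values copied from dump lines 22 (AR SB8) and 32 (AR IO8).
SB8_L2_CHECK_ERR = 0x18189818
IO8_L2_CHECK_ERR = 0x01808180

# Bit pairs and labels exactly as decoded by wfail in this dump.
SB8_PAIRS = {
    (3, 19): "Port 8 prerequest not equal predicted value",
    (4, 20): "Port 9 prerequest not equal predicted value",
    (11, 27): "Port 8 command valid not equal predicted value",
    (12, 28): "Port 9 command valid not equal predicted value",
}
IO8_PAIRS = {
    (7, 23): "Port 8 incoming not equal predicted value",
    (8, 24): "Port 9 incoming not equal predicted value",
}

if __name__ == "__main__":
    for name, value, pairs in (("SB8", SB8_L2_CHECK_ERR, SB8_PAIRS),
                               ("IO8", IO8_L2_CHECK_ERR, IO8_PAIRS)):
        print(name, "set bits:", set_bits(value))
        for label in asserted_pairs(value, pairs):
            print("   ", label)
```

        Every error reported on either board involves port 8 or port 9, the
        two AR ports wired to the AXQ.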

      
- Resolution:

        Both SB8 and IO8 report prediction errors at the same time (the SDI
        on EX8 reports 1E's for both slots, lines 18,19), so each slot board
        tends to exonerate the other. It is more logical to start with the
        FRU common to both slot boards, namely the expander.
        
        Replace EX8. If errors persist, replace the slot boards.

        In the general case, as outlined by Article 48146, when a single board
        is reporting prediction errors, hardware actions start with the slot
        board.
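
        The ordering described in this section can be summarized as a small
        decision rule. This is a simplified sketch of the reasoning above,
        not an official procedure: the function name first_fru_to_replace
        and the input layout are hypothetical, and the fallback for several
        failing boards on different expanders is an assumption, since this
        article does not cover that case.

```python
def first_fru_to_replace(failures):
    """Pick the first FRU to act on.

    failures: list of dicts with 'primary' (slot board) and
    'secondary' (expander) FRU names, one dict per FAILed slot.
    """
    if not failures:
        return None
    secondaries = {failure["secondary"] for failure in failures}
    if len(failures) > 1 and len(secondaries) == 1:
        # Several boards fail at once and share one expander:
        # start with the FRU common to all of them.
        return next(iter(secondaries))
    # Single failing board (general case, per Article 48146), or a
    # case this article does not cover (assumed fallback): start
    # with the slot board named as the first primary FRU.
    return failures[0]["primary"]


if __name__ == "__main__":
    this_case = [
        {"primary": "SB8", "secondary": "EX8"},
        {"primary": "IO8", "secondary": "EX8"},
    ]
    print(first_fru_to_replace(this_case))      # EX8
    print(first_fru_to_replace(this_case[:1]))  # SB8
```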
        

- Summary of part numbers and patch IDs

	http://infoserver.central.sun.com/data/syshbk/Devices/System_Board/SYSBD_SunFire_USIIICu.html
        http://infoserver.central.sun.com/data/syshbk/Devices/I_O/IO_SunFire_15K_hsPCI_IO_Board.html
        http://infoserver.central.sun.com/data/sshandbook/Devices/CPU_Module/UltraSPARC_MaxCPU.html
        501-5179 Expander
      
        
- References and bug IDs

        Knowledge Article 48122 
        Knowledge Article 48146        

- Additional background information:

 	None       
        
- Meta-Data/Problem categorization:

Product/Platform: SF12K/SF15K
Category:

- Keywords

15K, 12K, SF15K, SF12K, Sun Fire 15K, Enterprise, Server, Sun Fire 12K,
starcat, dstop, prerequest not equal predicted value           

INTERNAL SUMMARY:

SUBMITTER: Scott Davenport
APPLIES TO: Hardware/Sun Fire /15000, Hardware/Sun Fire /12000
ATTACHMENTS:


Copyright (c) 1997-2003 Sun Microsystems, Inc.