SRDB ID   Synopsis   Date
48187   Sun Fire[TM] 12K/15K: Dstop: Steering bus A input parity error   31 Oct 2002

Status Issued

Description
- Problem Statement:

	Dstop: Steering bus A input parity error

- Symptoms:

	'wfail' output reports something similar to the following:

	   01  redxl> dumpf load dsmd.dstop.020410.1454.52
	   02  Created Wed Apr 10 14:54:53 2002
	   03  By hpost v. 1.2 Generic 112488-03 Feb 15 2002 13:40:50  executing as pid=26063
	   04  On ssc name =  f15k-02-sc0-hme0.
	   05  Domain =  0=A = omis320    Platform = f15k-02
	   06  Boards in dump: master SC    CPs/CSBs[1:0]: 3
	   07          EXB[17:0]: 3FFFF
	   08        Slot0[17:0]: 3FFFF
	   09        Slot1[17:0]: 3FFFF
	   10  -D option, -d
	   11  "DSMD DomainStop Dump"
	   12  0 errors occurred while creating this dump.
	   13  redxl> wfail
	   14  SDI EX00/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   15  SDI EX01/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   16  SDI EX02/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   17  SDI EX03/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   18  SDI EX04/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   19  SDI EX05/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   20  SDI EX06/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   21  SDI EX07/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   22  SDI EX08/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   23  SDI EX09/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   24  SDI EX10/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   25  SDI EX11/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   26  SDI EX12/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   27  SDI EX13/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   28  SDI EX14/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   29  SDI EX15/S0  Dstop1[31:0] = 00088008
	   30          Dstop1[19]: D 1E SDI Slave 2 requested all Dstop
	   31  SDI EX15/S0  Master_Stop_Status0[31:0] = 3004000F
	   32          MStop0[3:0]: All SDI logic is DStopped + Recordstopped.
	   33  SDI EX15/S0  Dstop0[31:0] = 00010001
	   34          Dstop0[16]: D    DARB texp requests all Dstop (M)
	   35  SDI EX15/S2  Master_Stop_Status0[31:0] = 00000008
	   36          MStop0[3]: SDI is Recordstopped
	   37  SDI EX15/S2  Dstop0[31:0] = 00088008
	   38          Dstop0[19]: D 1E SDI internal core requested Dstop
	   39  SDI EX15/S2  Core_Error0[31:0]  = 00108010  Mask = 7FE8FFFF
	   40          CoreErr0[20]: D 1E Steering bus A input parity error (S)
	   41              {steera_parin,steera_in[32:0]} = 0.00000020
	   42  FAIL EXB EX15:  Dstop/Rstop detected by SDI EX15/S2.
	   43  Primary service FRU is EXB EX15.
	   44  SDI EX16/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   45  SDI EX17/S0: All SDI is DStopped and RStopped,         requested by DARB.
	   46  DARB C0: enabled ports (expanders)          [17:0]: 3FFFF
	   47  DARB C0: other darb req Dstop+Rstop for exps[17:0]: 08000
	   48  DARB C1: enabled ports (expanders)          [17:0]: 3FFFF
	   49  DARB C1: other darb req Dstop+Rstop for exps[17:0]: 08000            

SOLUTION SUMMARY:
- Troubleshooting:

	The dump header tells us that this Dstop was generated by dsmd (lines 10,11) while a 
	domain was active. This is also evident from the dumpf file name: dsmd.dstop files are 
	created by dsmd as part of an ASR. Walking the error chain: 

	 - EX15 is the first error in the domain. The slave SDI2 requests the Dstop (line 30). 
	 - The specific errors in SDI2 are reported next. We have a Steering bus A input parity 
	   error (lines 39-41). 
	 - All other expanders are error free. We can quickly determine this because these 
	   expanders only have a single line of output in wfail. 
	 - wfail then informs us to FAIL EX15 (lines 42-43) as the primary FRU. 
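
	The walk above can be sketched as a small script. This is only an illustration, not a 
	Sun tool: the function name triage_wfail and the three regular expressions are 
	assumptions modelled on the wfail listing in this article, and the sample lines are 
	copied from that listing (lines 14, 29, 30, 39, 42, 43 and 44).

```python
import re

# Assumed patterns, modelled only on the wfail listing shown in this article.
SDI_LINE = re.compile(r"\s*(?:\d+\s+)?SDI (EX\d+)/S\d")   # a line that starts an SDI report
FAIL_LINE = re.compile(r"FAIL EXB (EX\d+)")
FRU_LINE = re.compile(r"Primary service FRU is EXB (EX\d+)")


def triage_wfail(lines):
    """Return (expanders with a single SDI line, failed expander, primary FRU)."""
    sdi_lines = {}
    failed = None
    primary_fru = None
    for line in lines:
        m = SDI_LINE.match(line)
        if m:
            # Count how many SDI report lines each expander produced.
            sdi_lines[m.group(1)] = sdi_lines.get(m.group(1), 0) + 1
            continue
        m = FAIL_LINE.search(line)
        if m:
            failed = m.group(1)
        m = FRU_LINE.search(line)
        if m:
            primary_fru = m.group(1)
    # One SDI line per expander is what the article treats as "error free".
    clean = sorted(exp for exp, count in sdi_lines.items() if count == 1)
    return clean, failed, primary_fru


SAMPLE = [
    "14  SDI EX00/S0: All SDI is DStopped and RStopped,         requested by DARB.",
    "29  SDI EX15/S0  Dstop1[31:0] = 00088008",
    "30          Dstop1[19]: D 1E SDI Slave 2 requested all Dstop",
    "39  SDI EX15/S2  Core_Error0[31:0]  = 00108010  Mask = 7FE8FFFF",
    "42  FAIL EXB EX15:  Dstop/Rstop detected by SDI EX15/S2.",
    "43  Primary service FRU is EXB EX15.",
    "44  SDI EX16/S0: All SDI is DStopped and RStopped,         requested by DARB.",
]

clean, failed, primary_fru = triage_wfail(SAMPLE)
print(clean, failed, primary_fru)   # ['EX00', 'EX16'] EX15 EX15
```

	Counting only lines that begin an SDI report keeps detail lines such as line 30 from 
	being attributed to the wrong expander.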

	The steering busses direct data flow through the SDI. Steering is generated in the Master 
	SDI and driven to the slave SDIs. The steering tells the SDIs where to look for the next 
	transfer of data. For example, if the centerplane wants to transfer to Slot 0, steering 
	tells the Slot 0 port of the SDIs to take data from the Centerplane. Referring back to 
	the wfail output, EX15/S0 is our Master SDI and EX15/S2 is the slave SDI reporting the 
	error. Thus, the parity error occurred between EX15/S0 and EX15/S2. Since the steering bus 
	is completely contained within the expander, EX15 is the faulty FRU.

- Resolution:

	Replace the Expander reporting the steering parity error. In this example, replace EX15.

- Summary of part number and patch IDs 

	501-5179 Expander
	
- References and bug IDs

	SunSolve Article 48122	

- Additional background information:


	Looking deeper, the history of the SDIs can be examined further to illustrate the parity 
	error. Let's start with the steering history on EX15/S0: 

	   50  redxl> shsdi 15 0 steera
	   51  Note: Data is displayed from the currently loaded dump file.
	   52  SDI EX15/S0    Output history of Steer A
	   53  <----- STEERA ---->
	   54      STEERA    STOP
	   55     [32:0] P  DEMA P  entry
	   56  1FFFFFFDF 1    1  1   0  old
	   57  1FFFFFF9F 0    1  1   1
	   58  1FFFFFFDF 1    1  1   2
	   59  1FFFFFF9F 0    1  1   3
	   60  1FFFFFFDF 1    1  1   4
	   61  1FFFFFF9F 0    1  1   5
	   62  1FFFFFFDF 1    1  1   6
	   63  1FFFFFF9F 0    1  1   7
	   64  1FFFFFFDF 1    1  1   8
	   65  1FFFFFF9F 0    1  1   9
	   66  1FFFFFFDF 1    1  1   10
	   67  1FFFFFF9F 0    1  1   11
	   68  1FFFFFFDF 1    1  1   12
	   69  1FFFFFF9F 0    1  1   13
	   70  1FFFFFFDF 1    1  1   14
	   71  1FFFFFF9F 0    1  1   15
	   72  1FFFFFFDF 1    1  1   16
	   73  1FFFFFF9F 0    1  1   17
	   74  1FFFFFFDF 1    1  1   18
	   75  1FFFFFF9F 0    1  1   19
	   76  1FFFFFFDF 1    1  1   20
	   77  1FFFFFF9F 0    1  1   21
	   78  1FFFFFFDF 1    1  1   22
	   79  1FFFFFF9F 0    1  1   23
	   80  1FFFFFFDF 1    1  1   24
	   81  1FFFFFF9F 0    1  1   25
	   82  1FFFFFFDF 1    0  0   26<
	   83  1FFFFFF9F 0    1  1   27
	   84  1FFFFFFDF 1    1  1   28
	   85  1FFFFFF9F 0    1  1   29
	   86  1FFFFFFDF 1    1  1   30
	   87  1FFFFFF9F 0    1  1   31  new

	The cycle of interest is cycle 26 (line 82), tagged with a <, where we have a steering 
	value of 1FFFFFFDF and a parity of 1. The steering busses are protected by even parity, 
	so already we've got a disconnect: 1FFFFFFDF has 32 1's, so the parity should be zero. 
	Now for the steering history on EX15/S2: 

	    88  redxl> shsdi 15 2 steera
	    89  Note: Data is displayed from the currently loaded dump file.
	    90  SDI EX15/S2    Output history of Steer A
	    91  <----- STEERA ---->
	    92      STEERA    STOP
	    93     [32:0] P  DEMA P  entry
	    94  1FFFFFFFF 1    1  1   0  old
	    95  1FFFFFFFF 1    1  1   1
	    96  1FFFFFFFF 1    1  1   2
	    97  1FFFFFFFF 1    1  1   3
	    98  1FFFFFFFF 1    1  1   4
	    99  1FFFFFFFF 1    1  1   5
	   100  1FFFFFFFF 1    1  1   6
	   101  1FFFFFFFF 1    1  1   7
	   102  1FFFFFFFF 1    1  1   8
	   103  1FFFFFFFF 1    1  1   9
	   104  1FFFFFFFF 1    1  1   10
	   105  1FFFFFFFF 1    1  1   11
	   106  1FFFFFFFF 1    1  1   12
	   107  1FFFFFFFF 1    1  1   13
	   108  1FFFFFFFF 1    1  1   14
	   109  1FFFFFFFF 1    1  1   15
	   110  1FFFFFFFF 1    1  1   16
	   111  1FFFFFFFF 1    1  1   17
	   112  1FFFFFFFF 1    1  1   18
	   113  1FFFFFFFF 1    1  1   19
	   114  1FFFFFFFF 1    1  1   20
	   115  1FFFFFFFF 1    1  1   21
	   116  1FFFFFFFF 1    1  1   22
	   117  1FFFFFFFF 1    1  1   23
	   118  1FFFFFFFF 1    1  1   24
	   119  1FFFFFFFF 1    1  1   25
	   120  1FFFFFFFF 1    1  1   26<
	   121  1FFFFFFFF 1    1  1   27
	   122  1FFFFFFFF 1    1  1   28
	   123  1FFFFFFFF 1    1  1   29
	   124  1FFFFFFFF 1    1  1   30
	   125  1FFFFFFFF 1    1  1   31  new

	On cycle 26 (line 120), all values are high. The steering value is 1FFFFFFFF with 
	a parity of 1. This parity is correct. Comparing this to the 1FFFFFFDF seen on 
	EX15/S0, bit 5 is flipped. This bit flip shows up in SDI2 (line 137) as the XOR of 
	the two steering histories on that cycle. 

	   126  redxl> shsdi -e 15 2
	   127  Note: Data is displayed from the currently loaded dump file.
	   128  SDI EX15/S2    Component ID = 64317049
	   129           Master_Stop_Status0[31:0] = 00000008
	   130          MStop0[3]: SDI is Recordstopped
	   131           Master_Stop_Status1[31:0] = 7F7F0000
	   132           Dstop0[31:0] = 00088008
	   133          Dstop0[19]: D 1E SDI internal core requested Dstop
	   134           Recordstop0[31:0]  = 00000000
	   135           Core_Error0[31:0]  = 00108010  Mask = 7FE8FFFF
	   136          CoreErr0[20]: D 1E Steering bus A input parity error (S)
	   137              {steera_parin,steera_in[32:0]} = 0.00000020
	   138              Core_ErrData[4:2][31:0]  = 00000000 00080600 00000020
	   139              Core_ErrData[1:0][31:0]  = 00000002 02DD3000
	   140           Core_Error1[31:0]  = 00000000  Mask = FFFFFFFF
	   141           CP_Error0[31:0]    = 00000000  Mask = 7F3F67FF
	   142           Slot0_Error0[31:0] = 00000000  Mask = 703FFFFF
	   143           Slot0_Error1[31:0] = 00000000  Mask = FFFF4FFF
	   144           Slot0_Error2[31:0] = 00000000  Mask = FFFFFFFF
	   145           Slot1_Error0[31:0] = 00000000  Mask = 703FFFFF
	   146           Slot1_Error1[31:0] = 00000000  Mask = FFFF4FFF
	   147           Slot1_Error2[31:0] = 00000000  Mask = FFFFFFFF

	We also saw this in the initial wfail (line 41). 
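
	The parity and XOR arithmetic used above can be checked with a few lines of Python. 
	This is a minimal sketch: the two constants are the cycle 26 steering values from 
	lines 82 and 120, the helper names are made up for illustration, and even parity is 
	taken to mean that the data bits plus the parity bit contain an even number of 1's, 
	as the article states.

```python
MASTER_STEERA = 0x1FFFFFFDF   # EX15/S0, entry 26 (line 82)
SLAVE_STEERA = 0x1FFFFFFFF    # EX15/S2, entry 26 (line 120)
STEER_WIDTH = 33              # steera[32:0]


def ones(value):
    """Number of 1 bits in value."""
    return bin(value).count("1")


def even_parity_bit(value):
    """Parity bit that makes the total count of 1's (data + parity) even."""
    return ones(value) % 2


diff = MASTER_STEERA ^ SLAVE_STEERA
flipped_bits = [bit for bit in range(STEER_WIDTH) if (diff >> bit) & 1]

print(ones(MASTER_STEERA), even_parity_bit(MASTER_STEERA))   # 32 0
print(ones(SLAVE_STEERA), even_parity_bit(SLAVE_STEERA))     # 33 1
print(f"{diff:08X}", flipped_bits)                           # 00000020 [5]
```

	The XOR result matches the 00000020 that wfail and shsdi -e report in steera_in 
	(lines 41 and 137).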

	As an aside, a steering state of all 1's is the idle state for the bus. If we look 
	at the steering history for the remaining SDIs on EX15 (left as an exercise for the 
	reader), we'd see that all of the other slave SDIs are in the idle state.
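
	For readers who take up that exercise, the idle check can be sketched as follows. 
	The row layout (steering value, parity, DEMA, parity, entry) is assumed from the 
	shsdi listings in this article, with the article's own leading line numbers removed. 
	The function name non_idle_entries is hypothetical.

```python
IDLE = 0x1FFFFFFFF   # all 33 steering bits set, the idle state described above


def non_idle_entries(history_lines):
    """Return (entry, steering value) pairs for rows that are not idle."""
    result = []
    for line in history_lines:
        fields = line.split()
        # Data rows look like: STEERA  P  DEMA  P  entry  [old|new]
        if len(fields) < 5 or len(fields[0]) != 9:
            continue
        try:
            value = int(fields[0], 16)
            entry = int(fields[4].rstrip("<"))   # "26<" marks the cycle of interest
        except ValueError:
            continue                             # header or banner line
        if value != IDLE:
            result.append((entry, value))
    return result


S2_ROWS = [
    "SDI EX15/S2    Output history of Steer A",
    "   [32:0] P  DEMA P  entry",
    "1FFFFFFFF 1    1  1   25",
    "1FFFFFFFF 1    1  1   26<",
    "1FFFFFFFF 1    1  1   27",
]
S0_ROWS = [
    "1FFFFFFDF 1    1  1   24",
    "1FFFFFF9F 0    1  1   25",
    "1FFFFFFDF 1    0  0   26<",
]

print(non_idle_entries(S2_ROWS))   # []
print([(e, f"{v:09X}") for e, v in non_idle_entries(S0_ROWS)])
```

	An empty list means every captured cycle was idle, which is what the EX15/S2 
	history above shows.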

- Meta-Data/Problem categorization:

Product/Platform: SF12K/SF15K
Category:

- Keywords

15K, 12K, SF15K, SF12K, starcat, dstop, Steering bus A input parity error            

INTERNAL SUMMARY:

SUBMITTER: Scott Davenport
APPLIES TO: Hardware/Sun Fire /15000, Hardware/Sun Fire /12000
ATTACHMENTS:


Copyright (c) 1997-2003 Sun Microsystems, Inc.