SRDB ID   Synopsis   Date
28263   Sun Fire (3800-6800) panics with PCI bus errors   15 Aug 2001

Status Issued

Description
This document addresses problems with Sun Fire 3800-6800 systems
where the solution is to replace IO cards.

Example panic on a Sun Fire 4800:
WARNING: pcisch-0: PCI fault log start:
PCI SERR
dwordmask=0 bytemask=0
pcisch-0: PCI primary error (0):
pcisch-0: PCI secondary error (0):
pcisch-0: PBM AFAR 0.00000000:
pcisch-0: PCI config space error status (4280):signalled system error.
pcisch-0: PCI fault log end.

panic[cpu17]/thread=2a100403d40: pcisch-0: PCI bus 2 error(s)!  
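
The driver instance and the error count can be extracted from the panic string mechanically. The following is a minimal sketch in Python, assuming the panic line always has the shape shown above; the pattern and the name parse_pci_panic are illustrative and not part of any Sun tool.

```python
import re

# Pattern inferred from the single panic line quoted above (an assumption);
# other Solaris releases may format the message differently.
PANIC_RE = re.compile(
    r"panic\[cpu(?P<cpu>\d+)\]/thread=(?P<thread>[0-9a-f]+): "
    r"(?P<driver>[a-z]+)-(?P<instance>\d+): PCI bus (?P<errors>\d+) error\(s\)!"
)

def parse_pci_panic(line):
    """Return driver, instance and error count from a PCI bus panic line, or None."""
    m = PANIC_RE.search(line)
    if m is None:
        return None
    return {
        "driver": m.group("driver"),
        "instance": int(m.group("instance")),
        "errors": int(m.group("errors")),
    }

print(parse_pci_panic(
    "panic[cpu17]/thread=2a100403d40: pcisch-0: PCI bus 2 error(s)!"))
# {'driver': 'pcisch', 'instance': 0, 'errors': 2}
```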
                                   
The panic string is not generic: the number of errors reported on the bus can be greater, and the pcisch instance can be different.

SOLUTION SUMMARY:
The system panicked due to a PCI bus error on pcisch-0.
pcisch-0 refers to instance 0 of the PCI Schizo driver.
The /etc/path_to_inst file gives the physical path of the pcisch-0 instance:
# grep pcisch /etc/path_to_inst
"/ssm@0,0/pci@19,600000" 3 "pcisch"
"/ssm@0,0/pci@19,700000" 2 "pcisch"
"/ssm@0,0/pci@1d,700000" 6 "pcisch"
"/ssm@0,0/pci@18,700000" 0 "pcisch"  
"/ssm@0,0/pci@1c,700000" 4 "pcisch"
"/ssm@0,0/pci@1c,600000" 5 "pcisch"
"/ssm@0,0/pci@1d,600000" 7 "pcisch"
"/ssm@0,0/pci@18,600000" 1 "pcisch"
        ^^^^^            ^  ^^^^^^
  path ___|   Instance __|      |_____ Driver
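
The lookup can also be scripted. The sketch below is a hypothetical helper (find_instance_path is not an existing utility) that scans path_to_inst-style lines for a given driver and instance; the sample lines are the grep output shown above.

```python
import re

# Each entry is: "physical path" instance-number "driver"
ENTRY_RE = re.compile(r'^"([^"]+)"\s+(\d+)\s+"([^"]+)"')

def find_instance_path(lines, driver, instance):
    """Return the physical path bound to driver/instance, or None if absent."""
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if m and m.group(3) == driver and int(m.group(2)) == instance:
            return m.group(1)
    return None

SAMPLE = [
    '"/ssm@0,0/pci@19,600000" 3 "pcisch"',
    '"/ssm@0,0/pci@19,700000" 2 "pcisch"',
    '"/ssm@0,0/pci@1d,700000" 6 "pcisch"',
    '"/ssm@0,0/pci@18,700000" 0 "pcisch"',
    '"/ssm@0,0/pci@1c,700000" 4 "pcisch"',
    '"/ssm@0,0/pci@1c,600000" 5 "pcisch"',
    '"/ssm@0,0/pci@1d,600000" 7 "pcisch"',
    '"/ssm@0,0/pci@18,600000" 1 "pcisch"',
]

print(find_instance_path(SAMPLE, "pcisch", 0))
# /ssm@0,0/pci@18,700000
```

On a live system the lines would come from reading /etc/path_to_inst instead of the SAMPLE list.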


So, the path to pcisch-0 is /ssm@0,0/pci@18,700000
                                 ^       ^^ ^^^^^^
             Node Id ____________|       |  |_____ Offset
                                         |
                                    Schizo AID
The Node Id is always 0 unless you are on a Wildcat architecture.
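
The same decomposition can be sketched in Python. The regular expression below is an assumption based only on the path layout in the diagram above (Node Id first after ssm@, then Schizo AID and offset after pci@); split_schizo_path is a hypothetical name.

```python
import re

# /ssm@<node id>,<x>/pci@<Schizo AID in hex>,<offset>
PATH_RE = re.compile(r"^/ssm@([0-9a-f]+),[0-9a-f]+/pci@([0-9a-f]+),([0-9a-f]+)")

def split_schizo_path(path):
    """Split a pcisch physical path into Node Id, Schizo AID and offset."""
    m = PATH_RE.match(path)
    if m is None:
        raise ValueError("not a Schizo leaf path: %r" % path)
    return {
        "node_id": int(m.group(1), 16),
        "aid": int(m.group(2), 16),   # the AID is written in hexadecimal
        "offset": m.group(3),         # kept as text: "600000" or "700000"
    }

print(split_schizo_path("/ssm@0,0/pci@18,700000"))
# {'node_id': 0, 'aid': 24, 'offset': '700000'}
```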

To find the IO board number, divide the Schizo AID (0x18 here, i.e. 24 decimal) by 2, then subtract 6:
(0x18 / 2) - 6 = 6
This device is on IB6. An even AID (no remainder from the division) means
Schizo 0 of IB6; an odd AID (a remainder) would mean Schizo 1.
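
The arithmetic above can be written as a small Python function. This is only a sketch of the rule as stated in this document (divide by 2, subtract 6, remainder selects the Schizo); the name locate_schizo is illustrative.

```python
def locate_schizo(aid_hex):
    """Map a Schizo AID (hex string from the device path) to (IO board, Schizo)."""
    aid = int(aid_hex, 16)
    board = aid // 2 - 6   # integer division, then subtract 6
    schizo = aid % 2       # no remainder -> Schizo 0, remainder -> Schizo 1
    return board, schizo

print(locate_schizo("18"))  # (6, 0)
print(locate_schizo("19"))  # (6, 1)
print(locate_schizo("1d"))  # (8, 1)
```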

Next, we need to determine whether IB6 is a PCI or a cPCI IO board.
The prtdiag -v output shows this in the "IO Type" column of the
"IO Cards" section:

========================= IO Cards =========================

                                Bus  Max                                        
            IO   Port Bus       Freq Bus  Dev,                                  
FRU Name    Type  ID  Side Slot MHz  Freq Func State Name                              Model
----------  ---- ---- ---- ---- ---- ---- ---- ----- --------------------------------  ----------------------
/N0/IB6/P0  PCI   24   B    0    33   33  1,0  ok    pci-pci1011,24.3/pci108e,1000     pci-bridge            
/N0/IB6/P0  PCI   24   B    0    33   3                                    

In our case, we are dealing with a PCI IO board because prtdiag -v shows 8
populated slots on IB6 and the type is PCI. Knowing the Schizo AID, the
offset and the type of the IO board, we can now use the following charts
to find which [c]PCI slots are involved.

      PCI IO Board Topology:
      ======================

      Even/Odd AID    Offset    device #            Slot assignment
      ------------    ------    --------            --------------------
      1               600000     1                  slot 7 is Schizo1/A
      1               700000     3                  slot 6 is Schizo1/B
      1               700000     2                  slot 5 is Schizo1/B 
      1               700000     1                  slot 4 is Schizo1/B 

      0               600000     1                  slot 3 is Schizo0/A 
      0               700000     3                  slot 2 is Schizo0/B 
      0               700000     2                  slot 1 is Schizo0/B 
      0               700000     1                  slot 0 is Schizo0/B 

      cPCI IO Board Topology for 6/4 slots:
      =====================================

      4-slot 
      ------ 
      Even/Odd AID    Offset    device #             Slot assignment
      ------------    ------    --------            --------------------
      0               600000     1                  slot 0 is Schizo0/A 
      1               600000     1                  slot 1 is Schizo1/A
      0               700000     1                  slot 2 is Schizo0/B
      1               700000     1                  slot 3 is Schizo1/B

      6-slot 
      ------
      Even/Odd AID    Offset    device #             Slot assignment
      ------------    ------    --------            --------------------
      0               600000     1                  slot 0 is Schizo0/A 
      1               600000     1                  slot 1 is Schizo1/A
      0               700000     1                  slot 2 is Schizo0/B(shared)
      0               700000     2                  slot 3 is Schizo0/B(shared)
      1               700000     1                  slot 4 is Schizo1/B(shared)
      1               700000     2                  slot 5 is Schizo1/B(shared)
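
The three charts above can be held in one lookup table. In the sketch below, the "Even/Odd AID" column is read as 0 for an even AID (Schizo 0) and 1 for an odd AID (Schizo 1); the board type labels "PCI", "cPCI4" and "cPCI6" and the function name candidate_slots are assumptions made for illustration.

```python
# (board type, even/odd AID, offset, device #) -> slot, transcribed from the charts
SLOT_MAP = {
    ("PCI", 1, "600000", 1): 7,
    ("PCI", 1, "700000", 3): 6,
    ("PCI", 1, "700000", 2): 5,
    ("PCI", 1, "700000", 1): 4,
    ("PCI", 0, "600000", 1): 3,
    ("PCI", 0, "700000", 3): 2,
    ("PCI", 0, "700000", 2): 1,
    ("PCI", 0, "700000", 1): 0,
    ("cPCI4", 0, "600000", 1): 0,
    ("cPCI4", 1, "600000", 1): 1,
    ("cPCI4", 0, "700000", 1): 2,
    ("cPCI4", 1, "700000", 1): 3,
    ("cPCI6", 0, "600000", 1): 0,
    ("cPCI6", 1, "600000", 1): 1,
    ("cPCI6", 0, "700000", 1): 2,
    ("cPCI6", 0, "700000", 2): 3,
    ("cPCI6", 1, "700000", 1): 4,
    ("cPCI6", 1, "700000", 2): 5,
}

def candidate_slots(board_type, schizo, offset):
    """List every slot behind one Schizo leaf (all device numbers)."""
    return sorted(
        slot
        for (b, s, o, _dev), slot in SLOT_MAP.items()
        if (b, s, o) == (board_type, schizo, offset)
    )

print(candidate_slots("PCI", 0, "700000"))  # [0, 1, 2]
```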


In our case (PCI IO board, even AID 0x18, offset 700000), the cards in
slots 0, 1 and 2 can be the source of the error. The /etc/path_to_inst file
gives us a description of what is connected to each slot of the IO board:
"/ssm@0,0/pci@18,700000/pci@1/SUNW,isptwo@4" 0 "isp"     >>> SLOT 0   Sunswift PCI
"/ssm@0,0/pci@18,700000/pci@1/SUNW,hme@0,1" 0 "hme"	 >>> SLOT 0   Sunswift PCI
                            ^
                Device # ___|
"/ssm@0,0/pci@18,700000/SUNW,qlc@3" 0 "qlc"		 >>> SLOT 2  PCI Single FC Host Adapter
                                 ^
                     Device # ___|
"/ssm@0,0/pci@18,700000/bootbus-controller@4" 0 "sgsbbc" >>> NOT CONTROLLED BY THE SCHIZO CHIP


In this case, replace the Sunswift PCI card (slot 0) and the PCI Single FC
Host Adapter card (slot 2) on IO board 6.
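
The step from the path_to_inst entries to slot numbers can be sketched as follows. The device number is taken from the first path component below the Schizo leaf, and the small table covers only Schizo 0, offset 700000 on a PCI IO board, as in this example; slot_for_entry is a hypothetical helper name.

```python
# Device # -> slot for Schizo 0, offset 700000 on a PCI IO board (from the chart)
PCI_SCHIZO0_B = {1: 0, 2: 1, 3: 2}

def slot_for_entry(entry, leaf="/ssm@0,0/pci@18,700000"):
    """Return the slot of a path_to_inst entry under the leaf, or None."""
    path = entry.split('"')[1]                  # text between the first quotes
    if not path.startswith(leaf + "/"):
        return None
    child = path[len(leaf) + 1:].split("/")[0]  # e.g. "pci@1" or "SUNW,qlc@3"
    if "@" not in child:
        return None
    device = int(child.split("@")[1].split(",")[0], 16)
    return PCI_SCHIZO0_B.get(device)            # None if not a slot device

ENTRIES = [
    '"/ssm@0,0/pci@18,700000/pci@1/SUNW,isptwo@4" 0 "isp"',
    '"/ssm@0,0/pci@18,700000/pci@1/SUNW,hme@0,1" 0 "hme"',
    '"/ssm@0,0/pci@18,700000/SUNW,qlc@3" 0 "qlc"',
    '"/ssm@0,0/pci@18,700000/bootbus-controller@4" 0 "sgsbbc"',
]

print([slot_for_entry(e) for e in ENTRIES])
# [0, 0, 2, None]
```

The bootbus-controller entry maps to None because device 4 does not appear in the chart for this leaf.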

INTERNAL SUMMARY:

See http://hes.west/espg/safedocs/alpha-docs.html for more information on Serengeti physical device mapping

SUBMITTER: Renaud Manus
APPLIES TO:
ATTACHMENTS:


Copyright (c) 1997-2003 Sun Microsystems, Inc.