FIN #: I0763-1

SYNOPSIS: Sun Fire 15000 AXQ ASIC chip bug fixed by Solaris 8 Patches

DATE: Feb/19/02

KEYWORDS: Sun Fire 15000 AXQ ASIC chip bug fixed by Solaris 8 Patches


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)



SYNOPSIS: Sun Fire 15000 AXQ ASIC chip bug fixed by Solaris 8 Patches. 


Sun Alert:          No

TOP FIN/FCO REPORT: Yes 
 
PRODUCT_REFERENCE:  AXQ ASIC for Sun Fire 15000  
 
PRODUCT CATEGORY:   Server / Service 


PRODUCTS AFFECTED:

Systems Affected
---------------- 
Mkt_ID   Platform      Model    Description            Serial Number
------   --------      -----    -----------            -------------
  -        F15K         ALL     Sun Fire 15000               -


X-Options Affected
------------------
Mkt_ID     Platform   Model   Description         Serial Number
------     --------   -----   -----------         -------------
  -           -         -          -                    -


PART NUMBERS AFFECTED:

Part Number               Description                       Model
-----------               -----------                       -----
501-5179-18 or lower      ASSY ECB MECH SYS EXP STARCAT       -
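
The dash-level rule in the table above can be expressed as a short
check.  This is only an illustrative sketch: the function name is
hypothetical, and it assumes part numbers are written in the usual
NNN-NNNN-NN form, with the last two digits being the dash level.

```python
import re

# Values taken from the PART NUMBERS AFFECTED table.
AFFECTED_BASE = "501-5179"
MAX_AFFECTED_DASH = 18


def is_affected_part(part_number: str) -> bool:
    """Return True for 501-5179 boards at dash level 18 or lower."""
    # Assumed format: NNN-NNNN-NN (base part number plus dash level).
    match = re.fullmatch(r"(\d{3}-\d{4})-(\d{2})", part_number.strip())
    if match is None:
        raise ValueError(f"unrecognized part number: {part_number!r}")
    base, dash = match.group(1), int(match.group(2))
    return base == AFFECTED_BASE and dash <= MAX_AFFECTED_DASH
```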


REFERENCES:

BugId:   4508788 - limit PIOs in Schizo nexus on Starcat.
         4505200 - Cauldron deadlock.
         4496858 - glm driver panic when run stress testing on Cauldron 
                   cards.

PatchId: 108528 - SunOS 5.8: kernel update patch.
         110838 - axq platform getinfo CDC read/write.

      
PROBLEM DESCRIPTION:

Due to a bug in the AXQ ASIC, which resides on the System Expander
Board of the F15K, a Solaris panic can occur if too many outstanding
noncacheable Programmed I/O (PIO) transactions are awaiting service.  A
transaction may then time out in the Schizo ASIC, causing the Solaris
device driver(s) to panic.  An example panic:

  WARNING: [AFT1] Bus Error (BERR) Event on CPU129 Privileged Data
           Access at TL=0, errID 0x00000417.10efa87e
           AFSR 0x00100800<PRIV,BERR>.00000000 AFAR 0x00000428.00704010
           Fault_PC 0x10032db4

  panic[cpu129]/thread=3001f6c8560: [AFT1] errID 0x00000417.10efa87e
           BERR Error(s)
           See below error message(s) for details

000002a1013aab60 SUNW,UltraSPARC-III+:cpu_aflt_log+45c (2a1013aac1e, 
    10149f28, 10149f00, 0, 2a1013aada8, 2a1013aac6b)
  %l0-3: 000002a1013ab210 000002a1013aae68 0000000000000003 0000000000000010
  %l4-7: 0000000000000000 00000000104cecc8 0000000000000000 0000000000000000
000002a1013aadb0 SUNW,UltraSPARC-III+:cpu_deferred_error+570 (104d9000, 
    c4000000000f, fe000000000c, 1, 1, 2a1013ab2f0)
  %l0-3: 0000000000000000 0000042800704010 0000000000000000 0000000000000032
  %l4-7: 0000000000000208 0000000000000000 0000000000000000 0000000000000000
000002a1013ab240 unix:prom_rtt+0 (30000abd518, 30004a64014, 12, 0, 0, 
    3001fc4bb40)
  %l0-3: 0000000000000007 0000000000001400 0000004480001606 0000000010141cbc
  %l4-7: 0000000000000000 000002a100c85a28 0000000000000000 000002a1013ab2f0
000002a1013ab390 glm:glm_queue_target+60 (30004a10000, 30004a9dc80,
    3001fc4bb30, 3001fc4bbb8, 0, 10)
  %l0-3: 0000030004a11068 0000030004a11070 000003001fc4bc58 0000030004a10000
  %l4-7: 000000000000000a 0000000000000020 0000000000000000 000003001fc4bb30
000002a1013ab440 glm:glm_accept_pkt+1e4 (1, 3001fc4bb30, 30004a10000, 
    3001fc4bc58, 1, 30004a9dc80)
  %l0-3: 00000000102ee80c 000003001ccf7d00 00000000104ebd50 0000000010034c80
  %l4-7: 00000000104a7808 0000000000000000 000003001efc3aa0 000002a100c85ba0
000002a1013ab4f0 glm:glm_scsi_start+40 (3001fc4bb30, 1, 30004a10000, 
    3000098e008, a, 0)
  %l0-3: 0000030004a10018 000000000012e802 0000000000000800 0000000000100000
  %l4-7: 000000010322c000 000003001ef7b8c8 006003f6e2130000 0000031010b1f960
000002a1013ab5a0 sd:sdstart+534 (10149b80, 10149bc0, ffffffffffffffff, 12e802, 
    0, 10000)
  %l0-3: 0000000000000000 000003000098e366 0000000000000005 000003000098e1d8
  %l4-7: 000003000098e108 000003000098e008 000000010332c000 000003001fa85ed8
000002a1013ab650 sd:sdstrategy+620 (3000098e108, 100000, 3001fa85ed8, 0, 
    3001fa85ed8, 3000098e008)
  %l0-3: 0000000000000000 0000000000000002 000003001ccedc08 000002a1013ab7e8
  %l4-7: 0000000000000002 0000000000000000 0000000000000000 0000000000000000
000002a1013ab700 genunix:default_physio+2d8 (100000, 0, 7fffffffffffffff, 
    3001ccedc08, 102bac28, 2a1013aba28)
  %l0-3: 000002a1013abae0 0000000000000040 00000000102baf3c 000003001fa85ed8
  %l4-7: 000000200000088a 00000000102bac28 0000000000000000 000000010322c000
000002a1013ab810 genunix:physio+20 (102baf3c, 0, 200000088a, 40, 102bac28, 
    2a1013aba28)
  %l0-3: 0000000010123d00 0000000000000000 0000000000000080 00000300008bf8f8
  %l4-7: 00000000104a7828 0000000000000000 0000000000000000 0000000000000000
000002a1013ab8c0 sd:sdread+9c (200000088a, 2a1013aba28, 200000088a, 0, 
    3001e903650, 200000088a)
  %l0-3: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  %l4-7: 0000000000000000 000002a1013aba28 0000000000000000 0000000000000000
000002a1013ab970 genunix:pread+210 (2003, 3001e8be188, 3, 25c00400, 200000, 
    10312c000)
  %l0-3: 00000000101b9bbc 0000000000000000 0000000000200000 000003001e903658
  %l4-7: 7fffffffffffffff 0000000000000000 0000000000000000 0000000000000000
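
When comparing a customer panic against the example above, it can help
to pull out the CPU that reported the bus error and the kernel modules
on the stack.  The sketch below is a hypothetical helper, not a Sun
tool.  It assumes the log text follows the layout shown in this FIN,
and a match is only a hint that the panic resembles this issue, not a
diagnosis.

```python
import re

# Matches the error line shown in the example panic, e.g.
# "[AFT1] Bus Error (BERR) Event on CPU129 Privileged Data".
BERR_EVENT = re.compile(r"\[AFT1\] Bus Error \(BERR\) Event on CPU(\d+)")

# Matches stack frame lines of the form
# "<16 hex digits> module:function+offset".
FRAME = re.compile(
    r"^[0-9a-f]{16} ([\w,+-]+):(\w+)\+[0-9a-f]+", re.MULTILINE
)


def summarize_panic(log_text: str) -> dict:
    """Extract the BERR CPU number and the modules seen on the stack."""
    event = BERR_EVENT.search(log_text)
    frames = FRAME.findall(log_text)
    # Keep each module once, in the order it first appears on the stack.
    modules = list(dict.fromkeys(module for module, _func in frames))
    return {
        "berr_cpu": int(event.group(1)) if event else None,
        "modules": modules,
    }
```

In the example trace, the modules on the stack include glm and sd,
which is consistent with a device driver waiting on a PIO transaction.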

This issue has been traced to a bug in the AXQ ASIC.  An updated AXQ
is planned for a future release; however, the software workaround is
permanent in Solaris, so once it is installed the AXQ revision present
in the system is irrelevant.

The likelihood of encountering the failure is low, but because of the
impact if encountered, implementation of the corrective action is
recommended for all Gold and Platinum customers.


IMPLEMENTATION: 
 
         ---
        |   |   MANDATORY (Fully Pro-Active)
         ---    
         
  
         ---
        | X |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        |   |   REACTIVE (As Required)
         ---
         

CORRECTIVE ACTION:   

The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the above
mentioned problem.

Please follow the guideline below as needed:

   Install Solaris 8 patches 108528 (or newer) and 110838 (or
   newer) on all affected Sun Fire 15K domains.

NOTE: If the customer is participating in the Sun Fire 15K
      DR Beta Program, the DR specific kernel already contains
      an integrated software workaround.       
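
To confirm whether a domain already has both patches, the output of
`showrev -p` can be scanned for the two patch IDs.  The following is a
minimal sketch with hypothetical function names.  It assumes each
installed patch appears on a line beginning "Patch: NNNNNN-NN", and it
does not account for a patch ID that appears only in the Obsoletes
list of a later patch.

```python
import re

# Patch IDs named in the CORRECTIVE ACTION section of this FIN.
REQUIRED_PATCHES = ("108528", "110838")

# Assumed `showrev -p` line format: "Patch: 108528-13 Obsoletes: ..."
PATCH_LINE = re.compile(r"^Patch:\s+(\d{6})-(\d{2})\b", re.MULTILINE)


def installed_revisions(showrev_output: str) -> dict:
    """Map each installed patch ID to its highest revision found."""
    revisions = {}
    for patch_id, rev in PATCH_LINE.findall(showrev_output):
        revisions[patch_id] = max(int(rev), revisions.get(patch_id, 0))
    return revisions


def missing_patches(showrev_output: str) -> list:
    """Return the required patch IDs that are not installed."""
    installed = installed_revisions(showrev_output)
    return [p for p in REQUIRED_PATCHES if p not in installed]
```

An empty list from missing_patches() suggests that both patches are
present on the domain.  Because the FIN accepts any revision of each
patch, the check does not compare revision numbers.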

 
COMMENTS:

None 

--------------------------------------------------------------------------

Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PRO-ACTIVE FINs, Enterprise Services mission critical
     support teams will recommend implementation of the FIN (to their
     respective accounts), at the convenience of the customer.

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@sdpsweb.EBay
--------------------------------------------------------------------------


Copyright (c) 1997-2003 Sun Microsystems, Inc.