Document fins/I0728-1


FIN #: I0728-1

SYNOPSIS: Procedures for dealing with MC Timeout arbstops on E10000 platform

DATE: Oct/12/01

KEYWORDS: Procedures for dealing with MC Timeout arbstops on E10000 platform


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)



SYNOPSIS: Procedures for dealing with MC Timeout arbstops on E10000 
          platform.


Sun Alert:          No

TOP FIN/FCO REPORT: No 
 
PRODUCT_REFERENCE:  Enterprise 10000 Servers  
 
PRODUCT CATEGORY:   Server / Service 


PRODUCTS AFFECTED:  

Systems Affected
----------------
Mkt_ID   Platform   Model   Description             Serial Number
------   --------   -----   -----------             -------------
  -      E10000      ALL    Ultra Enterprise 10000        -


X-Options Affected
------------------
Mkt_ID   Platform   Model   Description   Serial Number
------   --------   -----   -----------   -------------
  -         -         -          -              -


PART NUMBERS AFFECTED:

Part Number            Description                      Model
-----------            -----------                      -----
501-4347-51 or lower   ECB ASSY SYSTEM E10000             -
501-4903-02 or lower   ECB ASSY SYSTEM STARFIRE           -
501-5240-03 or lower   ECB ASSY SYSTEM STARFIRE TSTD      -
501-5693-04 or lower   ECB ASSY SYSTEM SF+ECC TSD         -
501-4786-03 or lower   ECB ASSY SYSTEM E10000 TESTED      -
501-5278-01            ECB ASSY SYSTEM STARFIRE+ TSTD     -
501-5279-01            ECB ASSY SYSTEM STARFIRE+ TSTD     -
501-5959-02 or lower   ASSY CIC1 SB 4X400/8M MSRAM        -
501-5960-02 or lower   ASSY CIC1 SB 4X400/8M MSRAM        -   
501-5934-01            ASSY AUTOBOM TEST2                 - 
501-5935-01            ASSY AUTOBOM TEST3                 -
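
The table above can be checked mechanically. The following is a minimal
sketch, not a supported tool. It assumes the final two digits of a part
number are its dash level, and that "or lower" means any dash level
numerically less than or equal to the one listed. The names AFFECTED and
is_affected are hypothetical and do not come from any Sun utility.

```python
# Rows copied from the PART NUMBERS AFFECTED table.
# Key: base part number. Value: (listed dash level, "or lower" flag).
AFFECTED = {
    "501-4347": (51, True),
    "501-4903": (2, True),
    "501-5240": (3, True),
    "501-5693": (4, True),
    "501-4786": (3, True),
    "501-5278": (1, False),
    "501-5279": (1, False),
    "501-5959": (2, True),
    "501-5960": (2, True),
    "501-5934": (1, False),
    "501-5935": (1, False),
}


def is_affected(part_number):
    """Return True if part_number falls within a row of the table.

    Assumption: the text after the last '-' is a numeric dash level.
    """
    base, _, dash = part_number.strip().rpartition("-")
    if base not in AFFECTED or not dash.isdigit():
        return False
    limit, or_lower = AFFECTED[base]
    level = int(dash)
    # "or lower" rows match any level up to the limit; others match exactly.
    return level <= limit if or_lower else level == limit
```

For example, is_affected("501-4347-40") is treated as affected under the
"or lower" assumption, while is_affected("501-5278-02") is not, because
that row lists a single dash level.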


REFERENCES:

PatchId: 108885 - SSP 3.3: Modify POST/SSP to support CIC2 asic and 
                     new ecache SRAM.
         110304 - SSP 3.4: autoconfig changes required to support new 
                     ecache srams.


ESC:     521379 - New CICd Starfire - two arbstops show DTag Parity errors.
         527175 - Multiple problems on E10K, opening Level 0 escalation.
         530787 - Domain has had 3 arbstops with MC Timeouts. 
         530983 - Repeated Arbstops Timeout Waiting For Data to Match Address.
         531952 - MC Timeout on System Board 0.
         531966 - MC timeout.
         532017 - MC Timeout waiting for data to match address.
         532018 - cdt/GBGPDA3/domain had an arp stop during bootup, need to 
                  analyze.
         532042 - Cu domain getting MC timeout errors.
         532049 - ARBSTOP: Timeout waiting for data to match addressFAIL MC A:.
         532052 - Escalation due to MC timeout arbstop.
         532106 - E10k domain crash with Arbstop.  MC timeout error in arbstop.
         532116 - Arbstop Timeout waiting for data to match address.
         532392 - Domain arbstops: MC 7 Timeout waiting for data to match 
                  address.
         532500 - MC TIMEOUT arbtop, need to escalate per CTE.
         532514 - E10k: MC Timeout.
         532613 - Cu domain crashed and went to single-user mode. Have arbstop 
                  MC Timeout  error.
         532644 - MC Timeout Arbstop occured 2times for a 3days.
         532655 - MC 0 : Timeout waiting for data to match address.
         532686 - MC Timeout Arbstop.
         532717 - Multiple arbstops on domain, one is MC Timeout.
         532743 - MC timeout arbstop on domain.
         532755 - MC Timeout : Timeout waiting for data to match address.
         532787 - MC timeout occured on different domains continuity.
         532789 - MC timeout arbstop.
         532794 - Arbstop: MC Timeout waiting for data to match address.
         532816 - Arbstop - MC Timeout.
         532836 - Arbstop MC TIMEOUT  Timeout waiting for address to match 
                  data.
         532845 - MC timeout with Timeout waiting for data to match address.
         532857 - ErrFlag[27]: Timeout waiting for data to match address; FAIL 
                  MC 7.
         532886 - MC timeout Arbstop E10K detected a fatal error; domain(s) 
                  stopped and restarted.
         532910 - MC TIMEOUT  Timeout waiting for data to match address.
         532946 - ARBSTOP: Timeout waiting for data to match addressFAIL MC 
                  A: follow up.
         532956 - Domain adam got mc timeout arb stop.
         532983 - MC Timeout Arbstop on domain ogmadom1.
         533009 - Timeout waiting for data to match address.
         533099 - MC timeout arbstop.
         533169 - MC Timeouts on 2 domains.

      
PROBLEM DESCRIPTION:    

MC Timeout Arbstop occurrences on E10000 systems can have various 
causes and may be difficult to diagnose.  This FIN describes the
service procedures to follow in the event a customer experiences this
specific problem.

MC Timeout Arbstops are typically identified by the following messages,
alone or in combination, embedded in redx's wfail output:
         
   MC:  Timeout waiting for data to match address

OR:

   MC:  Timeout waiting for address to match data

OR both errors together:

   MC:  Timeout waiting for data to match address
   MC:  Timeout waiting for address to match data

If SSP 3.3 is running with patch 108885, or SSP 3.4 is running with
patch 110304, the following message is also displayed:
     
   MC Timeout: The reporting MC is most likely a victim of
               a transaction dropped by other hardware in the domain,
               which did not detect any error.  The MC reporting the
               timeout is not likely to be the cause of the problem.
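
The message signatures above can be searched for in saved output. The
following is a minimal sketch. It assumes the wfail output has already
been captured as plain text; how it is captured is outside this sketch.
The function names are hypothetical and are not part of SSP or redx.
Only the message strings are taken from this FIN.

```python
import re

# Message texts copied from this FIN, lowercased for matching.
DATA_TO_ADDR = "timeout waiting for data to match address"
ADDR_TO_DATA = "timeout waiting for address to match data"
VICTIM_NOTE = "mc timeout: the reporting mc is most likely a victim"


def find_mc_timeouts(wfail_text):
    """Report which MC Timeout signatures appear in the given text."""
    # Collapse whitespace so a message wrapped across lines still matches,
    # and lowercase so differences in capitalization do not matter.
    flat = re.sub(r"\s+", " ", wfail_text).lower()
    return {
        "data_to_address": DATA_TO_ADDR in flat,
        "address_to_data": ADDR_TO_DATA in flat,
        "victim_note": VICTIM_NOTE in flat,
    }


def is_mc_timeout(wfail_text):
    """True if either of the two timeout messages is present."""
    hits = find_mc_timeouts(wfail_text)
    return hits["data_to_address"] or hits["address_to_data"]
```

The victim_note flag is reported separately because, per the text above,
that message appears only when the listed SSP patch is installed. Its
absence does not rule out an MC Timeout.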
     
Previously, CPRE-HSG asked that all MC Timeout events be escalated.  With
the participation of ES and other organizations, a diagnostic guide for
troubleshooting MC Timeout arbstops is now available at:

     http://cpre-amer.west/esg/hsg/starfire/xftt/mcto.html

Please consult this guide before escalating to CPRE - Highend Servers.

NOTE:  The SSP patches are still considered mandatory.
       Installing patch 108885 (SSP 3.3) or patch 110304 (SSP 3.4) as
       soon as possible is a good proactive step for all customers.
       Recommended SSP patches can be found at:

          http://cpre-amer.west/esg/hsg/starfire/patches.html


IMPLEMENTATION:  
 
         ---
        |   |   MANDATORY (Fully Pro-Active)
         ---    
         
  
         ---
        |   |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        | X |   REACTIVE (As Required)
         ---
         

CORRECTIVE ACTION:

The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the
above-mentioned problem.

Please refer to the following website for further guidance on
troubleshooting MC Timeout arbstops:

     http://cpre-amer.west/esg/hsg/starfire/xftt/mcto.html


COMMENTS: 

None 

--------------------------------------------------------------------------

Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical
     support teams will recommend implementation of the FIN (to their
     respective accounts), at the convenience of the customer.

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist.  Edist can be 
  accessed internally at the following URL: http://edist.corp/.
  
* From there, follow the hyperlink path of "Enterprise Services
  Documentation" and click on "FIN & FCO attachments", then choose the
  appropriate folder, FIN or FCO.  This will display supporting
  directories/files for FINs or FCOs.
   
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@sdpsweb.EBay
---------------------------------------------------------------------------


Copyright (c) 1997-2003 Sun Microsystems, Inc.