Document fins/I0559-2


FIN #: I0559-2

SYNOPSIS: UPDATED FIN. Using Fast Arbitration on E3X00, E4X00, E5X00 systems
          can cause intermittent Fatal Resets.

DATE: Mar/16/00


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)



SYNOPSIS:  UPDATED FIN. Using Fast Arbitration on E3X00, E4X00, 
           E5X00 systems can cause intermittent Fatal Resets.


TOP FIN/FCO REPORT: Yes 
 
PRODUCT_REFERENCE:  EXX00 systems w/OBP prior to 3.2.24
 
PRODUCT CATEGORY:   Server / SW Admin; Server / System CPU Firmware 

PRODUCTS AFFECTED:  
  
Mkt_ID   Platform   Model   Description   Serial Number
------   --------   -----   -----------   -------------
Systems Affected
----------------

  -       E3000      ALL    Ultra Enterprise 3000   -
  -       E3500      ALL    Ultra Enterprise 3500   -
  -       E4000      ALL    Ultra Enterprise 4000   -
  -       E4500      ALL    Ultra Enterprise 4500   -
  -       E5000      ALL    Ultra Enterprise 5000   -
  -       E5500      ALL    Ultra Enterprise 5500   -
 
X-Options Affected
------------------

X2602A   -   -   X-OPT, INT CPU/MEM BD, EXX00       -
X2612A   -   -   X-OPT, INT I/O BD, EXX00 W/FC-AL   -
X2622A   -   -   X-OPT, INT GRAPHICS I/O BD, EXX00  -
X2632A   -   -   X-OPT, INT PCI I/O BD, EXX00       -


PART NUMBERS AFFECTED: 
 
Part Number     Description                       Model
-----------     -----------                       -----
501-4881-03     ASSY, I/O BD, PCI, SUNFIRE+         -
501-4882-02     ASSY, TSTD, CPU BD, SUNFIRE+        -
501-4883-04     ASSY, TSTD, IO BD, SOC+, SUNFIRE+   -
501-4884-04     ASSY, TSTD, IOG, SOC+, SUNFIRE+     -
F501-2976-05    FRU,ASSY,TSTD,CPUBD,SUNFIRE RP      -
F501-4266-06    FRU,TSTD,IO BD,W/SOC+,SUNFIRE       -
F501-4287-04    FRU,ASSY,TSTD,I/O BD, SUNFIRE       -
F501-4288-04    FRU,ASSY,TSTD,IOG BD,SUNFIRE        -
F501-4312-03    FRU,ASSY,TSTD,CPUBD,SUNFIRE RP      -
F501-4882-03    FRU,ASSY,TSTD,CPU BD,SUNFIRE+       -
F501-4883-05    FRU,TSTD,IO BD,SOC+,SUNFIRE+        -
F501-4884-05    FRU,TSTD,IOG,SOC+,SUNFIRE+          -
F501-4926-03    FRU,ASSY,PCI I/O SNFR+ W/5V RISER   -
F501-4325-02    FRU,ASSY,PCI, I/O W/5V RISER C      -


REFERENCES:

BugId: 4298992
PatchId: 103346 or greater
ECO: WO_16770
URL: http://sunsolve.sun.com/pub-cgi/retrieve.pl?doc=enotify/14838
      

PROBLEM DESCRIPTION:

From FIN I0559-1
----------------
The Enterprise Server systems listed above implement an arbitration
mode called fast arbitration and are susceptible to a known problem
when that mode is enabled.  An affected Enterprise Server can be
identified by the following symptom, which appears in the messages
file, on the system's console, or after the server has taken a fatal
reset:

   A FATAL RESET in which the error sent to the console port is
   identified as FTA_PERR, FTC_PERR, or FT_ARBERR, alone or in any
   combination.

These errors occur because the FAST_ARB arbitration mode introduces
200 mV of noise on the system bus.  This noise causes errors to be
generated and could cause several other types of errors that may be
difficult to isolate.  See BugId 4298992 for further details.
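
The check for the three error identifiers can be sketched as a small
log scan.  This is a minimal illustration, not a Sun-supplied tool.
The function names and the default path /var/adm/messages are
assumptions made for the example; only the three identifiers come
from this FIN.

```python
import re

# Error identifiers named in this FIN.
FAST_ARB_ERRORS = ("FTA_PERR", "FTC_PERR", "FT_ARBERR")
_PATTERN = re.compile(r"\b(" + "|".join(FAST_ARB_ERRORS) + r")\b")


def find_fast_arb_errors(lines):
    """Return (line_number, identifier) pairs for every identifier found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in _PATTERN.finditer(line):
            hits.append((number, match.group(1)))
    return hits


def scan_file(path="/var/adm/messages"):
    # Assumed default path; a saved console log can be passed instead.
    with open(path, errors="replace") as handle:
        return find_fast_arb_errors(handle)
```

A non-empty result only shows that one of the identifiers was logged.
It does not by itself confirm that fast arbitration was the cause.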

Fast arbitration allows the bus request and the address to be driven in
the same cycle.  If only one device is requesting the bus, the address
is valid and the data is returned two cycles earlier than in normal
arbitration mode.  If there is more than one requester on the bus, the
address is invalid and the arbitration winner then drives the
address on the bus two cycles after the request.

When more than one requester drives the address, the result is a bus
collision.  This collision generates 200 mV of noise on the bus and
can sometimes cause a FATAL system reset.
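
The timing described above can be expressed as a toy model.  This is
a sketch for illustration only, not a description of the actual bus
hardware.  The function name is hypothetical, and the model assumes
that normal arbitration drives the address two cycles after the
request, which is inferred from the "two cycles earlier" wording
rather than stated directly.

```python
NORMAL_ADDRESS_DELAY = 2  # cycles from request to address (assumed)


def address_cycle(requesters, fast_arb):
    """Return (cycle of the valid address, whether a collision occurred).

    Cycle 0 is the cycle in which the bus request is made.
    """
    if requesters < 1:
        raise ValueError("at least one requester is needed")
    if not fast_arb:
        # Normal mode: arbitrate first, then drive the address.
        return NORMAL_ADDRESS_DELAY, False
    if requesters == 1:
        # Fast mode, single requester: the address is valid immediately.
        return 0, False
    # Fast mode, several requesters: all drive an address in cycle 0
    # (the collision), and the winner re-drives it two cycles later.
    return NORMAL_ADDRESS_DELAY, True
```

In this model fast arbitration never delivers a valid address later
than normal mode.  Its cost is the collision in the contended case,
which is the source of the noise described in this FIN.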
	
The FATAL reset causes the system to crash and causes POST to map out
a system board.  The system remains exposed to the same error
afterward, so the failure can recur.
 
Enterprise 3X00 through 5X00 Server systems that have firmware prior to
3.2.24 have fast arbitration mode enabled and are at risk of failure.
Engineering has determined that not every EXX00 system with a 100 MHz
backplane will experience this problem; however, with the 400 MHz
processor the probability of seeing it on an EXX00 Server is over 80%.
Engineering also notes that this problem has not been experienced on
EXX00 Servers using the 167, 250 and 336 MHz processors, but the
potential for encountering the problem still exists.
  
UPDATES TO FIN I0559-2:
-----------------------
The primary reason to release the -2 version of this FIN is to make
this corrective action mandatory.  Customers who do not implement the
fix are exposed to the risk of downtime due to system crashes.  It is
highly recommended that all affected systems (E3X00, E4X00, E5X00) be
upgraded with the 3.2.24 OBP Patch.

 
IMPLEMENTATION:   
 
         ---
        | X |   MANDATORY (Fully Pro-Active)
         ---    
         
  
         ---
        |   |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        |   |   REACTIVE (As Required)
         ---
         
 

CORRECTIVE ACTION: 

Enterprise Customers and authorized Field Service Representatives may
avoid the above-mentioned problems by following the recommendations
below on Ultra Enterprise Servers E3x00, E4x00 and E5x00:

If the problem has been encountered and boards have been deconfigured
by POST during the boot following this type of Fatal Reset, shut down
the system properly and power-cycle it.  For this problem, power
cycling will bring the boards back online.

To prevent the problem from happening again, carry out the following
steps.

1.  If the suspect EXX00 Server is attached to a StorEdge A5X00
    array, be sure to reference the latest SSA/A5X00 Software/Firmware
    Configuration Matrix at the URL shown above (References).
    After all A5X00 PROM levels have been verified to be current,
    proceed to step 2.

2.  Check whether the OBP PROM version is earlier than 3.2.24 by using
    the command "/usr/sbin/prtconf -V".

3.  Upgrade the flash PROMs to 3.2.24.  This can be done by applying
    patch-id 103346 or greater as shown below:

Patch-ID# 103346 or greater
Keywords: Ultra Enterprise flashprom update 3.2.24 UNIX H/W
Synopsis: Hardware/PROM: Ultra Enterprise 3x00/4x00/5x00/6x00 flashprom update
Date: Jan/07/00

Solaris Release: 2.5.1, 2.6, 7

SunOS Release: 5.5.1, 5.6, 5.7

Unbundled Product: Hardware/PROM 

Unbundled Release: CPU: OBP 3.2.24, POST 3.9.24; IO Type 1/2: FCODE
1.8.24, iPOST 3.4.24; IO Type 3: FCODE 1.8.24, iPOST 3.0.24 ; IO Type
4/5: FCODE 1.8.24, iPOST 3.4.24


    NOTE: This utility is *not* Solaris Release version dependent.
          The list of releases shown under the "Solaris Release" and
          "SunOS Release" sections may not be complete.
          Flash-update-10 or later version can run on any 32bit or 64bit OS.
          Flash-update-09 or earlier version may run on any 32bit OS,
          but not on any 64bit OS.


The above patch is not a standard Solaris patch package.  Read and
follow the "Special Install Instructions" in the README instead of
the standard patch install procedures.
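
The version check in step 2 can be sketched as follows.  This is a
minimal illustration under stated assumptions, not part of the patch.
It assumes that "prtconf -V" prints a line of the form "OBP x.y.z"
followed by a build date, and that Python 3.7 or later is available.
All function names are hypothetical.  The platform list is taken from
the Systems Affected table in this FIN.

```python
import re
import subprocess

FIXED_OBP = (3, 2, 24)
AFFECTED_PLATFORMS = {"E3000", "E3500", "E4000", "E4500", "E5000", "E5500"}


def parse_obp_version(text):
    """Extract an (x, y, z) OBP version tuple from text, or None if absent."""
    match = re.search(r"OBP\s+(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def needs_flash_update(platform, prtconf_output):
    """True when the platform is listed here and OBP is older than 3.2.24."""
    version = parse_obp_version(prtconf_output)
    if version is None:
        raise ValueError("no OBP version found in prtconf output")
    return platform.upper() in AFFECTED_PLATFORMS and version < FIXED_OBP


def read_prtconf():
    # Runs the command named in step 2; only meaningful on the server itself.
    result = subprocess.run(["/usr/sbin/prtconf", "-V"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

The version is compared as a tuple of integers rather than as a
string, so that a release such as 3.2.9 is correctly treated as older
than 3.2.24.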

COMMENTS:

NOTE:  The E6X00 Server does not utilize fast arbitration and
       as a result has been excluded as an affected platform
       in this FIN.

--------------------------------------------------------------------------
Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist.  Edist can be 
  accessed internally at the following URL: http://edist.corp/.
  
* From there, follow the hyperlink path of "Enterprise Services
  Documentation" and click on "FIN & FCO attachments", then choose the
  appropriate folder, FIN or FCO.  This will display supporting
  directories/files for FINs or FCOs.
   
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
---------------------------------------------------------------------------


Copyright (c) 1997-2003 Sun Microsystems, Inc.