Document fcos/A0182-1


FCO #: A0182-1

SYNOPSIS: 18GB and 36GB IBM disk drives experiencing high failure rate in high
          humidity and high temperature environments

DATE: Nov/13/2001

KEYWORDS: 18GB and 36GB IBM disk drives experiencing high failure rate in high
          humidity and high temperature environments


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                               FIELD CHANGE ORDER
                  (For Authorized Distribution by SunService)

                     
SYNOPSIS: 18GB and 36GB IBM disk drives experiencing high
          failure rate in high humidity and high temperature
          environments.
          
Sun Alert: Y
 
TOP FIN/FCO REPORT: Yes
 
PRODUCT_REFERENCE:  IBM Disk Drives 18GB & 36GB

PRODUCT CATEGORY: Storage / Disk

PRODUCT AFFECTED:

Mkt_ID   Platform   Model     Description       Serial Number
------   --------   -----     -----------       -------------
Systems Affected:
------- --------
 -        A14        All      Ultra 2                 -
 -        A20        All      Ultra 450               -
 -        A23        All      Ultra 60                -
 -        A27        All      Ultra 80                -
 -        N04        All      Netra T1120             -
 -        N03        All      Netra T1125             -
 -        N21        All      Netra T1 DC200          -
 -        N14        All      Netra t1405             -
 -        N15        All      Netra t1400             -
 -        E250       All      Ultra Enterprise 250    -
 -        E450       All      Enterprise 450          -
 -        E3000      All      Ultra Enterprise 3000   -
 -        E3500      All      Ultra Enterprise 3500   -
 -        E4000      All      Ultra Enterprise 4000   -
 -        E4500      All      Ultra Enterprise 4500   -
 -        E5000      All      Ultra Enterprise 5000   -
 -        E5500      All      Ultra Enterprise 5500   -
 -        E6000      All      Ultra Enterprise 6000   -
 -        E6500      All      Ultra Enterprise 6500   -
 -        E10000     All      Ultra Enterprise 10000  -
 -        S8         All      Sun Fire 3800           -
 -        S12        All      Sun Fire 4800           -
 -        S12i       All      Sun Fire 4810           -
 -        S24        All      Sun Fire 6800           -


X-Options Affected
--------- -------
 -       st D130          All    Netra st D130             -
 -       A1000            All    StorEdge A1000            -
 -       A13500           All    StorEdge A3500            -
 -       A13500FC         All    StorEdge A3500FC          -
 -       D1000            All    StorEdge D1000            -
 -       T3               All    StorEdge T3               -
 -       D240             All    StorEdge D240             -
 -       MultiPack        All    StorEdge MultiPack        -
 -       st A1000/D1000   All    Netra st A1000/D1000      -
 -       ct 400/800       All    Netra ct 400/800          -
 

AFFECTED PARTS:
Part Number    Description                              Model
-----------    -----------                              -----
540-4401-01    DRV NEBS 18GB 10K 1 SCSI W/S&P             -
540-4921-01    18GB SCSI 10K 1 NEBS SD DRIVE              -
540-4520-01    DRV ASSY 36GB 1 SCSI SPUD&PLAT             -
540-4689-01    DRV NEBS 36GB 10K 1 SCSI W/S&P             -
540-4440-01    ASSY 18GB 10K 1 FC LP W/SLED               -
540-4367-01    ASSY 36GB 10K 1 FC LP W/SLED               -
540-4178-01    DRV 18GB 10K 1 SCSI W/SPUD&PLT             -
540-4177-01    DRV assy 18GB10K 1 SCSI W/SPUD             -
595-5471-01    FRU MEDIA TRAY18.2GB HDD                   -

(SCSI Devices)
Type    Vendor    Model     SerialNumber(Min)    SerialNumber(Max)    Firmware
----    ------    -------   ------------------   ------------------   --------
Disk    IBM       DDYS-T1835         -                   -                 -
Disk    IBM       DDYS-T3695         -                   -                 -
Disk    IBM       DDYF-T1835         -                   -                 -
Disk    IBM       DDYF-T3695         -                   -                 -
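
The model table above can be turned into a simple screening check.  The
following is a minimal sketch only, not a Sun-supplied tool.  It assumes
the vendor and model strings are available in the form shown in the sample
listings in this FCO (for example "DDYS-T18350"); strings reported by
other tools may be formatted differently.  The function name is
hypothetical.

```python
# Model prefixes copied from the (SCSI Devices) table in this FCO.
AFFECTED_MODEL_PREFIXES = (
    "DDYS-T1835",
    "DDYS-T3695",
    "DDYF-T1835",
    "DDYF-T3695",
)


def is_affected_model(vendor, model):
    """Return True if vendor/model matches an affected IBM drive model.

    Prefix matching is an assumption: the sample listings in this FCO
    show model strings with an extra trailing digit (DDYS-T18350).
    """
    vendor_ok = vendor.strip().upper() == "IBM"
    model_ok = model.strip().upper().startswith(AFFECTED_MODEL_PREFIXES)
    return vendor_ok and model_ok


print(is_affected_model("IBM", "DDYS-T18350"))
```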


REFERENCES :

  BugID: 4490041                     
  ESC: 531685                      
  SunAlert: SA-40130                    
  WWStopShip: P200-20006                  
  FIN: I0724-2 
  DPCO: 278.A      

PROBLEM DESCRIPTION :

Any system with 18.2GB and 36GB IBM disk drives may be susceptible
to early life failures.

Failure analysis results have highlighted a significant failure rate
for Drive Not Ready (DNR) on returned IBM 18GB and 36GB disk drives.
These failures have been observed to occur as a result of the disk
drives either being stored or operated in extremely hot and humid
environments for an extended period of time.

Root Cause Analysis has identified several contributing factors leading
to drive failures.  No single factor is sufficient on its own; all of the
factors must be present for the identified DNR failure mode to occur.  The
factors are:

  . Microscopic talcum residue,
  . Disks packaged in systems in drive trays,
  . Exposure to high temperature (30degC or above),
  . High humidity (90% or above) for a period greater than 20 days.

Sample error messages:

     /sbus@7,0/QLGC,isp@0,10000/sd@1,0 (sd46):
     Error for Command: write                   Error Level: Fatal
     Sense Key: Hardware Error
     ASC: 0x2 (no seek complete), ASCQ: 0x0, FRU: 0x0

    10098107  c1t8d0   540-4178-01  DDYS-T18350  01061XE682

     /sbus@7,0/QLGC,isp@0,10000/sd@1,0 (sd46):
     Error for Command: read                    Error Level: Fatal
     Sense Key: Vendor Unique
     ASC: 0x80 (), ASCQ: 0x0, FRU: 0xa

    10102117  c2t10d0  540-4178-01  DDYS-T18350  01061XE522  108305

     /sbus@7,0/QLGC,isp@0,10000/sd@a,0 (sd54):
     Error for Command: read                    Error Level: Fatal
     Sense Key: Media Error
     ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0

    10102117  c1t5d0   540-4178-01  DDYS-T18350  01061XE630  108305

     /sbus@3,0/QLGC,isp@0,10000/sd@0,0 (sd15):
     Error for Command: write                   Error Level: Fatal
     Sense Key: Media Error
     ASC: 0x3 (peripheral device write fault), ASCQ: 0x0, FRU: 0x0

    10088635  c2t13d0  540-4178-01  DDYS-T18350  01061XE750

     /sbus@6,0/QLGC,isp@1,10000/sd@d,0 (sd42):
     Error for Command: load/start/stop         Error Level: Retryable
     Sense Key: Not Ready
     ASC: 0x4 (LUN not ready), ASCQ: 0x0, FRU: 0x0

    10096449  c2t9d0   540-4178-01 DDYS-T18350  01061XE634

The most frequently seen failure is "Drive Not Ready" (DNR).  Affected
drives may also produce excessive read, write or media errors.
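
When many messages like the samples above have been collected, it can
help to tally them by sense key.  The sketch below is illustrative only.
It assumes the messages appear in the layout shown in this FCO, with the
"ASC:" line directly after the "Sense Key:" line; real log files may
carry timestamps or other prefixes.  The function names are hypothetical.

```python
import re
from collections import Counter

# Matches a "Sense Key:" line followed by the "ASC:" line of the same message.
MESSAGE_RE = re.compile(
    r"Sense Key:\s*(?P<key>[^\n]+?)\s*\n[^\n]*?ASC:\s*(?P<asc>0x[0-9a-fA-F]+)"
)


def parse_scsi_errors(log_text):
    """Return a list of (sense_key, asc) tuples found in the log text."""
    return [
        (m.group("key"), int(m.group("asc"), 16))
        for m in MESSAGE_RE.finditer(log_text)
    ]


def count_sense_keys(log_text):
    """Tally messages by sense key, e.g. to see how many are 'Not Ready'."""
    return Counter(key for key, _ in parse_scsi_errors(log_text))


# Two of the sample messages from this FCO.
sample = """\
     Error for Command: load/start/stop         Error Level: Retryable
     Sense Key: Not Ready
     ASC: 0x4 (LUN not ready), ASCQ: 0x0, FRU: 0x0
     Error for Command: read                    Error Level: Fatal
     Sense Key: Media Error
     ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0
"""
print(count_sense_keys(sample))
```

Treating a "Not Ready" sense key as the DNR indicator is an assumption
based on the sample messages; confirm against the failure analysis report
where one is available.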

- IBM builds drives and ships almost the same day, in packaging with
  desiccant packs.

- Only after the drives are assembled in Sun enclosures are they
  susceptible to this problem.

- Drives in enclosures would have to sit in a high temperature, high humidity
  environment for more than 20 days before condensation could become an issue.

- Drives that have been running in a system for more than 90 days should
  not experience this problem.

- If after 90 days the drive is stopped for any period of time and
  NOT exposed to high temperature, high humidity, it will not experience
  this problem.

- Drives in arrays, powered up but not configured may go into a sleep mode.
  If these drives were previously exposed to high temperature, high
  humidity, and were "sleeping" for a period of time, the problem could
  surface when the drives are accessed.

Corrective action was implemented in Manufacturing by purging all suspect
IBM Drives via Worldwide Purge P200-20006 issued on August 25, 2001.  
Corrective action was put in place in Enterprise Services via DPCO# 278 
on January 24, 2002.

A copy of either the Sun Legal approved Customer Letter or the Frequently
Asked Questions document can be accessed via the following URLs;

CUSTOMER LETTER;

  http://sdpsweb.EBay/FIN_FCO/FCO/FCO_A0182-1_Dir/IBM_CUST_Letter31OCT.sdw
  
    Note: To view document click on the above URL, then save to your local
          disk using your Netscape 'file' button and select 'save as', then
          open file locally using StarOffice.
    
FREQUENTLY ASKED QUESTIONS;
    
  http://sdpsweb.EBay/FIN_FCO/FCO/FCO_A0182-1_Dir/Q&A
  
  
PLANNED IMPLEMENTATION COMPLETION DATE: June 30, 2002


IMPLEMENTATION :

 ---
|   |   MANDATORY (Fully Pro-Active)
 ---

 ---
| X |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan)
 ---

 ---
|   |   UPON FAILURE
 ---

REPLACEMENT TIME ESTIMATE : 0.25 hours

SPECIAL CONSIDERATION :

Before proactively replacing any drives the Sun Authorized Field
Representative should complete a Disk Drive Reliability Check as
outlined below.

            ***  Disk Drive Reliability Check  ***

Answer the following questions to help determine if FCO A0182-1 should
be applied.
------------------------------------------------------------------------
#1.  Define the customer's install base by part number.  Use explorer to
     identify the part numbers.

 Product        Part Number        Model        Quantity

Servers         540-4177-01     (18.2GB)        ___________
Dilbert A/D1000 540-4178-01     (18.2GB)        ___________
Avalanche       540-4401-01     (18.2GB)        ___________
NEBS            540-4921-01     (18.2GB)        ___________
T3              540-4440-01     (18.2GB)        ___________
T3              540-4367-01     (36.4GB)        ___________
Dilbert A/D1000 540-4520-01     (36.4GB)        ___________
Avalanche       540-4689-01     (36.4GB)        ___________


#2.  List the number of failures by part number and manufacturer:

Part Number       Model   Quantity      Manufacturer    Failure Mode

540-4177-01     (18.2GB)___________     ___________     ___________
540-4178-01     (18.2GB)___________     ___________     ___________
540-4401-01     (18.2GB)___________     ___________     ___________
540-4921-01     (18.2GB)___________     ___________     ___________
540-4440-01     (18.2GB)___________     ___________     ___________
540-4367-01     (36.4GB)___________     ___________     ___________
540-4520-01     (36.4GB)___________     ___________     ___________
540-4689-01     (36.4GB)___________     ___________     ___________

#3. Are the majority of the drives listed in #2 IBM drives?

    If no, reevaluate the situation and send the drives in for failure
    analysis via CPAS; this FCO does not apply to this case.

#4. Does the number of drive failures exceed the expected failure
    rate?   (See chart below)

    Note: an example follows.

 |-----------------------------------------------------------|
 | # of drives    | 3 months | 6 months| 9 months| 12 months |
 |  on site       |          |         |         |           |
 |===========================================================|
 |     300        |    1-2   |  3-4    |   4-5   |     6     |
 |----------------|----------|---------|---------|-----------|
 |     400        |     2    |   4     |    6    |     8     |
 |----------------|----------|---------|---------|-----------|
 |     500        |    2-3   |  5-6    |   7-8   |     10    |
 |----------------|----------|---------|---------|-----------|
 |    1,000       |     5    |   10    |    15   |     20    |
 |----------------|----------|---------|---------|-----------|
 |    1,500       |    8-9   |  14-15  |    24   |     30    |
 |----------------|----------|---------|---------|-----------|
 |total  failures |          |         |         |           |
 |-----------------------------------------------------------|

 Example:

 In this example customer X has the following IBM
 drives in the data center:

 Quantity   drive type            install          # of
                                  time             failures

 300        540-4178-01 (18.2GB)   approx 9 months    9

 500        540-4520-01 (36.4GB)   approx 9 months    14

 400        540-4520-01 (36.4GB)   approx 12 months   11

 That's 300 18GB drives installed for nine months, and also
 500 36GB drives installed for nine months.  The customer also
 has 400 36GB drives that have been installed for about twelve
 months.

 Example: Fill in a chart defining your customer's information:

                   -------------------------------------------
                  | List # of failures during install period |
 |------------------------------------------------------------
 | # of drives    | 3 months | 6 months| 9 months| 12 months |
 |  on site       |          |         |         |           |
 |===========================================================|
 |     300        |          |         |     9   |           |
 |----------------|----------|---------|---------|-----------|
 |     400        |          |         |         |     11    |
 |----------------|----------|---------|---------|-----------|
 |     500        |          |         |    14   |           |
 |----------------|----------|---------|---------|-----------|
 |    1,000       |          |         |         |           |
 |----------------|----------|---------|---------|-----------|
 |    1,500       |          |         |         |           |
 |----------------|----------|---------|---------|-----------|
 |total  failures |          |         |    23   |     11    |
 |-----------------------------------------------------------|

 Compare the customer's drive reliability profile with the expected
 failures, as in the chart below.
 --------------------------------------------------------------
 | Drives/install time  | Expected failures | Actual failures |
 |============================================================|
 | 300/9 months         |    4-5            |       9         |
 |----------------------|-------------------|-----------------|
 | 500/9 months         |    7-8            |      14         |
 |----------------------|-------------------|-----------------|
 | 400/12 months        |    8              |      11         |
 |------------------------------------------------------------|

 Looking at the numbers we can now answer the question, "Does the number
 of drive failures exceed the expected failure rate?"

 In this example the answer is YES, the customer's failure rate exceeds
 the expected.

   We would expect no more than 4-5 18GB drives to fail in 9 months,
   the customer had 9.

   We would expect no more than 7-8 36GB drives to fail in 9 months,
   the customer had 14.

   We would expect no more than 8 36GB drives to fail in 12 months,
   the customer had 11.
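
 The comparison in this example can be sketched as a table lookup.  This
 is a minimal illustration, not a Sun-supplied tool.  The values are the
 upper bounds copied from the expected failure chart in question #4; the
 dictionary and function names are hypothetical.  Only drive counts and
 install periods that appear in the chart are supported, because this FCO
 does not define how to interpolate between them.

```python
# Upper bound of expected failures, copied from the chart in question #4.
# Outer key: number of drives on site.  Inner key: months installed.
EXPECTED_MAX_FAILURES = {
    300: {3: 2, 6: 4, 9: 5, 12: 6},
    400: {3: 2, 6: 4, 9: 6, 12: 8},
    500: {3: 3, 6: 6, 9: 8, 12: 10},
    1000: {3: 5, 6: 10, 9: 15, 12: 20},
    1500: {3: 9, 6: 15, 9: 24, 12: 30},
}


def exceeds_expected(drives_on_site, months_installed, actual_failures):
    """Return True if actual failures exceed the chart's upper bound.

    Raises KeyError for drive counts or periods not listed in the chart.
    """
    expected = EXPECTED_MAX_FAILURES[drives_on_site][months_installed]
    return actual_failures > expected


# The three rows of the customer X example: (drives, months, failures).
example = [(300, 9, 9), (500, 9, 14), (400, 12, 11)]
for drives, months, failures in example:
    print(drives, months, failures, exceeds_expected(drives, months, failures))
```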


#5. Is there a possibility the systems were stored in a high-heat (30degC
    or above) and high-humidity (90% or above) environment for 20 days or
    more?

    Examples include:
    At customs, at a reseller, or in a non-air-conditioned data or storage
    area.

    If yes, proceed to number 6.

    If no, reevaluate the situation and send the drives in for failure
    analysis; this FCO may not apply.

#6. Have the majority of the drives failed due to DNR errors? Y or N

    If yes to #'s 4, 5 and 6, it is recommended that the IBM drives be
    replaced per this FCO.

    If no, reevaluate the situation and send the drives in for failure
    analysis; this FCO may not apply.

    NOTE: You must run the explorer script to identify all IBM drives.
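
The outcome of questions #3 through #6 can be summarized as a small
decision function.  This is a sketch of the logic as written in the check
above, not an official tool.  The class, field and function names are
hypothetical, and the answers themselves still have to come from the
field representative.

```python
from dataclasses import dataclass


@dataclass
class ReliabilityCheck:
    majority_ibm: bool            # question #3
    exceeds_expected_rate: bool   # question #4
    possible_heat_humidity: bool  # question #5
    majority_dnr: bool            # question #6


def recommend(check):
    """Return the action suggested by the Disk Drive Reliability Check."""
    if not check.majority_ibm:
        # Question #3 answered "no": the FCO does not apply to this case.
        return "FCO does not apply: send drives for failure analysis via CPAS"
    if (check.exceeds_expected_rate
            and check.possible_heat_humidity
            and check.majority_dnr):
        # Yes to #4, #5 and #6: replacement is recommended.
        return "replace IBM drives per FCO"
    # Any other combination: reevaluate before swapping drives.
    return "FCO may not apply: reevaluate and send drives for failure analysis"


print(recommend(ReliabilityCheck(True, True, True, True)))
```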
    

CORRECTIVE ACTION :

IMPORTANT! Please follow the Disk Drive Reliability Check listed under
the Special Consideration section of this FCO prior to implementing
any proactive swap activity.

Upon failure or upon customer need replace as follows;

 replace 540-4177-01 (IBM Only) with 540-4177-01 (Non IBM)

 replace 540-4178-01 (IBM Only) with 540-4178-01 (Non IBM)

 replace 540-4921-01 (IBM Only) with 540-4921-01 (Non IBM)

 replace 540-4440-01 (IBM Only) with 540-4440-01 (Non IBM)

 replace 540-4367-01 (IBM Only) with 540-4367-01 (Non IBM)

 replace 540-4520-01 (IBM Only) with 540-4520-01 (Non IBM)

NEBS Compliance:

If maintaining NEBS3 Compliance is essential to your customer, it
is recommended that proactive swaps NOT be implemented unless
absolutely necessary, as there are limited NEBS3 materials available.
For NEBS3 Compliance, replace either the 18GB or 36GB drive with
the new NEBS3 Compliant 36GB drive as follows;

 replace 540-4401-01 with 540-5160-01

 replace 540-4689-01 with 540-5160-01

If maintaining NEBS3 Compliance is NOT essential to your customer
replace as follows:

 replace 540-4401-01 (IBM Only) with 540-4401-01 (Non IBM)

 replace 540-4689-01 (IBM Only) with 540-4689-01 (Non IBM)

For proactive replacement of non-failed drives, mark the Defective Material
Tag (DMT) with the letters "FCO" in bold.  For failed drives, mark the DMT
as usual with the failure information.
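
The replacement rules above can be expressed as a lookup.  The following
is a minimal sketch, with part numbers copied from this section and a
hypothetical function name.  A returned part number equal to the input
means "same part number, non-IBM drive".  Parts with no replacement listed
in this section are rejected rather than guessed.

```python
# Parts replaced by the same part number sourced from a non-IBM vendor.
SAME_PART_NON_IBM = {
    "540-4177-01",
    "540-4178-01",
    "540-4921-01",
    "540-4440-01",
    "540-4367-01",
    "540-4520-01",
}

# NEBS parts: the replacement depends on whether NEBS3 Compliance is needed.
NEBS_PARTS = {"540-4401-01", "540-4689-01"}
NEBS3_REPLACEMENT = "540-5160-01"


def replacement_part(part_number, nebs3_required=False):
    """Return the replacement part number for an affected IBM drive."""
    if part_number in NEBS_PARTS:
        return NEBS3_REPLACEMENT if nebs3_required else part_number
    if part_number in SAME_PART_NON_IBM:
        return part_number
    raise ValueError("no replacement listed in this FCO for " + part_number)


print(replacement_part("540-4401-01", nebs3_required=True))
```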

COMMENTS :

IMPORTANT! SECURE SITE ACTIVITY

Below are the instructions for implementing this FCO at Secure Sites
where no drive will be returned.  IBM will require documentation that
meets the following requirements:

1) Documentation on Customer or Government letterhead.  If the customer does
   not wish to use their letterhead, or Sun does not wish to disclose who
   the customer is, this note can be on Sun letterhead.

2) Documentation should be addressed to:

        Daria Casey
        IBM
        5600 Cottle Road
        Dept LJK, Building 010
        San Jose, CA  95193

3) Documentation should also be faxed to Daria at (408) 979-1344 prior to
   mailing the original.

4) The documentation should state that the parts contain classified
   information, cannot be returned, and are being scrapped.  The note
   should then list the part number and serial number of all drives being
   scrapped.

BILLING TYPE:

 Warranty: Sun will provide parts at no charge under Warranty
           Service. On-Site Labor Rates are based on how the
           system was initially installed.

 Contract: Sun will provide parts at no charge. On-Site Labor Rates
           are based on the type of service contract.

 Non Contract: Sun will provide parts at no charge. Installation by
               Sun is available based on the On-Site Labor Rates
               defined in the Price List.

--------------------------------------------------------------------------
Implementation Footnote:
________________________

i)   In case of Mandatory FCOs, Enterprise Services will attempt to contact
     all known customers to recommend the part upgrade.

ii)  For controlled proactive swap FCOs, Enterprise Services mission critical
     support teams will initiate proactive swap efforts for their respective
     accounts, as required.

iii) For Replace upon Failure FCOs, Enterprise Services partners will
     implement the necessary corrective actions as and when they are required.

--------------------------------------------------------------------------

All released FINs and FCOs can be accessed using your favorite network
browser as follows:

SunWeb Access:
______________

* Access the top level URL of http://sdpsweb.EBay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.

SunSolve Online Access:
_______________________

* Access the SunSolve Online URL at http://sunsolve.Central/

* From there, select the appropriate link to browse the FIN or FCO index.

Internet Access:
_______________

* Access the top level URL of https://infoserver.Sun.COM

--------------------------------------------------------------------------
General:
________

Send questions or comments to finfco-manager@sdpsweb.EBay

---------------------------------------------------------------------------



Copyright (c) 1997-2003 Sun Microsystems, Inc.