Document fins/I0650-1


FIN #: I0650-1

SYNOPSIS: Seagate drive f/w prior to A72x may cause fatal drive timeouts on T3

DATE: May/15/01

KEYWORDS: Seagate drive f/w prior to A72x may cause fatal drive timeouts on T3


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)



SYNOPSIS: Sun StorEdge T3 Arrays containing Seagate Cheetah 4 18GB/36GB/73GB
	  drives with firmware versions prior to A72x may experience
	  severe performance degradation due to fatal drive timeouts.
              

TOP FIN/FCO REPORT: Yes 
 
PRODUCT_REFERENCE:  Cheetah Disk Drive Firmware  
 
PRODUCT CATEGORY:   Storage / SW Admin 

PRODUCTS AFFECTED:  
 
Mkt_ID   Platform   Model   Description                 Serial Number
------   --------   -----   -----------                 -------------
Systems Affected
----------------
   -     Anysys      ALL    System Platform Independent       -

X-Options Affected
------------------
   -       T3        ALL    StorEdge T3 Array                 -
X6713A      -         -     FC-AL 18.2GB 10KRPM 1" Disk       -    
X6714A      -         -     FC-AL 36.4GB 10KRPM 1.6" Disk     -
X6716A      -         -     FC-AL 18.2GB 10KRPM 1.6" Disk     -
X6717A      -         -     FC-AL 72.8GB 10KRPM 1.6" Disk     -


PART NUMBERS AFFECTED: 
----------------------
Part Number   Description              Model        Type   Vendor    Firmware
-----------   -----------              -----        ----   ------    --------
540-4440-01   18GB Assembly/FRU          -     
540-4367-01   36GB Assembly/FRU          -
540-4519-01   73GB Assembly/FRU          -
390-0053-01   Seagate ST318304FC 18GB  ST318304FC   Disk   Seagate    A726
390-0056-01   Seagate ST336704FC 36GB  ST336704FC   Disk   Seagate    A726
390-0036-01   Seagate ST173404FC 73GB  ST173404FC   Disk   Seagate    A727


REFERENCES:

BugId:   4411125 - Fatal drive timeouts w/ Seagate Cheetah 4 drives
                   cause poor I/O perf.

PatchId: 109115: Seagate Cheetah 4 Drive Firmware Update.
         110760: Seagate Cheetah 4 Drive Firmware Update.

ECO:     WO_20125

ESC:     528987 
         529160 
         529745 
         529874 
         530153 
         530172 
         530174

FIN:     I0609-3

URL:     http://infoserver.central/data/sshandbook/Systems/T3/components.html

Sun Alert: SA-26814

      
PROBLEM DESCRIPTION: 

Some customers have experienced serious intermittent performance
degradation associated with fatal drive timeouts on Sun StorEdge T3
Arrays with Seagate Cheetah 4 18GB/36GB/73GB disk drives.  In most
cases, the fatal drive timeouts were eventually corrected by the T3
controller firmware.  However, by the time corrective actions (e.g.
LIPs or Bus Resets) had completed and the transients had subsided, the
performance impact on the host was large enough that applications
could fail with fatal errors.  In most cases, application performance
would return to acceptable levels after resetting the affected T3
array(s) or host system(s), but would eventually degrade again.

Any Sun StorEdge T3 Array with Seagate Cheetah 4 drives is susceptible
to this problem. Typical T3 syslog messages associated with this
problem are shown below:

  Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5
  Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] QLCF_ABORT_ALL_CMDS: Command
                              Timeout Pre-Gauntlet Initiated
  Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Received LIP(f7,e8) async event
  Feb 05 08:11:50 ISR1[1]: N: u1ctr ISP2100[0] Received LIP(f7,e8) async event
  Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5
  Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] QLCF_START_GAUNTLET: Command
                              Timeout Gauntlet Initiated
  Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] Received unexpected LIP Reset
  Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] qlcf_i_mbox_cmd_complete 
                              (cmd = 0x6c, stat = 0x4000)
  Feb 05 08:12:06 ISR1[2]: N: u2d5 SVD_CHECK_ERROR: Command Timeout Detected
                              (path = 0)
  Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] qlcf_i_mbox_cmd_complete 
                              (cmd = 0x18, stat = 0x4000)
  Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] QLCF_I_MBOX_CMD_COMPLETE: 
                              Command Timeout Gauntlet Completed - SUCCESS
  Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] Received LIP(ff,e8) async event
  Feb 05 08:12:10 ISR1[1]: N: u1ctr ISP2100[0] Received unexpected LIP Reset
  Feb 05 08:12:10 ISR1[1]: N: u1ctr ISP2100[0] Received LIP(ff,e8) async event
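
As a rough aid for spotting this signature in a saved copy of the T3
syslog, the sketch below counts "Fatal timeout on uXdY" messages per
drive. It assumes only the message wording shown in the sample above;
the function name and the sample list are illustrative and are not part
of any Sun tool.

```python
import re
from collections import Counter

# Matches the drive id in lines such as:
#   Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5
FATAL_TIMEOUT = re.compile(r"Fatal timeout on (u\d+d\d+)")


def count_fatal_timeouts(lines):
    """Return a Counter mapping drive id (e.g. 'u2d5') to timeout count."""
    counts = Counter()
    for line in lines:
        match = FATAL_TIMEOUT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines copied from the syslog excerpt in this FIN.
sample_lines = [
    "Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5",
    "Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Received LIP(f7,e8) async event",
    "Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5",
]

for drive, count in count_fatal_timeouts(sample_lines).items():
    print(drive, count)
```

Repeated timeouts against drives of the affected types are a hint, not
proof, that this FIN applies; the firmware revision must still be
checked.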

To determine the type of disk drives in a T3 array, use "disk version
u[1-2]d[1-9]" from a T3 command prompt. Examine the columns labeled
"PRODUCT" and "REVISION". If the T3 contains any of the following drive
types with firmware versions *earlier* than those listed below, the
disks may encounter this problem:

       --------------------------------------------------------
      | Drive Type          |   PRODUCT           |   REVISION |
      |========================================================|
      | 18 GB LP Cheetah 4  |   ST318304FSUN18G   |     A726   |
      |                     |                     |            |
      | 36 GB LP Cheetah 4  |   ST336704FSUN36G   |     A726   |
      |                     |                     |            |
      | 73 GB LP Cheetah 4  |   ST173404FSUN73G   |     A727   |
       --------------------------------------------------------
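
The comparison against the table can be sketched as follows. This is a
minimal illustration, not a supported utility: the function name is
hypothetical, the PRODUCT and REVISION values are assumed to have been
read by hand from the "disk version" output, and plain string
comparison of revision codes is assumed to match release order.

```python
# Minimum fixed revisions, taken from the table in this FIN.
MIN_FIXED_REVISION = {
    "ST318304FSUN18G": "A726",
    "ST336704FSUN36G": "A726",
    "ST173404FSUN73G": "A727",
}


def needs_firmware_update(product, revision):
    """Return True if a listed Cheetah 4 drive has pre-fix firmware.

    ASSUMPTION: revision strings are the same length and sort in
    release order under plain string comparison. The FIN does not
    confirm this; verify against the patch README before relying on it.
    """
    minimum = MIN_FIXED_REVISION.get(product.strip())
    if minimum is None:
        return False  # not one of the drive types listed in this FIN
    return revision.strip().upper() < minimum


# "A526" is a made-up older revision used only for illustration.
print(needs_firmware_update("ST318304FSUN18G", "A526"))
print(needs_firmware_update("ST318304FSUN18G", "A726"))
```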

The Seagate Cheetah 4 drives contain a circular command buffer that
queues commands from the FC loop in the order they are received. In
certain cases, the pointer that keeps track of the current depth of the
queue is not properly incremented. Once this 'failure to increment' is
encountered, the drive remains one command behind until a target reset
occurs.

As an example, if the drive currently has a queue depth of 20 and the
drive does not increment the pointer, the drive firmware will only be
actively working on 19 commands.  In this case, it is very unlikely
that the problem will be noticed because the command buffer remains
full. Conversely, if the drive currently has a queue depth of 1 when
the problem occurs, there may be a noticeable delay and drive timeouts
as commands are not executed until the next one is received. 
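
The effect of queue depth can be illustrated with a deliberately
simplified model. This is not the drive firmware; it only assumes that,
once the pointer lags by one, the newest command stays unseen until the
next command arrives. All names and arrival times are hypothetical.

```python
def visible_times(arrivals_ms, pointer_lags=False):
    """Time (ms) at which each command becomes visible to the drive.

    arrivals_ms: sorted arrival times of commands on the FC loop.
    pointer_lags: if True, the queue pointer is one behind, so each
    command is only seen when the following command arrives.
    None means the command is never seen within this trace.
    """
    if not pointer_lags:
        return list(arrivals_ms)
    return list(arrivals_ms[1:]) + [None]


def delays(arrivals_ms, pointer_lags=False):
    """Extra wait (ms) before each command is seen; None if never seen."""
    seen = visible_times(arrivals_ms, pointer_lags)
    return [None if s is None else s - a for a, s in zip(arrivals_ms, seen)]


busy = [0, 1, 2, 3]        # deep queue: commands arrive back to back
sparse = [0, 5000, 30000]  # queue depth of about 1: long gaps

print(delays(busy, pointer_lags=True))    # [1, 1, 1, None]
print(delays(sparse, pointer_lags=True))  # [5000, 25000, None]
```

Under heavy load the added wait is negligible, whereas with sparse
traffic each command waits for the whole gap until the next arrival,
which is long enough to be reported as a command timeout.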

Seagate has released new drive firmware for the Cheetah 4 disk drives
that corrects the circular command buffer problem. This drive firmware
is available on SunSolve in the T3 Array patches listed below.


IMPLEMENTATION:  
 
         ---
        |   |   MANDATORY (Fully Pro-Active)
         ---    
         
  
         ---
        | X |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        |   |   REACTIVE (As Required)
         ---
         

CORRECTIVE ACTION: 

The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives whose customers may be at
risk of encountering the above-mentioned problem.

A) Customers experiencing unexplained performance degradation
   accompanied by fatal drive timeout notices in the T3 syslog in 
   non-clustered environments should upgrade their T3 firmware in 
   accordance with patchId# 109115 (f/w v1.16c) or later.  
   BugId# 4411125 is fixed with the drive firmware included with 
   this patch.

OR

B) Customers experiencing unexplained performance degradation accompanied by
   fatal drive timeout notices in the T3 syslog in clustered environments 
   (e.g. SunCluster and Veritas Cluster Server) should upgrade their T3
   firmware in accordance with patchId# 110760 (f/w v1.16a) or later.

Although this problem is corrected via new drive firmware, the
associated T3 EPROM, loop card and controller firmware must also be
updated to the versions listed in the patches above, since all four T3
firmware components are tested and released together.  The patches
listed above contain complete instructions for performing all
applicable T3 firmware upgrades.


COMMENTS:  
    
------------------------------------------------------------------------------ 


Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist.  Edist can be 
  accessed internally at the following URL: http://edist.corp/.
  
* From there, follow the hyperlink path of "Enterprise Services
  Documentation" and click on "FIN & FCO attachments", then choose the
  appropriate folder, FIN or FCO.  This will display supporting
  directories/files for FINs or FCOs.
   
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
---------------------------------------------------------------------------
                                                        


Copyright (c) 1997-2003 Sun Microsystems, Inc.