Document fins/I0650-1
FIN #: I0650-1
SYNOPSIS: Seagate drive f/w prior to A72x may cause problems on T3
DATE: May/15/01
KEYWORDS: Seagate drive f/w prior to A72x may cause problems on T3
---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by SunService)
SYNOPSIS: Sun StorEdge T3 Arrays containing Seagate Cheetah 4 18GB/36GB/73GB
drives with firmware versions prior to A72x may experience
severe performance degradation due to fatal drive timeouts.
TOP FIN/FCO REPORT: Yes
PRODUCT_REFERENCE: Cheetah Disk Drive Firmware
PRODUCT CATEGORY: Storage / SW Admin
PRODUCTS AFFECTED:
Mkt_ID   Platform   Model   Description                      Serial Number
------   --------   -----   -----------                      -------------
Systems Affected
----------------
  -      Anysys     ALL     System Platform Independent            -
X-Options Affected
------------------
  -      T3         ALL     StorEdge T3 Array                      -
X6713A   -          -       FC-AL 18.2GB 10KRPM 1" Disk            -
X6714A   -          -       FC-AL 36.4GB 10KRPM 1.6" Disk          -
X6716A   -          -       FC-AL 18.2GB 10KRPM 1.6" Disk          -
X6717A   -          -       FC-AL 72.8GB 10KRPM 1.6" Disk          -
PART NUMBERS AFFECTED:
----------------------
Part Number   Description               Model        Type   Vendor    Firmware
-----------   -----------               -----        ----   ------    --------
540-4440-01   18GB Assembly/FRU         -
540-4367-01   36GB Assembly/FRU         -
540-4519-01   73GB Assembly/FRU         -
390-0053-01   Seagate ST318304FC 18GB   ST318304FC   Disk   Seagate   A726
390-0056-01   Seagate ST336704FC 36GB   ST336704FC   Disk   Seagate   A726
390-0036-01   Seagate ST173404FC 73GB   ST173404FC   Disk   Seagate   A727
REFERENCES:
BugId: 4411125 - Fatal drive timeouts on w/ Seagate Cheetah 4 drives
causes poor I/O perf.
PatchId: 109115: Seagate Cheetah 4 Drive Firmware Update.
110760: Seagate Cheetah 4 Drive Firmware Update.
ECO: WO_20125
ESC: 528987
529160
529745
529874
530153
530172
530174
FIN: I0609-3
URL: http://infoserver.central/data/sshandbook/Systems/T3/components.html
Sun Alert: SA-26814
PROBLEM DESCRIPTION:
Some customers have experienced serious intermittent performance
degradation associated with fatal drive timeouts on Sun StorEdge T3
Arrays with Seagate Cheetah 4 18GB/36GB/73GB disk drives. In most
cases, the fatal drive timeouts were eventually corrected by the T3
controller firmware. However, by the time corrective actions (e.g.
LIPs or Bus Resets) had completed and the transients had subsided, the
performance impact was large enough that the host application could
fail with a fatal error. In most cases, application performance would
return to acceptable levels after resetting the affected T3 array(s) or
host system(s), but would eventually degrade again.
Any Sun StorEdge T3 Array with Seagate Cheetah 4 drives is susceptible
to this problem. Typical T3 syslog messages associated with this
problem are shown below:
Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5
Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] QLCF_ABORT_ALL_CMDS: Command
Timeout Pre-Gauntlet Initiated
Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Received LIP(f7,e8) async event
Feb 05 08:11:50 ISR1[1]: N: u1ctr ISP2100[0] Received LIP(f7,e8) async event
Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5
Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] QLCF_START_GAUNTLET: Command
Timeout Gauntlet Initiated
Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] Received unexpected LIP Reset
Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] qlcf_i_mbox_cmd_complete
(cmd = 0x6c, stat = 0x4000)
Feb 05 08:12:06 ISR1[2]: N: u2d5 SVD_CHECK_ERROR: Command Timeout Detected
(path = 0)
Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] qlcf_i_mbox_cmd_complete
(cmd = 0x18, stat = 0x4000)
Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] QLCF_I_MBOX_CMD_COMPLETE:
Command Timeout Gauntlet Completed - SUCCESS
Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] Received LIP(ff,e8) async event
Feb 05 08:12:10 ISR1[1]: N: u1ctr ISP2100[0] Received unexpected LIP Reset
Feb 05 08:12:10 ISR1[1]: N: u1ctr ISP2100[0] Received LIP(ff,e8) async event
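When reviewing a large syslog, it may help to tally the "Fatal timeout"
notices per drive. The following is a minimal Python sketch, assuming the
syslog has been copied off the array and that every notice follows the
"Fatal timeout on u<unit>d<disk>" wording shown above. The function name
count_fatal_timeouts and the SAMPLE list are illustrative only and are not
part of any T3 tool.

```python
import re
from collections import Counter

# Matches the "Fatal timeout on u<unit>d<disk>" notices shown in this FIN.
FATAL_TIMEOUT = re.compile(r"Fatal timeout on (u\d+d\d+)")


def count_fatal_timeouts(lines):
    """Return a Counter mapping drive ID (e.g. 'u2d5') to fatal timeout count."""
    counts = Counter()
    for line in lines:
        match = FATAL_TIMEOUT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines taken from the syslog excerpt in this FIN.
SAMPLE = [
    "Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5",
    "Feb 05 08:11:46 ISR1[2]: N: u2ctr ISP2100[0] Received LIP(f7,e8) async event",
    "Feb 05 08:12:06 ISR1[2]: N: u2ctr ISP2100[0] Fatal timeout on u2d5",
]

if __name__ == "__main__":
    for drive, count in count_fatal_timeouts(SAMPLE).items():
        print(drive, count)
```

A drive that appears repeatedly in the tally is a candidate for the
firmware revision check described below.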
To determine the type of disk drives in a T3 array, use "disk version
u[1-2]d[1-9]" from a T3 command prompt. Examine the columns labeled
"PRODUCT" and "REVISION". If the T3 contains any of the following drive
types with firmware versions *earlier* than those listed below, the
disks may encounter this problem:
 --------------------------------------------------------
| Drive Type             | PRODUCT          | REVISION   |
|========================================================|
| 18 GB LP Cheetah 4     | ST318304FSUN18G  | A726       |
|                        |                  |            |
| 36 GB LP Cheetah 4     | ST336704FSUN36G  | A726       |
|                        |                  |            |
| 73 GB LP Cheetah 4     | ST173404FSUN73G  | A727       |
 --------------------------------------------------------
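The comparison against the table above can be sketched in Python as
follows. This is an illustration under stated assumptions, not a supported
tool: it assumes the PRODUCT and REVISION values have already been read
from the "disk version" output by hand, and that revision strings of equal
length can be ordered by plain string comparison, which this notice does
not confirm. The names MIN_REVISION and is_susceptible are hypothetical.

```python
# Minimum corrected revisions, copied from the table in this FIN.
MIN_REVISION = {
    "ST318304FSUN18G": "A726",
    "ST336704FSUN36G": "A726",
    "ST173404FSUN73G": "A727",
}


def is_susceptible(product, revision):
    """Return True if a listed Cheetah 4 drive reports an earlier revision.

    Assumption: revisions compare correctly as strings (e.g. 'A626' < 'A726').
    Drive types not in the table are reported as not susceptible.
    """
    minimum = MIN_REVISION.get(product.strip())
    if minimum is None:
        return False  # not one of the drive types covered by this FIN
    return revision.strip() < minimum


if __name__ == "__main__":
    # "A626" is a made-up revision used only to exercise the comparison.
    print(is_susceptible("ST318304FSUN18G", "A626"))
    print(is_susceptible("ST173404FSUN73G", "A727"))
```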
The Seagate Cheetah 4 drives contain a circular command buffer that
queues commands from the FC loop in the order they are received. In
certain cases, the pointer that keeps track of the current depth of the
queue is not properly incremented. Once this 'failure to increment' is
encountered, the drive remains one command behind until a target reset
occurs.
As an example, if the drive currently has a queue depth of 20 and the
drive does not increment the pointer, the drive firmware will only be
actively working on 19 commands. In this case, it is very unlikely
that the problem will be noticed because the command buffer remains
full. Conversely, if the drive currently has a queue depth of 1 when
the problem occurs, there may be a noticeable delay and drive timeouts,
because the pending command is not executed until the next one is
received.
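The "one command behind" behavior described above can be illustrated with
a toy model. The Python sketch below is not Seagate's firmware logic; the
class ToyDrive, its fields, and the skip_increment flag are invented for
illustration. It models only a depth counter that misses one increment,
and it omits the target reset.

```python
class ToyDrive:
    """Toy model of a command queue whose depth counter can miss an increment."""

    def __init__(self):
        self.received = []    # commands received from the FC loop, in order
        self.visible = 0      # how many received commands the firmware "sees"
        self.completed = []   # commands the drive has executed

    def receive(self, command, skip_increment=False):
        """Queue a command; skip_increment=True models the firmware fault."""
        self.received.append(command)
        if not skip_increment:
            self.visible += 1

    def run(self):
        """Execute every command the depth counter currently covers."""
        while len(self.completed) < self.visible:
            self.completed.append(self.received[len(self.completed)])

    def stalled(self):
        """Return commands that were received but not yet executed."""
        return self.received[len(self.completed):]


if __name__ == "__main__":
    drive = ToyDrive()
    drive.receive("cmd-1", skip_increment=True)  # fault hits at queue depth 1
    drive.run()
    print(drive.stalled())    # cmd-1 waits: the counter says the queue is empty
    drive.receive("cmd-2")
    drive.run()
    print(drive.completed, drive.stalled())  # cmd-1 runs only now; cmd-2 waits
```

With a light workload the stalled command may wait long enough to time
out, which matches the low-queue-depth case described above. With a busy
queue the same fault leaves only the newest command waiting, so it tends
to go unnoticed.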
Seagate has released new drive firmware for the Cheetah 4 disk drives
that corrects the circular command buffer problem. This drive firmware
is available on SunSolve in the T3 Array patches listed below.
IMPLEMENTATION:
---
| | MANDATORY (Fully Pro-Active)
---
---
| X | CONTROLLED PRO-ACTIVE (per Sun Geo Plan)
---
---
| | REACTIVE (As Required)
---
CORRECTIVE ACTION:
The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives whose customers may be at
risk of encountering the above-mentioned problem.
A) Customers experiencing unexplained performance degradation
accompanied by fatal drive timeout notices in the T3 syslog in
non-clustered environments should upgrade their T3 firmware in
accordance with patchId# 109115 (f/w v1.16c) or later.
BugId# 4411125 is fixed with the drive firmware included with
this patch.
OR
B) Customers experiencing unexplained performance degradation
accompanied by fatal drive timeout notices in the T3 syslog in
clustered environments (e.g. SunCluster and Veritas Cluster Server)
should upgrade their T3 firmware in accordance with patchId# 110760
(f/w v1.16a) or later.
Although this problem is corrected via new drive firmware, the
associated T3 EPROM, loop card and controller firmware must also be
updated to the versions listed in the patches above, since all four T3
firmware components are tested and released together. The patches
listed above contain complete instructions for performing all
applicable T3 firmware upgrades.
COMMENTS:
------------------------------------------------------------------------------
Implementation Footnote:
i) In case of MANDATORY FINs, Enterprise Services will attempt to
contact all affected customers to recommend implementation of
the FIN.
ii) For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the
need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/
* From there, select the appropriate link to browse the FIN or FCO index.
Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist. Edist can be
accessed internally at the following URL: http://edist.corp/.
* From there, follow the hyperlink path of "Enterprise Services
Documentation" and click on "FIN & FCO attachments", then choose the
appropriate folder, FIN or FCO. This will display supporting
directories/files for FINs or FCOs.
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
---------------------------------------------------------------------------
Copyright (c) 1997-2003 Sun Microsystems, Inc.