Document fins/I0648-1
FIN #: I0648-1
SYNOPSIS: Disks in a StorEdge A3X00
DATE: Feb/16/01
KEYWORDS: Disks in a StorEdge A3X00
---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by SunService)
SYNOPSIS: Disks in a StorEdge A3X00 Array may go offline, making
the devices unavailable.
TOP FIN/FCO REPORT: Yes
PRODUCT_REFERENCE: StorEdge A3x00 Array
PRODUCT CATEGORY: Storage / SW Admin
PRODUCTS AFFECTED:
Mkt_ID              Platform  Model  Description                          Serial Number
------              --------  -----  -----------                          -------------
Systems Affected
----------------
-                   ANYSYS    -      System Platform Independent          -
X-Options Affected
------------------
SG-XARY351A-180G    -         -      A3500 1 CONT MOD/5 TRAYS/18GB        -
SG-XARY353A-1008G   -         -      A3500 2 CONT/7 TRAYS/18GB            -
SG-XARY353A-360G    -         -      A3500 2 CONT/7 TRAYS/18GB            -
SG-XARY355A-2160G   -         -      A3500 3 CONT/15 TRAYS/18GB           -
SG-XARY360A-545G    -         -      545-GB A3500 (1X5X9-GB)              -
SG-XARY360A-90G     -         -      A3500 1 CONT/5 TRAYS/9GB(10K)        -
SG-XARY362A-180G    -         -      A3500 2 CONT/7 TRAYS/9GB(10K)        -
SG-XARY362A-763G    -         -      A3500 2 CONT/7 TRAYS/9GB(10K)        -
SG-XARY364A-1635G   -         -      A3500 3 CONT/15 TRAYS/9GB(10K)       -
SG-XARY366A-72G     -         -      A3500 1 CONT/2 TRAYS/9GB(10K)        -
SG-XARY380A-1092G   -         -      1092-GB A3500 (1x5x18-GB)            -
SG-XARY360B-90G     -         -      ASSY,TOP OPT,1X5X9,MIN,9GB,10K       -
SG-XARY360B-545G    -         -      ASSY,TOP OPT,1X5X9,MAX,9GB,10K       -
SG-XARY362B-180G    -         -      X-OPT,2X7X9,MIN,FCAL,9G10K           -
SG-XARY374B-273G    -         -      ASSY,TOP OPT,3X15X9,MIN,9GB,10K      -
SG-XARY380B-182G    -         -      X-OPT,FC-SN,1X5X18MIN,18GB10K        -
SG-XARY380B-1092G   -         -      ASSY,FC-SNL,1X5X18MAX,18G10K         -
SG-XARY382B-364G    -         -      ASSY,FC-SN,2X7X18,MIN,18GB,10K       -
SG-XARY384B-546G    -         -      ASSY,FC,3X15X18,MIN,18GB             -
SG-XARY381B-364G    -         -      ASSY,FC-SN,1X5X36MIN,36G10K          -
SG-XARY381B-1456G   -         -      ASSY,FC-SN,1X5X36MAX,36B10K          -
SG-XARY383B-728G    -         -      ASSY,FC-SN,2X7X36MIN,36B10K          -
SG-XARY385B-1092G   -         -      ASSY,FC-SN,3X15X36MIN,36B10K         -
UG-A3500-FC-545G    -         -      ASSY,TOP OPT,1X5X9,MAX,9GB,10K       -
CU-A3500-FC-545G    -         -      ASSY,TOP OPT,1X5X9,MAX,9GB,10K       -
UG-A3500FC-182-10K  -         -      FCTY,A3500FC/SCSI,1X5X18MIN,18/10K   -
CU-A3500FC-182-10K  -         -      FCTY,A3500FC/SCSI,1X5X18MIN,18/10K   -
UG-A3500FC-364-10K  -         -      FCTY,A3500FC/SCSI,2X7X18MIN,18/10K   -
CU-A3500FC-364-10K  -         -      FCTY,A3500FC/SCSI,2X7X18MIN,18/10K   -
UG-A3500FC-546-10K  -         -      FCTY,A3500FC/SCSI,3X15X18MIN 18G10K  -
CU-A3500FC-546-10K  -         -      FCTY,A3500FC/SCSI,3X15X18MIN 18G10K  -
UG-A3K-A3500FC      -         -      ASSY,UPGRADE,A3500FC/TABASCO         -
UG-A3500-A3500FC    -         -      ASSY,UPGRADE,A3500FC/DILBERT         -
X6538A              -         -      X-OPT,A3500FC CONTROLLER             -
6538A               -         -      FCTY, CONTROLLER, A3500FC            -
PART NUMBERS AFFECTED:
Part Number Description Model
----------- ----------- -----
- - -
REFERENCES:
BugId:   4236399 - A3500 LUNs go offline without warning
PatchId: 103622 - Solaris 2.5.1: /kernel/drv/sd driver patch
         105356 - Solaris 2.6: /kernel/drv/ssd and /kernel/drv/sd patch
         107458 - Solaris 7: dad, sd, ssd, uata drivers patch
ESC: 520400
PROBLEM DESCRIPTION:
On an A3x00 disk subsystem, one or more disks (LUNs) may go offline
without warning during operation, which can cause applications to fail
or lose data. Some LUNs come back online on their own; however, the one
with the most hits will go offline periodically until the system is
rebooted. A reboot clears the condition only temporarily; the problem
will reappear. Replacing disks does not solve the problem.
Messages of the following type will be reported in /var/adm/messages:
....
Apr 5 03:00:58 <system> unix: WARNING:
/sbus@3,0/QLGC,isp@0,10000/sd@5,0 (sd20):
Apr 5 03:00:58 <system> unix: offline
....
Messages like those above will be the only ones logged for this issue.
Other types of problems could have "offline" messages, but, in those
cases, there will be additional error messages in /var/adm/messages at
the same time.
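The distinction above (an "offline" message with no accompanying errors)
can be checked mechanically. The following is a minimal sketch only, not a
supported tool: it assumes the standard syslog line layout shown in the
excerpt, groups lines by timestamp, and treats any text other than the
WARNING header and the bare "offline" line as an additional error. The
function name find_isolated_offline is hypothetical.

```python
import re
from collections import defaultdict

# Syslog layout assumed from the excerpt: "Mon DD HH:MM:SS host tag: text"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+(\S+?):\s*(.*)$")


def find_isolated_offline(lines):
    """Return timestamps at which 'offline' was logged with no other error text.

    Heuristic: messages are grouped by timestamp string. A timestamp is
    reported when it has a bare 'offline' line and every other line at that
    timestamp is only a 'WARNING:' header. Lines that do not match the
    syslog layout (for example wrapped continuations) are ignored.
    """
    by_stamp = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m:
            stamp, _host, _tag, text = m.groups()
            by_stamp[stamp].append(text.strip())

    isolated = []
    for stamp, texts in by_stamp.items():
        has_offline = any(t == "offline" for t in texts)
        extras = [t for t in texts
                  if t != "offline" and not t.startswith("WARNING:")]
        if has_offline and not extras:
            isolated.append(stamp)
    return isolated


if __name__ == "__main__":
    # Path taken from this FIN; run on the affected host.
    with open("/var/adm/messages") as fh:
        for stamp in find_isolated_offline(fh):
            print("offline with no accompanying errors at", stamp)
```

Timestamps reported by this sketch are candidates for this issue only;
confirm against the full log before concluding that the patches apply.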
This problem has been seen on systems with A3500/A3000 storage systems
attached, and is exacerbated by the Raid Manager 6 (RM6) utility 'parityck',
which runs periodically. 'parityck' performs repeated open/ioctl/close
operations on logical units, which creates the contention needed for this
problem to occur.
Systems are more vulnerable to this problem when multiple processes or
threads attempt to open the same disk simultaneously. This contention is
necessary for the problem to be seen (the problem occurs when a thread
accessing the disk cannot acquire a shared resource).
NOTE: This is a generic problem which could potentially be seen on other
storage devices besides the A3500 or A3000 storage systems.
The problem has been fixed by a change to the Solaris sd and ssd drivers.
Patches for Solaris 2.5.1, 2.6, and 7 are available. The fix has been
integrated into Solaris 8 and no patch is necessary.
For problem relief without applying the patches, disable the RM6 'parityck'
utility, or schedule it to run at times of system inactivity.
IMPLEMENTATION:
---
| | MANDATORY (Fully Pro-Active)
---
---
| | CONTROLLED PRO-ACTIVE (per Sun Geo Plan)
---
---
| X | REACTIVE (As Required)
---
CORRECTIVE ACTION:
An Authorized Enterprise Field Service Representative may avoid the
above-mentioned problems by following the recommendations shown below.
Install the latest versions of these Solaris sd and ssd driver patches:
    Solaris 2.5.1    103622 (sd); 104708 (ssd)
    Solaris 2.6      105356 (sd and ssd)
    Solaris 7        107458 (sd and ssd)
    Solaris 8        No patch needed
Because these patches update the Solaris kernel, a system reboot
is necessary after installing the patches.
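To see whether a host already carries the patches listed above, the output
of 'showrev -p' can be compared against the patch table. The following is
a minimal sketch under stated assumptions: it expects 'showrev -p' lines
of the form "Patch: NNNNNN-RR ...", keys the table by the 'uname -r'
release string (5.5.1, 5.6, 5.7, 5.8), and checks only that each base patch
ID is present, since this FIN names no minimum revision. The names
REQUIRED, installed_patches and missing_patches are hypothetical.

```python
import re

# Patch IDs from this FIN, keyed by `uname -r` release string.
# Solaris 2.5.1 -> 5.5.1, 2.6 -> 5.6, 7 -> 5.7, 8 -> 5.8 (fix integrated).
REQUIRED = {
    "5.5.1": ["103622", "104708"],
    "5.6": ["105356"],
    "5.7": ["107458"],
    "5.8": [],
}

# Assumed `showrev -p` line layout: "Patch: 105356-16 Obsoletes: ..."
PATCH_RE = re.compile(r"^Patch:\s+(\d{6})-(\d{2})\b")


def installed_patches(showrev_output):
    """Map patch ID -> highest installed revision found in the text."""
    found = {}
    for line in showrev_output.splitlines():
        m = PATCH_RE.match(line)
        if m:
            pid, rev = m.group(1), int(m.group(2))
            found[pid] = max(rev, found.get(pid, -1))
    return found


def missing_patches(os_release, showrev_output):
    """Return the patch IDs from this FIN that are absent for the release."""
    found = installed_patches(showrev_output)
    return [pid for pid in REQUIRED[os_release] if pid not in found]
```

For example, passing the saved output of 'showrev -p' from a Solaris 2.6
host as the second argument with "5.6" as the first returns an empty list
when some revision of 105356 is installed. The sketch does not judge
whether the installed revision is the latest one.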
COMMENTS:
-----------------------------------------------------------------------------
Implementation Footnote:
i) In case of MANDATORY FINs, Enterprise Services will attempt to
contact all affected customers to recommend implementation of
the FIN.
ii) For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the
need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/
* From there, select the appropriate link to browse the FIN or FCO index.
Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist. Edist can be
accessed internally at the following URL: http://edist.corp/.
* From there, follow the hyperlink path of "Enterprise Services
Documentation" and click on "FIN & FCO attachments", then choose the
appropriate folder, FIN or FCO. This will display supporting
directories/files for FINs or FCOs.
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
---------------------------------------------------------------------------
Copyright (c) 1997-2003 Sun Microsystems, Inc.