Document fins/I0698-1


FIN #: I0698-1

SYNOPSIS: Problem on RM6.22 patches when running 'healthck' or 'drivutil'

DATE: Jul/12/01

KEYWORDS: Problem on RM6.22 patches when running 'healthck' or 'drivutil'


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)



SYNOPSIS: Installing RAID Manager 6.22 patches 108834 and 108553 
          may generate false or misleading warning messages "UNRESPONSIVE 
          DRIVES" when running the 'healthck' or 'drivutil' commands and 
          report excessive false 9501 messages.


Sun Alert         : Yes

TOP FIN/FCO REPORT: No 
 
PRODUCT_REFERENCE:  RAID Manager 6.22 patch IDs 108834 and 108553
 
PRODUCT CATEGORY:   Storage / Service 


PRODUCTS AFFECTED:  
 
Mkt_ID   Platform   Model   Description                    Serial Number
------   --------   -----   -----------                    -------------
Systems Affected
----------------
  -      ANYSYS       -        System Platform Independent        -
  
X-Options Affected
------------------
  -              A1000   -   A1000 StorEdge Array                 -
  -              A3000   -   A3000 StorEdge Array                 -
  -              A3500   -   A3500 StorEdge Array                 -
SG-XARY122A-16G      -   -   16GB STOREDGE A1000                  -
SG-XARY122A-50G      -   -   50GB STOREDGE A1000                  -
SG-XARY124A-109G     -   -   109GB STOREDGE A1000                 -
SG-XARY124A-36G      -   -   36GB STOREDGE A1000                  -
SG-XARY126A-144G     -   -   144GB STOREDGE A1000                 -
SG-XARY126A-72G      -   -   72GB STOREDGE A1000                  -
SG-XARY131A-16G      -   -   16GB STOREDGE A1000 FOR RACK         -
SG-XARY133A-36G      -   -   36GB STOREDGE A1000 FOR RACK         -
SG-XARY135A-72G      -   -   72GB STOREDGE A1000 FOR RACK         -
SG-XARY351A-180G     -   -   A3500 1 CONT MOD/5 TRAYS/18GB        -
SG-XARY353A-1008G    -   -   A3500 2 CONT/7 TRAYS/18GB            -
SG-XARY353A-360G     -   -   A3500 2 CONT/7 TRAYS/18GB            -
SG-XARY355A-2160G    -   -   A3500 3 CONT/15 TRAYS/18GB           -
SG-XARY360A-545G     -   -   545-GB A3500 (1X5X9-GB)              -
SG-XARY360A-90G      -   -   A3500 1 CONT/5 TRAYS/9GB(10K)        -
SG-XARY362A-180G     -   -   A3500 2 CONT/7 TRAYS/9GB(10K)        -
SG-XARY362A-763G     -   -   A3500 2 CONT/7 TRAYS/9GB(10K)        -
SG-XARY364A-1635G    -   -   A3500 3 CONT/15 TRAYS/9GB(10K)       -
SG-XARY366A-72G      -   -   A3500 1 CONT/2 TRAYS/9GB(10K)        -
SG-XARY380A-1092G    -   -   1092-GB A3500 (1x5x18-GB)            -
UG-A3500FC-182-10K   -   -   FCTY,A3500FC/SCSI,1X5X18MIN,18/10K   -
CU-A3500FC-182-10K   -   -   FCTY,A3500FC/SCSI,1X5X18MIN,18/10K   -
UG-A3500FC-364-10K   -   -   FCTY,A3500FC/SCSI,2X7X18MIN,18/10K   -
CU-A3500FC-364-10K   -   -   FCTY,A3500FC/SCSI,2X7X18MIN,18/10K   -
UG-A3500FC-546-10K   -   -   FCTY,A3500FC/SCSI,3X15X18MIN 18G10K  -
CU-A3500FC-546-10K   -   -   FCTY,A3500FC/SCSI,3X15X18MIN 18G10K  -
UG-A3K-A3500FC       -   -   ASSY,UPGRADE,A3500FC/TABASCO         -
UG-A3500-A3500FC     -   -   ASSY,UPGRADE,A3500FC/DILBERT         -
X6538A               -   -   X-OPT,A3500FC CONTROLLER             -
6538A                -   -   FCTY, CONTROLLER, A3500FC            -
X2611A               -   -   OPT INT I/O BD FOR EXX00             -
X2612A               -   -   OPT INT I/O BD EXX00 W/FC-AL         -
X2622A               -   -   OPT INT GRAPHICS I/O BD EXX00        -


PART NUMBERS AFFECTED: 

Part Number   Description                         Model
-----------   -----------                         -----
704-6708-10   CD SUN STOREDGE RAID Manager 6.22     -


REFERENCES:

BugId:     4453774 - RM6.22 healthck is continually identifying 
                     unresponsive drives, randomly.
           4468699 - rmlogs show excessive 9501 RAID events occurring 
                     during normal running.
        
Sun Alert: SA-27541

ESC:       530337

      
PROBLEM DESCRIPTION: 
 
RAID Manager (RM) 6.22 on StorEdge A3X00, A3500FC, or A1000 arrays may
generate false, misleading, and/or excessive error messages such as
"UNRESPONSIVE DRIVE - In LUN" or "9501 ASC/ASCQ" RAID errors in the
"rmlog" files on both nodes (if clustered) after installing patches
108834 or 108553.  These errors do not cause any impact or degradation
to customer applications.

After upgrading to RM 6.22 and installing patches 108834 or
108553, the following error message may occur while running
'healthck' or 'drivutil -i':
 
        UNRESPONSIVE DRIVE - In LUN                        

In addition, excessive "9501 ASC/ASCQ" RAID events may be seen in the 
"rmlog" files on both nodes (if clustered), for example: 

        RM 6 Error code:

        ASC     ASCQ    Sense Key
        95      01      4                        
 
Module profiles will indicate all LUNs are optimal and the arrays are
working fine.  This implies that the 'healthck' and 'drivutil' commands
are reporting false information.  The "UNRESPONSIVE DRIVES" message may
point to random drives in different arrays at various time intervals. 

The "UNRESPONSIVE DRIVES" message is usually triggered by an I/O
inquiry such as the one sent by 'drivutil' or 'healthck'.  The problem
occurs as a result of a controller driver timing error.  If there is
I/O occurring but no I/O inquiry is done, then this message won't
appear.

The permanent fix for these spurious messages is in the controller
firmware and will be part of the RAID Manager 6.22.1 release. 


IMPLEMENTATION:  
 
         ---
        |   |   MANDATORY (Fully Pro-Active)
         ---    
         
  
         ---
        |   |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        | X |   REACTIVE (As Required)
         ---
         

CORRECTIVE ACTION: 

An Authorized Enterprise Field Service Representative may avoid the
above-mentioned problems by following the recommendations below.

   1. Ignore the random messages.

   2. Upgrade to RAID Manager 6.22.1 when it is released.  The
      approximate release date is October 15, 2001.

   3. Do not install RM 6.22 patches 108834 or 108553.
      They have been "badpatched" on Sunsolve.
   
   4. If 108834 or 108553 is already installed, there is no need to
      back it out.  Other than the false error messages, there is no
      harm done to the system.  Reverting to an earlier version of
      these patches (-07) would also require a downrev of the controller
      firmware.
    
   5. Only if the array is having problems other than these spurious
      messages, remove the -09 version of the patch and install the
      respective -07 patch.  This will require downreving the RDAC
      firmware on the controllers.  The appware needs to be downreved
      from 03.01.03.63.apd to 03.01.03.60.apd, but the bootware file
      is the same in both the -09 and -07 patches.

      In the case of an A3X00, this can be done by moving all the LUNs
      over to one controller and using the GUI to download the *.60
      appware file onto the other controller.  Then move the LUNs to
      that controller and download the *.60 appware file to the
      remaining controller.  Finally, rebalance the LUNs to their
      original controllers.

      In the case of an A1000, you will need to do the firmware
      download via the CLI using the fwutil command.  Again, only the
      appware file needs to be changed.   
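
The recommendations above can be read as a small decision table.  The
Python sketch below is illustrative only and is not part of RAID
Manager: the function name, its parameters, and the wording of the
returned steps are assumptions made for this example.  It covers only
the two array types for which this notice describes a download method
(A3X00 and A1000).

```python
def recommended_action(bad_patch_installed, other_problems, array_type="A3X00"):
    """Return the steps this notice recommends for one array.

    bad_patch_installed: True if patch 108834 or 108553 (-09) is installed.
    other_problems: True if the array shows problems beyond the
                    spurious UNRESPONSIVE DRIVE / 9501 messages.
    array_type: "A3X00" or "A1000" (hypothetical labels for this sketch).
    """
    upgrade = "Upgrade to RAID Manager 6.22.1 when it is released."

    if not bad_patch_installed:
        # Item 3: the patches have been "badpatched"; do not install them.
        return ["Do not install patches 108834 or 108553.", upgrade]

    if not other_problems:
        # Items 1 and 4: the messages are harmless, no back-out is needed.
        return ["Ignore the random messages; do not back out the patch.", upgrade]

    # Item 5: back out to -07 and downrev the appware only.
    steps = [
        "Remove the -09 patch and install the respective -07 patch.",
        "Downrev appware from 03.01.03.63.apd to 03.01.03.60.apd "
        "(bootware is unchanged).",
    ]
    if array_type == "A3X00":
        steps.append("Download appware via the GUI, one controller at a time, "
                     "moving all LUNs to the other controller first.")
        steps.append("Rebalance the LUNs to their original controllers.")
    elif array_type == "A1000":
        steps.append("Download appware via the CLI using fwutil.")
    else:
        raise ValueError("no download method described for " + array_type)
    return steps


for step in recommended_action(True, True, "A1000"):
    print(step)
```

The back-out path is reached only when both conditions are true, which
mirrors the "only then" wording of item 5.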
   
NOTE: If 'healthck' reports different drives as UNRESPONSIVE from one
      run to the next, all is OK.  However, if the same drive is
      continually singled out, it may in fact be bad or failing.
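
One way to apply this check is to note which drives each 'healthck' run
flags and then count how often each drive recurs.  The Python sketch
below assumes the flagged drives have already been collected by hand
into one set per run; the drive labels, the function name, and the
threshold of three runs are hypothetical, and no particular 'healthck'
output format is assumed.

```python
from collections import Counter


def repeatedly_flagged(runs, min_hits=3):
    """Return drives reported UNRESPONSIVE in at least min_hits runs.

    runs: iterable of collections, one per 'healthck' run, each holding
          the labels of the drives flagged in that run.
    """
    counts = Counter()
    for flagged in runs:
        # Count each drive at most once per run.
        counts.update(set(flagged))
    return sorted(drive for drive, hits in counts.items() if hits >= min_hits)


# Hypothetical observations from four runs.
runs = [
    {"drive-A", "drive-B"},
    {"drive-A"},
    {"drive-C", "drive-A"},
    {"drive-D"},
]
print(repeatedly_flagged(runs))  # prints ['drive-A']
```

Drives that appear only once or twice across the runs match the random
pattern described in this notice; a drive that keeps recurring is the
one worth examining as possibly bad or failing.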


COMMENTS:

None  

----------------------------------------------------------------------------

Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist.  Edist can be 
  accessed internally at the following URL: http://edist.corp/.
  
* From there, follow the hyperlink path of "Enterprise Services
  Documentation" and click on "FIN & FCO attachments", then choose the
  appropriate folder, FIN or FCO.  This will display supporting
  directories/files for FINs or FCOs.
   
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
---------------------------------------------------------------------------




Copyright (c) 1997-2003 Sun Microsystems, Inc.