Document fins/I0505-2


FIN #: I0505-2

SYNOPSIS: UPDATED FIN. A3x00/A1000 RAID 0 LUN recovery requires stopping I/O to
          the LUN.

DATE: Aug/09/99



---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                        FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)



SYNOPSIS:   UPDATED FIN. A3x00/A1000 RAID 0 LUN recovery requires 
            stopping I/O to the LUN.


TOP FIN/FCO REPORT:  Yes 
 
PRODUCT_REFERENCE:   RM 6.1.1 RAID LUN recovery
                                                              
PRODUCT CATEGORY:    Storage / SW Admin ; Software / Unbundled;

PRODUCTS AFFECTED:                  
  
Mkt_ID   Platform   Model   Description              Serial Number
------   --------   -----   -----------              -------------
Systems Affected
----------------
  -       A14        ALL    Ultra 2                         -
  -       SS1000(E)  ALL    SPARCserver 1000(E)             -
  -       SC2000(E)  ALL    SPARCcenter 2000(E)             -
  -       E3000      ALL    Ultra Enterprise 3000           -
  -       E3500      ALL    Ultra Enterprise 3500           -
  -       E4000      ALL    Ultra Enterprise 4000           -
  -       E4500      ALL    Ultra Enterprise 4500           -
  -       E5000      ALL    Ultra Enterprise 5000           -
  -       E5500      ALL    Ultra Enterprise 5500           -
  -       E6000      ALL    Ultra Enterprise 6000           -
  -       E6500      ALL    Ultra Enterprise 6500           -
  -       E10000     ALL    Ultra Enterprise 10000          -

X-Options Affected
------------------

SG-ARY131A-16GR5   -  -   16GB STOREDGE A1000 RACK      -       
SG-ARY133A-36GR5   -  -   36GB STOREDGE A1000 RACK      -      
SG-ARY135A-72GR5   -  -   72GB STOREDGE A1000 RACK      -  
SG-XARY122A-16G    -  -   16GB STOREDGE A1000           -   
SG-XARY122A-50G    -  -   50GB STOREDGE A1000           -   
SG-XARY124A-109G   -  -   109GB STOREDGE A1000          -   
SG-XARY124A-36G    -  -   36GB STOREDGE A1000           -  
SG-XARY126A-144G   -  -   144GB STOREDGE A1000          -   
SG-XARY126A-72G    -  -   72GB STOREDGE A1000           -   
SG-XARY131A-16G    -  -   16GB STOREDGE A1000 FOR RACK  -   
SG-XARY133A-36G    -  -   36GB STOREDGE A1000 FOR RACK  -  
SG-XARY135A-72G    -  -   72GB STOREDGE A1000 FOR RACK  -               
             
                    
PART NUMBERS AFFECTED: 

Part Number   Description                Model
-----------   -----------                -----
825-3869-02   SUN RSM ARRAY 2000 Manual Set
798-0188-01   RAID Manager6.1
798-0522-01   RAID Manager6.1.1
798-0522-02   RAID Manager6.1.1 Update 1   - 
798-0522-03   RAID Manager6.1.1 Update 2   - 


REFERENCES:

BugId:   4211786
ESC:     520030, 520118
MANUAL:  805-4057-11 Sun StorEdge RAID Manager 6.1.1 User's Guide
MANUAL:  805-3656-11 Sun StorEdge RAID Manager 6.1.1 Release Notes  
MANUAL:  805-4058-11 Sun StorEdge RAID Manager 6.1.1 Installation
         and Support Guide for Solaris
 
        
PROBLEM DESCRIPTION:

From FIN# I0505-1:

When a RAID 0 LUN with a mounted file system experiences a failure
that results in a "dead" LUN status, the recovery procedure needs to be
augmented.  If the Recovery Guru is unable to unmount the file system,
the recovery will not complete, even though a message indicates that
the recovery can proceed with mounted file systems.
Note that RAID 0 LUNs behave like non-RAID devices.


IMPLEMENTATION:

         ---
        |   |   MANDATORY (Fully proactive)
         ---    
         
  
         ---
        |   |   CONTROLLED PROACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        | X |   REACTIVE (As Required)
         ---


        
CORRECTIVE ACTION: 

The processes doing I/O to the dead LUN need to be stopped and Recovery
Guru re-run so that the unmount will complete and the recovery can proceed.
These techniques work for any RAID level, not just RAID 0.
Use the fuser(1M) command to identify the processes doing I/O to
the LUN, e.g. fuser -c <mounted_file_system> or fuser <ctd_name>.
The mounted file system name is shown in column 1 of the df(1) output
for the ctd_name given by Recovery Guru, e.g.

If Recovery Guru shows /dev/dsk/c4t7d0s0, then
  % df /dev/dsk/c4t7d0s0
  /mnt/myfilesys    (/dev/dsk/c4t7d0s0 ):1434610 blocks 362108 files

shows the mounted filesystem name as /mnt/myfilesys.
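
The lookup above can be scripted.  The following is a minimal sketch,
not a supported tool: the function names mount_point_from_df and
list_lun_users are hypothetical, and the parsing assumes the df output
layout shown in the example above.

```shell
# Extract the mount point (column 1) from one line of Solaris df output,
# assuming the "<mount point> (<device>): ..." layout shown above.
mount_point_from_df() {
    printf '%s\n' "$1" | awk '{ print $1 }'
}

# Hypothetical helper: list the processes using the file system that is
# mounted from the ctd_name reported by Recovery Guru.
list_lun_users() {
    dev=$1
    mp=$(mount_point_from_df "$(df "$dev")")
    [ -n "$mp" ] || { echo "no mount point found for $dev" >&2; return 1; }
    fuser -c "$mp"
}
```

For the example above, list_lun_users /dev/dsk/c4t7d0s0 would run
fuser -c /mnt/myfilesys.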

Since the LUN is dead, the I/O these processes are doing can never
complete successfully, so the processes will end up in an error
condition anyway.  Killing them is therefore reasonable.  One way to stop
the processes is to use the fuser(1M) command with the -k option, as
described in the man page.  Processes that are blocked in the kernel may
not respond to the kill.
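
As an illustration of the kill step, the sketch below sends the kill
and then checks whether any processes remain.  The function name
stop_lun_users and the FUSER variable are assumptions of this sketch
(FUSER exists only so the command can be substituted); it also assumes
that fuser writes the process IDs to standard output.

```shell
# FUSER is a hypothetical override point; it defaults to the real command.
FUSER=${FUSER:-fuser}

# Kill the processes using the mounted file system, then report any that
# survive (for example, processes blocked in the kernel).
stop_lun_users() {
    mp=$1
    "$FUSER" -c -k "$mp"        # send SIGKILL to every process using $mp
    sleep 1                     # give the signal time to be delivered
    # Unquoted expansion collapses the whitespace between PIDs.
    remaining=$(echo $("$FUSER" -c "$mp" 2>/dev/null))
    if [ -n "$remaining" ]; then
        echo "still in use: $remaining"
        return 1
    fi
    echo "no processes remain on $mp"
}
```

A non-zero return means some processes did not die, in which case the
alternatives described below apply.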

*Addition for FIN# I0505-2:

Alternatively, when the Recovery Guru indicates that a file system is
still mounted, the lockfs command can be used to allow the file system
to be unmounted.  As root, issue the command lockfs -h /<mounted filesystem>
followed by umount /<mounted filesystem>.  This should then allow you
to complete the recovery procedure.
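
The two commands can be wrapped so that umount is attempted only if the
hard lock succeeds.  This is a sketch only: force_unmount, LOCKFS and
UMOUNT are hypothetical names introduced here, and the script must be
run as root.

```shell
# LOCKFS and UMOUNT are hypothetical override points for the real commands.
LOCKFS=${LOCKFS:-lockfs}
UMOUNT=${UMOUNT:-umount}

# Hard-lock the file system, then unmount it.  Stops at the first failure.
force_unmount() {
    mp=$1
    "$LOCKFS" -h "$mp" || { echo "lockfs -h failed on $mp" >&2; return 1; }
    "$UMOUNT" "$mp"    || { echo "umount failed on $mp" >&2; return 1; }
    echo "unmounted $mp"
}
```

For the earlier example this would be invoked as
force_unmount /mnt/myfilesys, after which Recovery Guru can be re-run.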

If the above actions do not work, a reboot can be used to
kill all the processes and unmount the file system.

In the similar situation where the LUN definition has been removed while
the file system associated with it is still mounted, the Recovery Guru
will recreate the LUN, but the underlying device will not be shown
by format.  Try using the dr_hotadd.sh script to get the device
recognized.  If this is not sufficient, the alternative is to boot -r.


COMMENTS:
                
    
--------------------------------------------------------------------------
Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist.  Edist can be 
  accessed internally at the following URL: http://edist.corp/.
  
* From there, follow the hyperlink path of "Enterprise Services
  Documentation" and click on "FIN & FCO attachments", then choose the
  appropriate folder, FIN or FCO.  This will display supporting
  directories/files for FINs or FCOs.
   
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.com
---------------------------------------------------------------------------
                                                                



Copyright (c) 1997-2003 Sun Microsystems, Inc.