SRDB ID   Synopsis   Date
17003   Veritas Volume Manager - How to replace a disk in an A5x00 array (Photon)   13 Jun 2002

Status Issued

Description

When replacing a failed disk under Sun Enterprise Volume Manager[TM] 2.6 (SEVM) or Veritas Volume Manager (VxVM) 3.x control in an A5x00 array (Photon), you cannot simply pull out the disk and insert a new one, as you can with other types of drives. Because each disk has a unique world-wide number (WWN) that forms part of its device path, the following procedure must be followed to correctly remove and replace any disk in an A5x00 (Sun Enterprise Network Array (SENA)).

If this is an A5200, the drive is in slot Front 10 or Rear 10 (f10, r10), the array is in single-loop mode, and Loop A is the loop in use, please see FIN I0741-1, "Replacement of a Disk on StorEdge A5200 may disconnect the Array", or SunAlert 40765, "Replacement of a Disk on StorEdge A5200 May Disconnect the Array", before proceeding.
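
The four conditions above must all hold before the FIN and SunAlert apply. A minimal sketch of that check follows; fin_i0741_applies is a hypothetical helper name, and its arguments (model, slot, loop mode, active loop) are values the operator supplies by hand, not something luxadm reports in this form.

```shell
# fin_i0741_applies MODEL SLOT LOOP_MODE ACTIVE_LOOP
# Hypothetical helper: returns 0 only when every condition named in the
# paragraph above is met, otherwise returns 1.
fin_i0741_applies() {
    [ "$1" = "A5200" ] || return 1              # only the A5200 is affected
    case "$2" in
        f10|r10) ;;                             # slot Front 10 or Rear 10
        *) return 1 ;;
    esac
    [ "$3" = "single" ] || return 1             # single-loop mode
    [ "$4" = "A" ] || return 1                  # Loop A is the loop in use
    return 0
}

if fin_i0741_applies A5200 f10 single A; then
    echo "Read FIN I0741-1 / SunAlert 40765 before replacing this disk"
fi
```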

If you have difficulty with any of the procedures below, apply the latest luxadm patch for your operating system release. The luxadm patches are as follows:

        Solaris 2.5.1 - 105310-xx
        Solaris 2.6 - 105375-xx
        Solaris 7 - 107473-xx
        Solaris 8 - 109529-xx                        
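
As an illustration, the table above can be turned into a small lookup keyed on the output of uname -r (5.5.1, 5.6, 5.7 and 5.8 for Solaris 2.5.1, 2.6, 7 and 8). The function name luxadm_patch_for_release is hypothetical; the showrev usage in the comment is a suggestion for checking the installed revision, not part of the original procedure.

```shell
# Map a `uname -r` release string to the luxadm patch base ID listed above.
# luxadm_patch_for_release is a hypothetical helper name.
luxadm_patch_for_release() {
    case "$1" in
        5.5.1) echo 105310 ;;   # Solaris 2.5.1
        5.6)   echo 105375 ;;   # Solaris 2.6
        5.7)   echo 107473 ;;   # Solaris 7
        5.8)   echo 109529 ;;   # Solaris 8
        *)     echo "no luxadm patch listed for release $1" >&2
               return 1 ;;
    esac
}

# Possible use on the affected host (assumes showrev is available):
#   rel=`uname -r`
#   patch=`luxadm_patch_for_release "$rel"` && showrev -p | grep "Patch: $patch"
```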

If the correct procedure below is not followed, Volume Manager may display this message when attempting to get the new disk online:

device cxtxxdxsx online failed : device path not valid

SOLUTION SUMMARY:
The following procedure illustrates the correct sequence of commands needed to
replace a disk in an A5x00 array if Volume Manager is being used.
The vxdiskadm command will bring up a menu of Volume Manager options.

1) Inform Volume Manager you wish to remove the disk:

       # vxdiskadm -> option 4 (Remove a disk for replacement)

2) Offline the device:

       # vxdiskadm -> option 11 (Disable (offline) a disk device)
          (use the c#t#d# name)
           
3) Remove the device and device nodes:

     # luxadm remove_device enclosurename,[f|r]slot#

          where: enclosurename is the name of the A5x00 as
                 reported by luxadm probe

                 f - front panel
                 r - rear panel

         example:  front panel slot 3 with box name = saturn
        
      # luxadm remove_device saturn,f3

	luxadm remove_device will prompt you to physically remove the device.
	If this is successful, skip to step 5.  If luxadm remove_device gives an
	error, proceed to step 4.
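
The enclosurename,[f|r]slot# argument is easy to mistype. The sketch below assembles and validates it from its three parts; a5k_slot_spec is a hypothetical helper, not part of luxadm, and it does not check that the enclosure or slot actually exists.

```shell
# a5k_slot_spec ENCLOSURE SIDE SLOT
# Hypothetical helper: builds the "enclosurename,[f|r]slot#" argument
# used by luxadm remove_device and luxadm insert_device.
a5k_slot_spec() {
    enclosure=$1 side=$2 slot=$3
    [ -n "$enclosure" ] || { echo "enclosure name required" >&2; return 1; }
    case "$side" in
        f|r) ;;                                  # front or rear panel
        *) echo "side must be f (front) or r (rear)" >&2; return 1 ;;
    esac
    case "$slot" in
        ''|*[!0-9]*) echo "slot must be a number" >&2; return 1 ;;
    esac
    echo "$enclosure,$side$slot"
}

a5k_slot_spec saturn f 3    # prints saturn,f3
```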
   

4) If the removal reports "device busy" and does not continue, there are three
   things to try, in order, to remove the device paths for the array.
	
	1. Force the disk removal with:
    
         # luxadm remove_device -F enclosurename,[f|r]slot#
				
			or, if that doesn't work:
 
	2. On the A5x00 front panel module, manually select the
	disk to be replaced and stop it.  This is done as follows:
	
	Go to the array and bring up the main menu screen. It has 9 icons on it,
	each representing a different function one can perform on the A5x00.
	Before continuing, you need to know which slot the drive occupies and
	whether it is in the front or the back of the array. 'luxadm display'
	should show this information.

	a. Select the appropriate icon for the side of the A5x00 where the disk 
	resides. The two icons will be on the top row of icons, in the middle 
	(front 'tray') and right (rear 'tray')positions. 

	b. The panel will display the bank of disks containing the one you want. 
	It will appear as a small rectangle on what looks like a railroad track; 
	the number of the drive will be in the box. Under the bubble containing 
	status information, you will see a button that says 'Off'. 

	c. Press this button. A new screen will come up that says 'Alert' and 
	contains a large rectangle that asks you if you want to continue. Underneath 
	the 'Alert' box will be a message indicating what the selected operation is. 
	In this case it will say "Spin down drive". 

	d. Press the 'Continue' box. Another screen will appear saying that 
	the drive's status is now offline. 
	
	e. Remove the drive and replace it with the new one. 

	f. At the console, issue the luxadm remove_device command again. You will 
	receive errors concerning the drive that the utility thinks is supposed to 
	be there. Ignore them. 
             
	NOTE:   If this is a multi-initiated array (i.e., if more than one
	system is connected to this disk drive), you must successfully remove the 
	disk from each host.  
	
	3. In extreme cases, it may be necessary to manually remove the device paths
	for the A5x00 if none of the above procedures work.  This procedure 
	is detailed in Infodoc 18168.  Use this workaround only as a last resort.
 

5) Run the following command to create the new device and device nodes.
   When prompted, physically insert the replacement disk.

      # luxadm insert_device enclosurename,[f|r]slot#

   Again, on a multi-initiated array, this command must be performed on
   all systems connected to the array.

6) Inform Volume Manager of the configuration changes with the command

      # vxdctl enable

7) Bring the disk back online in Volume Manager:

      # vxdiskadm -> option 5 (Replace a failed or removed disk)
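
To keep the order of steps 1 through 7 straight at the console, it can help to print the whole sequence for a given disk before starting. The sketch below only prints commands and runs none of them; print_replace_plan is a hypothetical helper, and the enclosure and device names in the example call are illustrative.

```shell
# print_replace_plan ENCLOSURE,SLOT C#T#D#
# Hypothetical helper: prints the command sequence from the procedure
# above for one disk. It does not execute anything.
print_replace_plan() {
    spec=$1 ctd=$2
    cat <<EOF
1) vxdiskadm -> option 4 (Remove a disk for replacement)
2) vxdiskadm -> option 11 (Disable (offline) a disk device), device $ctd
3) luxadm remove_device $spec
4) only if step 3 reports an error: luxadm remove_device -F $spec
5) luxadm insert_device $spec
6) vxdctl enable
7) vxdiskadm -> option 5 (Replace a failed or removed disk)
EOF
}

print_replace_plan saturn,f3 c2t20d0
```

On a multi-initiated array, steps 3 through 5 in this plan would have to be repeated on every connected host.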

Keywords: SEVM, VxVM, Volume Manager, upgrade, upgrading, recovery, recover                                                       
INTERNAL SUMMARY:
This infodoc came from an email on storage.amb from Dava.Carta
and worked well for me. He also included FIN# I0354, which
I excerpted below:



Use these procedures for adding and replacing disks as documented in the
Platform Notes: Using luxadm Software. Below is a summary and example
of the steps for disk replacement in an SEVM environment (non-SEVM
environments would omit steps 2, 5 and 6):

Disk Replacement
----------------

    1) Identify all volumes or applications using the failing disk. If
       the volumes are mirrored or RAID-5 protected, the disk can be
       replaced without taking the volume down. Otherwise, all I/O
       to the disk MUST be stopped using the appropriate commands.

    2) If the disk is under SEVM control, use vxdiskadm to remove the
       disk for replacement and then offline the disk device.

       NOTE: Offline is a step temporarily needed as a workaround for
             bug id 4080975.

            # vxdiskadm
            4   Remove a disk for replacement
            11  Disable (offline) a disk device


    3) Use the luxadm command to remove the disk from the fcal loop. This
       command is interactive and will prompt you to physically
       remove the disk.

            # luxadm remove /dev/rdsk/c2t20d0s2

    4) Use the luxadm command to insert the new disk. 

            # luxadm insert ratbert,r4

    5) Notify SEVM of the new disk.

            # vxdctl enable
 
    6) Use vxdiskadm to bring the new disk into SEVM control.

            # vxdiskadm 
            5   Replace a failed or removed disk

    7) The volume can now be restored if needed.



Disk Installation
-----------------

    1) Use the luxadm command to prepare the loop for a new device.
       Physically install the new disk or disks when prompted.

            # luxadm insert

    2) Notify SEVM of the new disk.

            # vxdctl enable

    3) Use vxdiskadm to bring the new disk/s into SEVM control.

            # vxdiskadm
            1   Add or initialize one or more disks                                                                      
SUBMITTER: Matthew Teeter
PATCH ID: 105310-xx, 105375-xx, 107473-xx, 109529-xx
APPLIES TO: Hardware/Disk Storage Subsystem/StorEdge Disk Array/StorEdge A5000, Storage/Veritas, Storage/Volume Manager, AFO Vertical Team Docs/Storage
ATTACHMENTS:


Copyright (c) 1997-2003 Sun Microsystems, Inc.