Document fins/I0727-1


FIN #: I0727-1

SYNOPSIS: Recovering A1000/A3x00 controller C numbers after a device path
          changed due to reboot -r

DATE: Oct/17/01

KEYWORDS: Recovering A1000/A3x00 controller C numbers after a device path
          changed due to reboot -r


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)




Sun Alert:          No

TOP FIN/FCO REPORT: No 
 
PRODUCT_REFERENCE:  A1000/A3x00 Controllers  
 
PRODUCT CATEGORY:   Storage / Service 


PRODUCTS AFFECTED:  

Systems Affected
----------------
Mkt_ID   Platform   Model      Description                 Serial Number
------   --------   -----      -----------                 -------------
  -      ANYSYS       -        System Platform Independent       -


X-Options Affected
------------------
Mkt_ID            Platform   Model   Description                   Serial Number
------            --------   -----   -----------                   -------------
SG-XARY1*         A1000        -     STOREDGE A1000/RACK                 -
SG-XARY3*         A3500        -     STOREDGE A3500/RACK                 -
UG/CU-A3500FC*    A3500FC      -     ASSY,TOP OPT,1X5X9,MAX,9GB,10K      -
UG-A3K-A3500FC       -         -     ASSY,UPGRADE,A3500FC/TABASCO        -
UG-A3500-A3500FC     -         -     ASSY,UPGRADE,A3500FC/DILBERT        -
X6538A               -         -     X-OPT,A3500FC CONTROLLER            -
6538A                -         -     FCTY, CONTROLLER, A3500FC           -
X2611A               -         -     OPT INT I/O BD FOR EXX00            -
X2612A               -         -     OPT INT I/O BD EXX00 W/FC-AL        -
X2622A               -         -     OPT INT GRAPHICS I/O BD EXX00       -


PART NUMBERS AFFECTED:

Part Number   Description                             Model
-----------   -----------                             -----

704-6708-10   CD SUN STOREDGE RAID Manager6.22          -
704-7937-05   CD RM 6.22.1                              -


REFERENCES:

URL:	http://www.sun.com/storage/disk-drives/raid.html

      
PROBLEM DESCRIPTION:

Solaris device names for StorEdge A1000/A3x00 controllers may change
when a reconfiguration reboot ('reboot -r') is performed.  The
controller "C" numbers can have different values after the reboot.
When the original A1000/A3x00 C numbers are lost, volume managers
such as VxVM are not able to find the LUNs.  The key impact is
potential loss of access to data.

The affected C numbers are those of the array device links in /dev/dsk
and /dev/rdsk.  When controller C numbers change, every place where
those controller numbers are recorded must change as well.  For
example, device entries in /etc/vfstab would need to be updated.
However, there is no record of the changes; i.e., if a device was
c2t5 and there are many controllers on the host, there is no way to
tell which controller c2t5 was.

Some of the failing indications are:
 
   "Can't mount /dev/dsk/c2t5d0s2" when booting the host.
   Error messages from VxVM about not being able to find the volume group.
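
The first symptom can be checked for ahead of the next boot by confirming
that every /dev entry named in /etc/vfstab still exists.  The following is
a minimal sketch only, not part of RAID Manager.  The function name
check_vfstab and its optional file argument are assumptions made for
illustration, and the code assumes a POSIX-style shell such as ksh (on
Solaris, nawk may be needed in place of awk).

```shell
#!/bin/sh
# Sketch: report vfstab entries whose device to mount is missing.
# check_vfstab is a hypothetical helper; the argument defaults to /etc/vfstab.
check_vfstab() {
    vfstab=${1:-/etc/vfstab}
    # Field 1 of each non-comment line is the device to mount.
    # Only entries that start with /dev/ are checked.
    awk '$1 !~ /^#/ && $1 ~ /^\/dev\// { print $1 }' "$vfstab" |
    while read -r dev; do
        # -e follows symbolic links, so a dangling link counts as missing.
        [ -e "$dev" ] || echo "missing: $dev"
    done
}
```

Calling check_vfstab with no argument reads /etc/vfstab and prints one
"missing:" line per entry whose device link is gone or dangling.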

Prior to a boot -r, the output of format and 'lad' (list array
devices) looks something like the following:

     AVAILABLE DISK SELECTIONS:

       0. c0t0d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
       1. c0t1d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /sbus@3,0/SUNW,fas@3,8800000/sd@1,0         
       2. c1t5d0 <SYMBIOS-RSMArray2000-0301 cyl 3 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,0
       3. c1t5d1 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,1
       4. c1t5d2 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,2
       5. c1t5d3 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,3
       6. c1t5d4 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@5,4
       7. c2t4d0 <SYMBIOS-RSMArray2000-0301 cyl 3 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,0
       8. c2t4d1 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,1
       9. c2t4d2 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,2
      10. c2t4d3 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,3
        
    'lad' is a program that will list the names of all RAID devices
    connected to the system on stdout.

    /usr/lib/osa/bin/lad shows:

    c1t5d0 1T74750854 LUNS: 0 1 2 3 4 
    c2t4d0 1T71525434 LUNS: 0 1 2 3
    |____| |________| |_____________|
    Field 1  Field 2      Field 3

    Field 1 is the device name of a particular RAID controller.
    Field 2 is an internal name that uniquely identifies the controller.
    Field 3 is a list of logical units (LUNs) currently owned by
    the controller.

NOTE: The original controller numbers for the rdac modules are 1 and 2.
 
After issuing boot -r, the format and lad output appears as follows:

    AVAILABLE DISK SELECTIONS:

       0. c0t0d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
       1. c0t1d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /sbus@3,0/SUNW,fas@3,8800000/sd@1,0         
       2. c69t5d0 <SYMBIOS-RSMArray2000-0301 cyl 3 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,0
       3. c69t5d1 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,1
       4. c69t5d2 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,2
       5. c69t5d3 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,3
       6. c69t5d4 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@5,4
       7. c71t4d0 <SYMBIOS-RSMArray2000-0301 cyl 3 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,0
       8. c71t4d1 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,1
       9. c71t4d2 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,2
      10. c71t4d3 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,3

    /usr/lib/osa/bin/lad shows:

    c69t5d0 1T74750854 LUNS: 0 1 2 3 4 
    c71t4d0 1T71525434 LUNS: 0 1 2 3

NOTE: The format and lad output are in sync, but the c#'s have changed
      to 69 and 71.
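
Because the internal controller name in the second field of the lad output
is the same before and after the reboot in this example, it can be used to
pair old and new device names.  The following is a minimal sketch under
that assumption.  The function name map_lad and the idea of keeping lad
output in two saved files are assumptions for illustration, not part of
RAID Manager (on Solaris, nawk may be needed in place of awk).

```shell
#!/bin/sh
# Sketch: pair old and new controller device names using the internal
# controller name (field 2 of the lad output).
# map_lad is a hypothetical helper; both arguments are files that hold
# saved lad output, one taken before and one after the reboot.
map_lad() {
    before=$1
    after=$2
    awk '
        # First file: remember the old device name for each internal name.
        NR == FNR { old[$2] = $1; next }
        # Second file: print old and new names when they differ.
        ($2 in old) && old[$2] != $1 { print $2 ": " old[$2] " -> " $1 }
    ' "$before" "$after"
}
```

With the two lad listings shown above saved to files, the sketch prints:

    1T74750854: c1t5d0 -> c69t5d0
    1T71525434: c2t4d0 -> c71t4d0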

Solaris determines the numbering of controllers based largely on the
order in which they were discovered during a reconfiguration boot.
Solaris 8 update 4 seems to handle the ordering slightly differently,
depending on whether Host Bus Adapters are connected to arrays or not.
When controllers are replaced or added dynamically, the new ones are
added after the existing ones and holes are left for the missing ones.

The procedure under "Corrective Action" shows how to return to the
previous controller numbers.  The procedure is not a permanent fix in
that the change of controller numbers could happen again.  The system
administrator should keep a list of device paths, such as the output of
ls -l /dev/*dsk.
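
One way to keep such a list is to save the link listings to a file before
any reconfiguration boot.  The following is a minimal sketch; the function
name save_dev_listing and the choice of output file are assumptions made
for illustration.

```shell
#!/bin/sh
# Sketch: save a listing of the disk device links so that controller
# numbers can be compared after a later reconfiguration boot.
# save_dev_listing is a hypothetical helper; the first argument is the
# output file, and any further arguments are directories to list.
save_dev_listing() {
    out=$1
    shift
    # With no directories given, use the two directories named in this FIN.
    [ $# -gt 0 ] || set -- /dev/dsk /dev/rdsk
    for dir in "$@"; do
        echo "== $dir"
        # ls -l shows each link and the device path it points to.
        ls -l "$dir"
    done > "$out"
}
```

For example, "save_dev_listing /var/tmp/devlinks.before" writes both
listings to one file, which can later be compared with a fresh listing
using diff.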

A permanent fix will be released in the next version of RAID Manager 6.22.


IMPLEMENTATION:  
 
         ---
        |   |   MANDATORY (Fully Pro-Active)
         ---    
         
  
         ---
        |   |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        | X |   REACTIVE (As Required)
         ---
         

CORRECTIVE ACTION: 
        
The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the above 
mentioned problem.

Start with the following situation.  Prior to a reconfiguration boot,
format and lad show something like the following:

   AVAILABLE DISK SELECTIONS:

       0. c0t0d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
       1. c0t1d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /sbus@3,0/SUNW,fas@3,8800000/sd@1,0         
       2. c1t5d0 <SYMBIOS-RSMArray2000-0301 cyl 3 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,0
       3. c1t5d1 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,1
       4. c1t5d2 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,2
       5. c1t5d3 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,3
       6. c1t5d4 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@5,4
       7. c2t4d0 <SYMBIOS-RSMArray2000-0301 cyl 3 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,0
       8. c2t4d1 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,1
       9. c2t4d2 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,2
      10. c2t4d3 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,3
                  
   /usr/lib/osa/bin/lad shows                                    

   c1t5d0 1T74750854 LUNS: 0 1 2 3 4 
   c2t4d0 1T71525434 LUNS: 0 1 2 3

Notice that the original controller numbers for the rdac modules are 1 and 2.
 
After issuing boot -r, the format and lad output appears as follows:

   AVAILABLE DISK SELECTIONS:

       0. c0t0d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
       1. c0t1d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
          /sbus@3,0/SUNW,fas@3,8800000/sd@1,0         
       2. c69t5d0 <SYMBIOS-RSMArray2000-0301 cyl 3 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,0
       3. c69t5d1 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,1
       4. c69t5d2 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,2
       5. c69t5d3 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@1/rdriver@5,3
       6. c69t5d4 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@5,4
       7. c71t4d0 <SYMBIOS-RSMArray2000-0301 cyl 3 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,0
       8. c71t4d1 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,1
       9. c71t4d2 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,2
      10. c71t4d3 <SYMBIOS-RSMArray2000-0205 cyl 48 alt 2 hd 64 sec 64>
          /pseudo/rdnexus@2/rdriver@4,3

   /usr/lib/osa/bin/lad shows:

   c69t5d0 1T74750854 LUNS: 0 1 2 3 4 
   c71t4d0 1T71525434 LUNS: 0 1 2 3

Note that the format and lad output are in sync, but the c#'s have
changed to 69 and 71.

To fix the above problem, remove the rdac logical devices (c#t#d#) as
seen by Solaris and Raid Manager so that the logical device controller
#s can be recreated.

Perform the following procedure to sync up the c#'s in lad and format
with RM6.22x and to return the c#'s to an acceptable value:

   cd /dev/dsk
   rm c#'s for A1000/A3x00 devices
       (In this case "# rm c69*" and "# rm c71*")

   cd /dev/rdsk
   rm c#'s for A1000/A3x00 devices
       (In this case "# rm c69*" and "# rm c71*")

   cd /dev/osa/dev/dsk
   rm c#'s for A1000/A3x00 devices
       (In this case "# rm c69*" and "# rm c71*")

   cd /dev/osa/dev/rdsk
   rm c#'s for A1000/A3x00 devices
       (In this case "# rm c69*" and "# rm c71*")
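
The four removal steps above can be sketched as a single loop.  This is a
minimal sketch, not a supported tool.  The function name remove_rdac_links
and the DO_IT and DEV_ROOT variables are assumptions made for illustration.
By default the sketch only prints the rm commands it would run, so the list
can be reviewed first.

```shell
#!/bin/sh
# Sketch: remove stale rdac device links for the given controller numbers.
# remove_rdac_links is a hypothetical helper.  By default it only prints
# the rm commands; set DO_IT=yes to actually remove the links.
# DEV_ROOT exists only so the function can be tried on a scratch directory.
remove_rdac_links() {
    root=${DEV_ROOT:-/dev}
    for dir in dsk rdsk osa/dev/dsk osa/dev/rdsk; do
        for c in "$@"; do
            # Match c<number>t... so that c6 does not also match c69.
            for link in "$root/$dir/c${c}t"*; do
                # Skip the unexpanded pattern when nothing matches.
                [ -e "$link" ] || [ -h "$link" ] || continue
                if [ "${DO_IT:-no}" = yes ]; then
                    rm "$link"
                else
                    echo "rm $link"
                fi
            done
        done
    done
}
```

For the example in this FIN, "remove_rdac_links 69 71" lists the links
that would be removed; running it again with DO_IT=yes set removes them.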

Run the following rdac_disks command to remove all rdac devices from format.

   /usr/lib/osa/bin/rdac_disks   

Run the following hot_add command to recreate the proper rdac device
controller #s for all of the following: format, lad, /dev/(r)dsk and
/dev/osa/dev/(r)dsk.  This takes effect immediately, with no need to
reboot or boot -r.

   /usr/lib/osa/bin/hot_add

After the hot_add, everything should be as it was before, but the
user of this procedure should verify the configuration.
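
As one possible aid to that verification, the controller numbers that
currently have links can be listed and compared with the expected ones.
The following is a minimal sketch; the function name list_controllers is an
assumption made for illustration, and it does not replace checking format
and lad.

```shell
#!/bin/sh
# Sketch: list the distinct controller numbers that have links in a
# directory.  list_controllers is a hypothetical helper; the argument
# defaults to /dev/dsk.
list_controllers() {
    dir=${1:-/dev/dsk}
    # Link names look like c1t5d0s2; keep only the number after the "c".
    ls "$dir" | sed -n 's/^c\([0-9][0-9]*\)t.*$/\1/p' | sort -n -u
}
```

In the example in this FIN, the list for /dev/dsk should again contain 1
and 2 and should no longer contain 69 or 71.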
 
Note: It is also possible that after a "boot -r", the rdac devices MIGHT
      NOT show up in format at all.  Simply follow the same guidelines as
      above to recreate the rdac devices and sync up Solaris with Raid
      Manager.

While tempting, do not try to run devfsadm to create links in place of
hot_add, because it will create a Solaris physical device path such as
/sbus@3,0/QLGC,isp@3... as opposed to the correct
/pseudo/rdnexus@2,0.... path that is required for the device to be
properly addressed. 
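
Whether a set of links points at an rdnexus pseudo path or at a physical
path can be checked by reading the link targets.  The following is a
minimal sketch; the function name check_rdac_targets is an assumption made
for illustration, and the only pattern tested is the "/pseudo/rdnexus@"
prefix named above.

```shell
#!/bin/sh
# Sketch: flag links for one controller whose target is not an rdnexus
# pseudo path.  check_rdac_targets is a hypothetical helper; the
# arguments are a directory and one controller number.
check_rdac_targets() {
    dir=$1
    c=$2
    for link in "$dir/c${c}t"*; do
        # Only symbolic links are examined.
        [ -h "$link" ] || continue
        # ls -l prints "name -> target"; keep the part after the arrow.
        target=$(ls -l "$link" | sed 's/^.* -> //')
        case $target in
            */pseudo/rdnexus@*) ;;   # expected rdac path, nothing to report
            *) echo "unexpected target: $link -> $target" ;;
        esac
    done
}
```

For example, "check_rdac_targets /dev/dsk 1" prints nothing when every c1
link points at an rdnexus path.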


COMMENTS:  

devfsadm -C will remove links for devices that are no longer present but
this can compound the problem unless you are prepared for the controller
numbers to change.

----------------------------------------------------------------------------

Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist.  Edist can be 
  accessed internally at the following URL: http://edist.corp/.
  
* From there, follow the hyperlink path of "Enterprise Services
  Documentation" and click on "FIN & FCO attachments", then choose the
  appropriate folder, FIN or FCO.  This will display supporting
  directories/files for FINs or FCOs.
   
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
---------------------------------------------------------------------------
                                                        


Copyright (c) 1997-2003 Sun Microsystems, Inc.