Document fins/I0619-1


FIN #: I0619-1

SYNOPSIS: Proper procedures for booting from StorEdge A1000 or A3x00 hardware
          RAID device, including known issues and problems

DATE: May/22/00

KEYWORDS: Proper procedures for booting from StorEdge A1000 or A3x00 hardware
          RAID device, including known issues and problems


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)



SYNOPSIS:  Proper procedures for booting from StorEdge A1000 or A3x00
           hardware RAID device, including known issues and problems. 

TOP FIN/FCO REPORT: Yes 
 
PRODUCT_REFERENCE:  StorEdge A1000 and A3X00 Arrays  
 
PRODUCT CATEGORY:   Storage / SW Admin 

PRODUCTS AFFECTED:  
  
Mkt_ID   Platform   Model   Description              Serial Number
------   --------   -----   -----------              -------------

Systems Affected
----------------

  -      ANYSYS       -     System Platform Independent    -

X-Options Affected
------------------

SG-XARY122A-16G      -   -   16GB STOREDGE A1000                  -
SG-XARY122A-50G      -   -   50GB STOREDGE A1000                  -
SG-XARY124A-109G     -   -   109GB STOREDGE A1000                 -
SG-XARY124A-36G      -   -   36GB STOREDGE A1000                  -
SG-XARY126A-144G     -   -   144GB STOREDGE A1000                 -
SG-XARY126A-72G      -   -   72GB STOREDGE A1000                  -
SG-XARY131A-16G      -   -   16GB STOREDGE A1000 FOR RACK         -
SG-XARY133A-36G      -   -   36GB STOREDGE A1000 FOR RACK         -
SG-XARY135A-72G      -   -   72GB STOREDGE A1000 FOR RACK         -
SG-XARY351A-180G     -   -   A3500 1 CONT MOD/5 TRAYS/18GB        -
SG-XARY353A-1008G    -   -   A3500 2 CONT/7 TRAYS/18GB            -
SG-XARY353A-360G     -   -   A3500 2 CONT/7 TRAYS/18GB            -
SG-XARY355A-2160G    -   -   A3500 3 CONT/15 TRAYS/18GB           -
SG-XARY360A-545G     -   -   545-GB A3500 (1X5X9-GB)              -
SG-XARY360A-90G      -   -   A3500 1 CONT/5 TRAYS/9GB(10K)        -
SG-XARY362A-180G     -   -   A3500 2 CONT/7 TRAYS/9GB(10K)        -
SG-XARY362A-763G     -   -   A3500 2 CONT/7 TRAYS/9GB(10K)        -
SG-XARY364A-1635G    -   -   A3500 3 CONT/15 TRAYS/9GB(10K)       -
SG-XARY366A-72G      -   -   A3500 1 CONT/2 TRAYS/9GB(10K)        -
SG-XARY380A-1092G    -   -   1092-GB A3500 (1x5x18-GB)            -
SG-XARY360B-90G      -   -   ASSY,TOP OPT,1X5X9,MIN,9GB,10K       -
SG-XARY360B-545G     -   -   ASSY,TOP OPT,1X5X9,MAX,9GB,10K       -
SG-XARY362B-180G     -   -   X-OPT,2X7X9,MIN,FCAL,9G10K           -
SG-XARY374B-273G     -   -   ASSY,TOP OPT,3X15X9,MIN,9GB,10K      -
SG-XARY380B-182G     -   -   X-OPT,FC-SN,1X5X18MIN,18GB10K        -
SG-XARY380B-1092G    -   -   ASSY,FC-SNL,1X5X18MAX,18G10K         -
SG-XARY382B-364G     -   -   ASSY,FC-SN,2X7X18,MIN,18GB,10K       -
SG-XARY384B-546G     -   -   ASSY,FC,3X15X18,MIN,18GB             -
SG-XARY381B-364G     -   -   ASSY,FC-SN,1X5X36MIN,36G10K          -
SG-XARY381B-1456G    -   -   ASSY,FC-SN,1X5X36MAX,36B10K          -
SG-XARY383B-728G     -   -   ASSY,FC-SN,2X7X36MIN,36B10K          -
SG-XARY385B-1092G    -   -   ASSY,FC-SN,3X15X36MIN,36B10K         -
UG-A3500-FC-545G     -   -   ASSY,TOP OPT,1X5X9,MAX,9GB,10K       -
CU-A3500-FC-545G     -   -   ASSY,TOP OPT,1X5X9,MAX,9GB,10K       -
UG-A3500FC-182-10K   -   -   FCTY,A3500FC/SCSI,1X5X18MIN,18/10K   -
CU-A3500FC-182-10K   -   -   FCTY,A3500FC/SCSI,1X5X18MIN,18/10K   -
UG-A3500FC-364-10K   -   -   FCTY,A3500FC/SCSI,2X7X18MIN,18/10K   -
CU-A3500FC-364-10K   -   -   FCTY,A3500FC/SCSI,2X7X18MIN,18/10K   -
UG-A3500FC-546-10K   -   -   FCTY,A3500FC/SCSI,3X15X18MIN 18G10K  -
CU-A3500FC-546-10K   -   -   FCTY,A3500FC/SCSI,3X15X18MIN 18G10K  -
UG-A3K-A3500FC       -   -   ASSY,UPGRADE,A3500FC/TABASCO         -
UG-A3500-A3500FC     -   -   ASSY,UPGRADE,A3500FC/DILBERT         -
X6538A               -   -   X-OPT,A3500FC CONTROLLER             -
6538A                -   -   FCTY, CONTROLLER, A3500FC            -


PART NUMBERS AFFECTED: 

Part Number   Description                             Model
-----------   -----------                             -----

798-0522-01   RAID Manager 6.1.1                        -
798-0522-02   RAID Manager 6.1.1 Update 1               -
798-0522-03   RAID Manager 6.1.1 Update 2               -
704-6708-10   CD, SUN STOREDGE RAID Manager 6.22        -
704-7937-05   CD, SUN STOREDGE RAID Manager 6.22.1      -


REFERENCES:

BugId:   4235026 - probe-scsi-all doesn't see LUNs at OBP level. 
         4240583 - probe-fcal-all does not "see" any FC A3x00 luns. 
         4291868 - Doing a reconfig reboot off of FC A3x00 will cause 
                   you to lose boot disk.
         4233846 - Limited bootability for SCSI on RM 6.22 is not 
                   working as documented.
         4354225 - RM6.22 patches 108834 and 108553 causes 
                   inability to boot from A3x00 LUN 0.
         4234427 - Cannot Boot A3500FC devices due to drivers not in 
                   OS releases.
         4328575 - A1000 as a boot device for Ex000: need support matrix.
         4191694 - PCI E450 reports - Fatal SCSI error when booting off 
                   of RAID device RM 6.1.
         4338808 - When booting from an A1000 the A3500FC luns do not 
                   show up.
         4166678 - Initial boot from A1000 (Dilbert) connected to US2D 
                   PCI card fails.
         1251360 - obp: 875 boot code does not respond to target initiated 
                   WDTR.
         4382104 - Can not force S2.6 kernel core dump if OS is on A3x00's 
                   Lun 0.
         4388578 - firmware 03010300.bwd/03010354.apd and later breaks
                   bootability.
         4289429 - Sonoma results in bad dump device during dump.
         4472109 - A1000 running 03xxxxxx firmware/ rm6.22 will not boot 
                   from E450.
         4486082 - RM6.22x installation on A3x00 boot device fails on
                   Solaris 8, Update 4 and 5.

PatchId: 108553 for Solaris 8
         108834 for Solaris 2.6 and 7
         112125 RM 6.22.1 for Solaris 2.6 and 7
         112126 RM 6.22.1 for Solaris 8

FIN:     I0551-1

ESC:     526130 - When booting from an A1000 the A3500FC luns do not show 
                  up.
         520025 - Initial boot from A1000 (Dilbert) connected to US2D PCI 
                  card fails.
         525911 - probe-fcal-all does not see any FC A3500 luns.
         526549 - System hangs frequently with A3500 as boot-device.
         528158 - boot off A1000 under S2.6 no dump/core device is 
                  available.

DOC:     Early Notifier 20029.

Manual:  805-7758-12: Sun StorEdge RAID Manager 6.22.1 Release Notes.
         806-6419-12: Sun StorEdge A3x00/A3500FC Best Practices Guide.
         806-7792-14: Sun StorEdge RAID Manager 6.22 and 6.22.1 Upgrade 
                      Guide.

      
PROBLEM DESCRIPTION:

Customers installing the Solaris Operating Environment on an A1000 or
A3x00 hardware RAID device may encounter numerous problems, ranging
from harmless error messages to an inability to access the boot device
or mount the / (root) filesystem.

Approximately 10,538 A3x00 and A3500FC units in 1x5, 2x7, and 3x15
configurations and approximately 31,600 A1000 units have been shipped
since January of 1998.  The number of units that may be used as boot
devices is unknown.

RM 6.22.1 is the recommended and supported version of RAID Manager.
Booting from an A1000 or A3x00 hardware RAID device can be problematic
and may result in several types of errors.  Below are some common
error messages that appear if these procedures are not closely
followed.  Booting from an A3500FC is not supported.

Booting from an A1000 or A3x00 array with RM6.22 installed requires patch
108553 for Solaris 8 or patch 108834 for Solaris 2.6 and 7.  See bug
numbers 4354225 and 4289429.  One of the following errors may occur if
incorrect patch levels are used:

  SunOS Release 5.6 Version Generic_105181-21 [UNIX(R) System V Release 4.0]
  Copyright (c) 1983-1997, Sun Microsystems, Inc.
  Cannot assemble drivers for root /pseudo/rdnexus@0/rdriver@5,0:a
  Cannot mount root on /pseudo/rdnexus@0/rdriver@5,0:a fstype ufs
  panic[cpu0]/thread=0x10404000: vfs_mountroot: cannot mount root
  rebooting...
  Resetting... 
 
  Or:

  SunOS Release 5.8 Version Generic_108528-01 64-bit
  Copyright 1983-2000 Sun Microsystems, Inc.  All rights reserved.
  Cannot assemble drivers for root /pseudo/rdnexus@0/rdriver@5,0:a
  Cannot mount root on /pseudo/rdnexus@0/rdriver@5,0:a fstype ufs

  panic[cpu0]/thread=10408000: vfs_mountroot: cannot mount root

  0000000010407970 genunix:vfs_mountroot+70 (10431000, 0, 0, 10410800, 10, 14)
    %l0-3: 0000000010431000 0000000010434708 000000003fc00000 0000000010431448
    %l4-7: 0000000000000000 0000000010413468 00000000000b6322 0000000000000322
  0000000010407a20 genunix:main+94 (10410048, 2000, 10407ec0, 10408030, fff2, 
  100509ac)
    %l0-3: 0000000000000001 0000000000000001 0000000000000015 0000000000000ea1
    %l4-7: 0000000010424de0 000000001045cab8 00000000000c9610 0000000000000540

  skipping system dump - no dump device configured
  rebooting...
  Resetting... 
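
Before rebooting, it can help to confirm which revision of patch 108553
or 108834 is actually installed.  The following is a minimal sketch, not
a supported tool: patch_revisions is a hypothetical helper name, and it
assumes that each line of "showrev -p" output begins with
"Patch: <id>-<rev>".

```shell
# Sketch only: list the installed revisions of one patch ID.
# Reads "showrev -p" output on stdin, so saved output can be checked too.
# patch_revisions is a hypothetical name, not part of Solaris or RM6.
patch_revisions() {
    # $1 = base patch ID, for example 108553
    # Assumed line format: "Patch: <id>-<rev> Obsoletes: ..."
    # Print only the "<id>-<rev>" token of each matching line.
    sed -n "s/^Patch: \($1-[0-9][0-9]*\).*/\1/p"
}
```

A possible invocation is "showrev -p | patch_revisions 108553".  No
output means the patch ID was not found in the list.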

When booting from a hardware RAID device attached to a PCI-based host,
the first boot attempt will fail with the error "trap 3e".  Refer to
bugIds 4166678 and 1251360.  The workaround is to issue the boot command
again.  The error will not occur the second time.

When booting from a hardware RAID device attached to an Enterprise 450
host, the following message may appear several times:

  Fatal SCSI error at script address 258 Unexpected disconnect
  Drive not ready

This message is harmless and will not cause any problems with the 
configuration.

Due to bugID 4472109, booting from an A3x00 or A1000 with 3.x
controller firmware attached to the dual differential PCI SCSI host bus
adapter does not work.  Workarounds include using RM6.1.1 with 2.x
firmware or using an SBus host adapter.  Refer to the bug report for
more details.

Placing your boot device under Veritas VM or Solstice DiskSuite control
may corrupt the Solaris Operating Environment and require it to be
re-installed.

Deleting LUN 0 or resetting the configuration of the module containing 
the boot device will destroy the boot device.

When booting from an array that has a single controller (A1000), or one
that is in the independent controller configuration, there is no RDAC
failover protection, and a controller firmware upgrade can only be
performed by booting from an alternate boot device such as an
independent disk drive.

A Sonoma LUN can now be used as a dump device with Solaris 2.6. Bug
number 4289429 was fixed with patches 105356 (Solaris 2.6) and
107458 (Solaris 7). The fix is included in base Solaris 8 as well.

Refer to the following bug reports for the detailed root cause of the
problem.

  4388578(sonoma) firmware 03010300.bwd/03010354.apd and later breaks
                  bootability
  4289429(sonoma) Sonoma results in bad dump device during dump 
  4235026(sonoma) probe-scsi-all doesn't see LUNs at OBP level.
  4240583(fusion) probe-fcal-all does not "see" any FC A3x00 LUNs.
  4291868(kernel) Doing a reconfig reboot off of FC A3x00 will cause you 
                  to lose boot disk.
  4233846(sonoma) Limited bootability for SCSI on RM 6.22 is not working 
                  as documented.
  4354225(sonoma) RM6.22 patches 108834 and 108553 causes inability 
                  to boot from A3x00 LUN 0.
  4234427(sonoma) Cannot Boot A3500FC devices due to drivers not in OS 
                  releases.
  4328575(sonoma) A1000 as a boot device for Ex000: need support matrix.
  4191694(tazmo)  PCI E450 reports - Fatal SCSI error when booting off of 
                  RAID device RM6.1
  4338808(sonoma) When booting from an A1000 the A3500FC luns do not 
                  show up.
  4166678(fusion) Initial boot from A1000 (Dilbert) connected to US2D PCI 
                  card fails.
  1251360(tazmo)  OBP: 875 boot code does not respond to target initiated 
                  WDTR.
  4472109(tazmo)  A1000 running 03xxxxxx firmware/ rm6.22 will not boot 
                  from E450
 
Reboot only when specified in the bootability procedures, or risk losing
access to your boot device.  The following error may occur if the system
is rebooted prematurely:

  Rebooting with command: boot -r
  Boot device: /sbus@2,0/QLGC,isp@1,10000/sd@5,0:a  File and args: -r
  SunOS Release 5.7 Version Generic_106541-10 64-bit [UNIX(R) System V 
     Release 4.0]
  Copyright (c) 1983-1999, Sun Microsystems, Inc.
  configuring network interfaces: hme0.
  Hostname: sonoma40
  mount: /dev/dsk/c0t5d0s0 is not this fstype.
  failed to open /etc/coreadm.confopen(/dev/.devfseventd_daemon.lock) - 
  Read-only file system
  Configuring /dev and /devices
  devfsadmd: mkdir failed for /dev 0x1ed: Read-only file system
  devfsadmd: open failed for /dev/.devfsadm_dev.lock: Read-only file system
  Configuring the /dev directory (compatibility devices)

The Solaris Operating Environment requires the boot device to be LUN 0.

A RAID level 0 boot device is not supported. Use RAID level 1, 3, or 5 
to enable data protection in the drive group containing the boot device.

In the independent controller configuration, only one host system can
boot because only one controller owns LUN 0.

The four versions of RM6 that are currently supported when installed
on an A1000 or A3x00 boot device are RM6.1.1 Update 1, RM6.1.1 Update
2, RM6.22, and RM6.22.1.

Solaris Operating Environment versions that are supported when
installed on an A1000 or A3x00 boot device are Solaris 2.6, Solaris 7,
and Solaris 8.

When using Solaris 8, note that Update 4 and Update 5 do not support
booting from the A1000 and A3x00.  Earlier and later update releases do
support booting.  See bug 4486082 for details and error messages.

  4486082(kernel) RM6.22x installation on A3x00 boot device fails on
                  Solaris 8 Update 5.
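
The support statements above can be expressed as a small pre-install
check.  This is a sketch under stated assumptions: the function name
boot_config_supported and the version labels (6.1.1u1, 6.1.1u2, 6.22,
6.22.1) are chosen here for illustration and are not RM6 or Solaris
identifiers.

```shell
# Sketch only: encode the support matrix stated in this FIN.
# Usage: boot_config_supported RM_VERSION SOLARIS_RELEASE [SOLARIS8_UPDATE]
# Returns 0 if the combination is listed as supported, 1 otherwise.
# The version labels below are hypothetical names chosen for this sketch.
boot_config_supported() {
    rm_ver=$1
    sol=$2
    upd=${3:-}

    # Only these four RM6 versions are listed as supported on a boot device.
    case "$rm_ver" in
        6.1.1u1|6.1.1u2|6.22|6.22.1) ;;
        *) return 1 ;;
    esac

    case "$sol" in
        2.6|7) return 0 ;;
        8)
            # Solaris 8 Update 4 and Update 5 do not support booting
            # from the A1000 or A3x00 (bug 4486082).
            case "$upd" in
                4|5) return 1 ;;
                *)   return 0 ;;
            esac
            ;;
        *) return 1 ;;
    esac
}
```

For example, "boot_config_supported 6.22.1 8 4" returns non-zero, while
"boot_config_supported 6.22.1 7" returns zero.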

Upgrading from one version of RM6 to another when booting from an A1000
or A3x00 has not been tested and is currently not supported.  Refer to
the RM6 upgrade guide:

    806-7792-14: Sun StorEdge RAID Manager 6.22 and 6.22.1 Upgrade
                 Guide.

For more details on upgrading RM6, refer to Early Notifier 20029 for a
list of patch levels.  Refer to the Best Practices Guide for a list of
supported platforms for booting.  For the full list of supported
platforms see:

    http://acts.ebay/storage/A3x00/HOSTS.html


IMPLEMENTATION: 
 
         ---
        |   |   MANDATORY (Fully Pro-Active)
         ---    
         
  
         ---
        |   |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        | X |   REACTIVE (As Required)
         ---
         

CORRECTIVE ACTION: 

The following procedures are provided as a guideline for authorized
Enterprise Services Representatives who may be encountering the
above-mentioned problems.

There are two procedures: one for RM6.1.1 versions and the other for
RM6.22/6.22.1.  Decide which version of RM6 will be used before
installing the Solaris Operating Environment.  For the latest bug fixes
and product features, RM6.22.1 is recommended.

A] The procedure for booting Solaris from a StorEdge A1000 or A3x00 Array 
   with RM6.1.1 u1 or RM6.1.1 u2 only:

   1) Install Solaris 2.6, 7, or 8 onto LUN 0 on your hardware RAID
      device and let the Solaris Installation program set your eeprom
      to boot off your RAID module.  After the OS installation, let it
      reboot off your RAID module.  The default LUN 0 has a capacity of
      only 10MB.  Please refer to the document "Sun StorEdge RAID
      Manager 6.22.1 Release Notes" for instructions on resizing the
      default LUN 0. This document also applies to RM6.1.1
      configurations.

   2) Install the recommended patch cluster for your OS.

   3) Install RM6.1.1 Update 1 or RM6.1.1 Update 2 
 
   4) Install RM6.1.1 patches (Use patchpro.ebay or EN20029 to determine
      necessary patches).

   5) Edit the /usr/lib/osa/rmparams file and make Rdac_SupportDisabled=TRUE

   6) Perform a reconfiguration reboot (reboot -- -r).

   7) Edit the rmparams file again and make Rdac_SupportDisabled=FALSE

   8) Run the command '/etc/init.d/rdacctrl config'

   9) Issue the command "df" and take note of what device is mounted
      under /

      E.g.
      # df
      /                  (/dev/dsk/c0t5d0s0 ):17117372 blocks  1077143 files
      /proc              (/proc             ):       0 blocks    15593 files
      /dev/fd            (fd                ):       0 blocks        0 files
      /tmp               (swap              ): 3724816 blocks   164812 files

  10) ls -l /dev/dsk/cAtBdCsD (where cAtBdCsD is the device mounted under /)

      E.g.
      # ls -l /dev/dsk/c0t5d0s0
      lrwxrwxrwx  1 root other /dev/dsk/c0t5d0s0 -> ../../devices/pseudo/rdnexus@0/rdriver@5,0:a

  11) Edit the /etc/system file and add the following entries:

      rootfs:ufs
      rootdev:/pseudo/rdnexus@0/rdriver@5,0:a

  NOTE: The rootdev: entry should be the pseudo device path of the device 
        mounted under /.

  12) Perform a reconfiguration reboot (reboot -- -r).
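
Steps 5 and 7 of procedure A change the same rmparams setting in
opposite directions.  A minimal sketch of that edit follows.  It assumes
the setting appears on its own line as "Rdac_SupportDisabled=<value>",
as written in the steps above; set_rdac_support_disabled is a
hypothetical helper name and is not supplied with RM6.

```shell
# Sketch only: set Rdac_SupportDisabled to TRUE or FALSE in an rmparams file.
# Usage: set_rdac_support_disabled TRUE|FALSE FILE
# A copy of the original is kept in FILE.bak before the file is rewritten.
# Assumes the setting is on its own line as "Rdac_SupportDisabled=<value>".
set_rdac_support_disabled() {
    value=$1
    file=$2

    # Accept only the two values used in this procedure.
    case "$value" in
        TRUE|FALSE) ;;
        *) echo "value must be TRUE or FALSE" >&2
           return 2 ;;
    esac

    cp "$file" "$file.bak" || return 1

    # Rewrite from the backup so every other line is preserved unchanged.
    sed "s/^Rdac_SupportDisabled=.*/Rdac_SupportDisabled=$value/" \
        "$file.bak" > "$file"
}
```

For step 5 this could be run as
"set_rdac_support_disabled TRUE /usr/lib/osa/rmparams", and for step 7
with FALSE.  Compare the file against the .bak copy before rebooting.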


                       --- OR ---
                       

B] The procedure for booting Solaris from a StorEdge A1000 or A3x00 
   Array for RM6.22/6.22.1:

   1) Install Solaris 2.6, 7, or 8 to LUN 0 (let suninstall set the default 
      boot device to be your LUN 0). The default LUN 0 has a capacity of only 
      10MB.  Please refer to the document "Sun StorEdge RAID Manager 6.22.1
      Release Notes" for instructions on resizing the default LUN 0.

   2) Install the recommended patch cluster for your OS version.

   3) Install RM6.22/6.22.1.

   4) Install RM6.22/6.22.1 patches (Use patchpro.ebay or EN20029 to
      determine necessary patches).
      (Warning!: Do not install patches 108553, 108553, 108553
      108834, 108834 or 108834. See bug numbers 4388578 and 
      4354225)

   5) Edit the rmparams file for 16 or 32 LUN support if needed. 
      See FIN I0551.

   6) Issue the command "/usr/lib/osa/bin/genscsiconf"

   7) Edit sd.conf if needed (see FIN I0551). This is to help speed up 
      reboots.  Otherwise on reboots, the host will timeout for every 
      non-existent LUN.

   8) Run "/etc/init.d/rdacctrl config"

   9) Edit the /etc/system file and add the rootfs and rootdev entries. 
      Refer to steps 9, 10 and 11 of procedure A above (RM6.1.1 u1 and
      RM6.1.1 u2 only).

   10) Perform a reconfiguration reboot (reboot -- -r).
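
The rootfs and rootdev entries needed in step 9 can be derived from the
"ls -l" output for the root slice rather than typed by hand.  The sketch
below is illustrative: rootdev_entries is a hypothetical helper name,
and it assumes the symbolic link target has the "../../devices/..." form
shown in procedure A.

```shell
# Sketch only: turn one "ls -l" line for the root slice into the two
# /etc/system entries used in this FIN.  Reads the line on stdin.
# rootdev_entries is a hypothetical name, not part of Solaris or RM6.
rootdev_entries() {
    # Keep the part of the symlink target that follows "devices",
    # for example /pseudo/rdnexus@0/rdriver@5,0:a
    path=`sed -n 's|.*-> *[./]*devices\(/.*\)$|\1|p'`

    if [ -z "$path" ]; then
        echo "no /devices symlink target found" >&2
        return 1
    fi

    echo "rootfs:ufs"
    echo "rootdev:$path"
}
```

A possible invocation is "ls -l /dev/dsk/c0t5d0s0 | rootdev_entries",
where c0t5d0s0 is the device that df reports as mounted under /.  Review
the two printed lines before adding them to /etc/system.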


                       --- THEN --- 
 

C] Setting Up and Verifying Alternate Boot Paths (A3X00 only):

   1) From the Open Boot Prom, issue the probe-scsi-all command.  There 
      should be a SCSI ID from each RAID controller on the host. Record 
      this information including the full device path.

   2) Boot from the LUN that the OS is installed to and start RM6.

   3) Open the Recovery application and select the RAID module that is the 
      boot device and verify that its state is Optimal.

   4) Select Options -> Manual Recovery -> Controller Pairs.

   5) Highlight the controller that owns the LUN that the OS is installed 
      on and select "Place Offline".

   6) When the controller is offline, run a healthcheck or recovery guru.  
      RM6 should report a data path failure or offline controller.  *Do Not* 
      fix the problem at this time.

   7) Select Module Profile and confirm all LUNs are now owned by the 
      alternate controller.

   8) Bring the host down to the Open Boot Prom.

   9) Using the information from the probe-scsi-all from step 1, use the
      nvalias command to create an alias to boot from.
 
  10) boot alias -r

  11) When the host is up, start RM6. Select the Recovery application,
      and select the module that owns your boot device. You should get a
      data path failure after running healthcheck or recovery guru. Select
      "fix" or place Online. After this is done, run healthcheck or
      recovery guru again and verify that the module is once again Optimal.

  12) Open the Maintenance and Tuning application. Select the RAID module
      that owns the boot device and select "LUN balancing". Verify that
      the default boot path owns LUN 0.
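
Steps 9 and 10 of procedure C are typed at the OBP "ok" prompt.  To
reduce typing errors, the two OBP command lines can be prepared on the
host beforehand from the device path recorded in step 1.  This is only a
sketch: obp_alt_boot_commands is a hypothetical helper name, and the
alias name passed to it is the operator's choice.

```shell
# Sketch only: print the OBP command lines for steps 9 and 10 of
# procedure C.  Nothing is executed; the output is typed at the ok prompt.
# Usage: obp_alt_boot_commands ALIAS FULL_DEVICE_PATH
# obp_alt_boot_commands is a hypothetical name chosen for this sketch.
obp_alt_boot_commands() {
    alias_name=$1
    dev_path=$2

    # The path recorded from probe-scsi-all must be a full device path.
    case "$dev_path" in
        /*) ;;
        *) echo "device path must be a full path starting with /" >&2
           return 1 ;;
    esac

    echo "nvalias $alias_name $dev_path"
    echo "boot $alias_name -r"
}
```

The device path argument must be the one recorded for the alternate
controller in step 1.  The printed lines are for reference only and are
entered by hand at the Open Boot Prom.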


COMMENTS:

None  
    
----------------------------------------------------------------------------

Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist.  Edist can be 
  accessed internally at the following URL: http://edist.corp/.
  
* From there, follow the hyperlink path of "Enterprise Services
  Documentation" and click on "FIN & FCO attachments", then choose the
  appropriate folder, FIN or FCO.  This will display supporting
  directories/files for FINs or FCOs.
   
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
---------------------------------------------------------------------------
                                                        




Copyright (c) 1997-2003 Sun Microsystems, Inc.