Repairing Transactional Volumes
Because a transactional volume is a "layered" volume, consisting of a master device and logging device, and because the logging device can be shared among file systems, repairing a failed transactional volume requires special recovery tasks.
Any device errors or panics must be managed by using the command line utilities.
Panics
If a file system detects any internal inconsistencies while it is in use, it will panic the system. If the file system is configured for logging, it notifies the transactional volume that it needs to be checked at reboot. The transactional volume transitions itself to the "Hard Error" state. All other transactional volumes that share the same log device also go into the "Hard Error" state.
At reboot, fsck checks and repairs the file system and transitions the file system back to the "Okay" state. fsck completes this process for all transactional volumes listed in the /etc/vfstab file for the affected log device.
Transactional Volume Errors
If a device error occurs on either the master device or the log device while the transactional volume is processing logged data, the device transitions from the "Okay" state to the "Hard Error" state. If the device is in either the "Hard Error" or "Error" state, either a device error or a panic has occurred.
Any devices sharing the failed log device also go to the "Error" state.
Recovering From Soft Partition Problems
The following sections show how to recover configuration information for soft partitions. You should only use these techniques if all of your state database replicas have been lost and you do not have a current or accurate copy of metastat -p output, the md.cf file, or an up-to-date md.tab file.
How to Recover Configuration Data for a Soft Partition
At the beginning of each soft partition extent, a sector is used to mark the beginning of the soft partition extent. These hidden sectors are called extent headers and do not appear to the user of the soft partition. If all Solaris Volume Manager configuration is lost, the disk can be scanned in an attempt to generate the configuration data.
This procedure is a last option to recover lost soft partition configuration information. The metarecover command should only be used when you have lost both your metadb and your md.cf files, and your md.tab is lost or out of date.
Note - This procedure only recovers soft partition information. It does not help you recover other lost configurations or the configuration information for other Solaris Volume Manager volumes.
Note - If your configuration included other Solaris Volume Manager volumes that were built on top of soft partitions, you should recover the soft partitions before attempting to recover the other volumes.
Configuration information about your soft partitions is stored on your devices and in your state database. Since either of these sources could be corrupt, you must tell the metarecover command which source is reliable.
First, use the metarecover command to determine whether the two sources agree. If they agree, the metarecover command cannot be used to make any changes. If the metarecover command reports an inconsistency, however, examine its output carefully to determine whether the disk or the state database is corrupt. Then use the metarecover command to rebuild the configuration based on the appropriate source.
Review the soft partition recovery information by using the metarecover command.
# metarecover component -p -d
In this case, component is the c*t*d*s* name of the raw component. The -d option tells the command to scan the physical slice for extent headers of soft partitions.
For more information, see the metarecover(1M) man page.
Example--Recovering Soft Partitions from On-Disk Extent Headers
# metarecover c1t1d0s1 -p -d
The following soft partitions were found and will be added to
your metadevice configuration.
 Name            Size     No. of Extents
    d10           10240         1
    d11           10240         1
    d12           10240         1
WARNING: You are about to add one or more soft partition
metadevices to your metadevice configuration. If there
appears to be an error in the soft partition(s) displayed
above, do NOT proceed with this recovery operation.
Are you sure you want to do this (yes/no)?yes
c1t1d0s1: Soft Partitions recovered from device.
bash-2.05# metastat
d10: Soft Partition
    Device: c1t1d0s1
    State: Okay
    Size: 10240 blocks
        Device              Start Block  Dbase Reloc
        c1t1d0s1                   0     No    Yes

        Extent              Start Block              Block count
             0                        1                    10240

d11: Soft Partition
    Device: c1t1d0s1
    State: Okay
    Size: 10240 blocks
        Device              Start Block  Dbase Reloc
        c1t1d0s1                   0     No    Yes

        Extent              Start Block              Block count
             0                    10242                    10240

d12: Soft Partition
    Device: c1t1d0s1
    State: Okay
    Size: 10240 blocks
        Device              Start Block  Dbase Reloc
        c1t1d0s1                   0     No    Yes

        Extent              Start Block              Block count
             0                    20483                    10240
This example recovers three soft partitions from disk, after the state database replicas were accidentally deleted.
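The start blocks in the metastat output above are consistent with one hidden extent-header sector sitting immediately before each extent. The following is a minimal sketch of that arithmetic, not a Solaris Volume Manager tool: next_extent_start is a hypothetical helper name, and the one-sector gap is an assumption inferred from this example for extents laid out back to back.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solaris Volume Manager).
# Given an extent's start block and block count, print the block where the
# data of the next back-to-back extent would start, assuming exactly one
# extent-header sector lies between the two extents.
next_extent_start() {
    start=$1
    count=$2
    echo $((start + count + 1))
}

# d10 starts at block 1 and holds 10240 blocks; d11 follows it.
next_extent_start 1 10240
# d11 starts at block 10242 and holds 10240 blocks; d12 follows it.
next_extent_start 10242 10240
```

The two calls print 10242 and 20483, which match the start blocks reported for d11 and d12. A start block that does not fit this pattern is not necessarily an error, because extents do not have to be adjacent.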
Recovering Configuration From a Different System
You can recover a Solaris Volume Manager configuration, even onto a different system from the original. For example, assume you have a system with an external Multipack of six disks in it, and a Solaris Volume Manager configuration, including at least one state database replica, on some of those disks. If you experience a system failure, you can attach the Multipack to a different system and recover the complete configuration from the local disk set.
Note - Only recover a Solaris Volume Manager configuration onto a system with no preexisting Solaris Volume Manager configuration. Otherwise, you risk replacing a logical volume on your system with a logical volume that you are recovering, and possibly corrupting your system.
Note - This process only works to recover volumes from the local disk set.
How to Recover a Configuration
Attach the disk or disks that contain the Solaris Volume Manager configuration to a system with no preexisting Solaris Volume Manager configuration.
Do a reconfiguration reboot to ensure that the system recognizes the newly added disks.
# reboot -- -r
Determine the major/minor number for a slice containing a state database replica on the newly added disks.
Use ls -lL, and note the two numbers between the group name and the date. Those are the major/minor numbers for this slice.
# ls -Ll /dev/dsk/c1t9d0s7
brw-r-----   1 root     sys       32, 71 Dec  5 10:05 /dev/dsk/c1t9d0s7
If necessary, determine the major name corresponding with the major number by looking up the major number in /etc/name_to_major.
# grep " 32" /etc/name_to_major
sd 32
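The two lookups above can also be scripted. The following is a minimal sketch, assuming a POSIX-compatible shell and the ls -lL field layout shown in the example; major_minor and driver_for_major are hypothetical helper names, and the "example 320" line is made-up test data rather than a real driver entry.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Solaris Volume Manager).

# Read one "ls -lL" line for a block device on stdin and print
# "major minor". Assumes the field order shown in the example above.
major_minor() {
    read perms links owner group major minor rest
    # The major number is printed with a trailing comma ("32,"); strip it.
    echo "${major%,} $minor"
}

# Read name_to_major-style lines ("name number") on stdin and print the
# driver name whose number equals $1 exactly.
driver_for_major() {
    while read name number; do
        if [ "$number" = "$1" ]; then
            echo "$name"
        fi
    done
}

# Sample data copied from the example above.
echo 'brw-r----- 1 root sys 32, 71 Dec 5 10:05 /dev/dsk/c1t9d0s7' | major_minor
# "example 320" is a made-up line to show that only an exact match is printed.
printf 'sd 32\nexample 320\n' | driver_for_major 32
```

On a live system the helpers would be fed real input, for example ls -lL /dev/dsk/c1t9d0s7 | major_minor and driver_for_major 32 < /etc/name_to_major. Comparing the whole number field avoids a weakness of grep " 32", which also matches longer major numbers that begin with 32.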
Update the /kernel/drv/md.conf file with two commands: one command to tell Solaris Volume Manager where to find a valid state database replica on the new disks, and one command to tell it to trust the new replica and ignore any conflicting device ID information on the system.
In the line in this example that begins with mddb_bootlist1, replace sd with the major name you found in the previous step, and replace 71 with the minor number you identified when you listed the slice with ls -lL.
#pragma ident   "@(#)md.conf    2.1     00/07/07 SMI"
#
# Copyright (c) 1992-1999 by Sun Microsystems, Inc.
# All rights reserved.
#
name="md" parent="pseudo" nmd=128 md_nsets=4;
# Begin MDD database info (do not edit)
mddb_bootlist1="sd:71:16:id0";
md_devid_destroy=1;
# End MDD database info (do not edit)
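To avoid typing errors, the mddb_bootlist1 line can be generated from the major name and minor number. This is only a sketch: bootlist_entry is a hypothetical helper name, and the trailing 16 and id0 fields are copied unchanged from the example because the text above does not describe them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solaris Volume Manager).
# $1 = major (driver) name, for example sd
# $2 = minor number of the slice that holds the state database replica
# The ":16:id0" suffix is copied as-is from the example md.conf entry;
# its meaning is not covered by the surrounding text.
bootlist_entry() {
    printf 'mddb_bootlist1="%s:%s:16:id0";\n' "$1" "$2"
}

# Values taken from the example above.
bootlist_entry sd 71
```

With the example values this prints mddb_bootlist1="sd:71:16:id0"; which is the line shown in the md.conf listing. Review the generated line before you add it to /kernel/drv/md.conf.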
Reboot to force Solaris Volume Manager to reload your configuration.
You will see messages similar to the following displayed to the console.
volume management starting.
Dec  5 10:11:53 lexicon metadevadm: Disk movement detected
Dec  5 10:11:53 lexicon metadevadm: Updating device names in Solaris Volume Manager
The system is ready.
Verify your configuration by using the metadb and metastat commands.
# metadb
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c1t9d0s7
     a       luo        16              8192            /dev/dsk/c1t10d0s7
     a       luo        16              8192            /dev/dsk/c1t11d0s7
     a       luo        16              8192            /dev/dsk/c1t12d0s7
     a       luo        16              8192            /dev/dsk/c1t13d0s7
# metastat
d12: RAID
    State: Okay
    Interlace: 32 blocks
    Size: 125685 blocks
Original device:
    Size: 128576 blocks
        Device              Start Block  Dbase State        Reloc  Hot Spare
        c1t11d0s3                330     No    Okay         Yes
        c1t12d0s3                330     No    Okay         Yes
        c1t13d0s3                330     No    Okay         Yes

d20: Soft Partition
    Device: d10
    State: Okay
    Size: 8192 blocks
        Extent              Start Block              Block count
             0                     3592                     8192

d21: Soft Partition
    Device: d10
    State: Okay
    Size: 8192 blocks
        Extent              Start Block              Block count
             0                    11785                     8192

d22: Soft Partition
    Device: d10
    State: Okay
    Size: 8192 blocks
        Extent              Start Block              Block count
             0                    19978                     8192

d10: Mirror
    Submirror 0: d0
      State: Okay
    Submirror 1: d1
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 82593 blocks

d0: Submirror of d10
    State: Okay
    Size: 118503 blocks
    Stripe 0: (interlace: 32 blocks)
        Device              Start Block  Dbase State        Reloc  Hot Spare
        c1t9d0s0                   0     No    Okay         Yes
        c1t10d0s0               3591     No    Okay         Yes

d1: Submirror of d10
    State: Okay
    Size: 82593 blocks
    Stripe 0: (interlace: 32 blocks)
        Device              Start Block  Dbase State        Reloc  Hot Spare
        c1t9d0s1                   0     No    Okay         Yes
        c1t10d0s1                  0     No    Okay         Yes

Device Relocation Information:
Device       Reloc    Device ID
c1t9d0       Yes      id1,sd@SSEAGATE_ST39103LCSUN9.0GLS3487980000U00907AZ
c1t10d0      Yes      id1,sd@SSEAGATE_ST39103LCSUN9.0GLS3397070000W0090A8Q
c1t11d0      Yes      id1,sd@SSEAGATE_ST39103LCSUN9.0GLS3449660000U00904NZ
c1t12d0      Yes      id1,sd@SSEAGATE_ST39103LCSUN9.0GLS32655400007010H04J
c1t13d0      Yes      id1,sd@SSEAGATE_ST39103LCSUN9.0GLS3461190000701001T0
# metastat -p
d12 -r c1t11d0s3 c1t12d0s3 c1t13d0s3 -k -i 32b
d20 -p d10 -o 3592 -b 8192
d21 -p d10 -o 11785 -b 8192
d22 -p d10 -o 19978 -b 8192
d10 -m d0 d1 1
d0 1 2 c1t9d0s0 c1t10d0s0 -i 32b
d1 1 2 c1t9d0s1 c1t10d0s1 -i 32b
#