Thursday, December 30, 2010

Setting VXVM PATH environment variables

  • If you are using the Bourne or Korn shell (sh or ksh), use the commands:
    # PATH=$PATH:/usr/sbin:/opt/VRTS/bin:/opt/VRTSvxfs/sbin:\
    /opt/VRTSdbed/bin:/opt/VRTSdb2ed/bin:/opt/VRTSsybed/bin:\
    /opt/VRTSob/bin
    # MANPATH=/usr/share/man:/opt/VRTS/man:$MANPATH
    # export PATH MANPATH
  • If you are using a C shell (csh or tcsh), use the commands:
    # set path = ($path /usr/sbin /opt/VRTSvxfs/sbin \
     /opt/VRTSdbed/bin /opt/VRTSdb2ed/bin /opt/VRTSsybed/bin \
     /opt/VRTSob/bin /opt/VRTS/bin )
    # setenv MANPATH /usr/share/man:/opt/VRTS/man:$MANPATH
Note:
If you have not installed database software, you can omit /opt/VRTSdbed/bin, /opt/VRTSdb2ed/bin and /opt/VRTSsybed/bin. Similarly, /opt/VRTSvxfs/sbin is only required to access some VxFS commands.
VxVM library commands and supporting scripts are located under the /usr/lib/vxvm directory hierarchy. You can include these directories in your path if you need to use them on a regular basis.
For detailed information about an individual command, refer to the appropriate manual page in the 1M section.

Commands and scripts that are provided to support other commands and scripts, and which are not intended for general use, are not located in /opt/VRTS/bin and do not have manual pages.
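
For Bourne-type shells, the PATH setup can also be written so that only installed directories are added, and none is added twice. This is a minimal sketch, not part of the product documentation; the function name path_append is hypothetical.

```shell
#!/bin/sh
# Append each existing directory to PATH exactly once (sh/ksh/bash).
# path_append is a hypothetical helper name used only for this sketch.
path_append() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue        # skip packages that are not installed
        case ":$PATH:" in
            *":$dir:"*) ;;               # already on PATH, nothing to do
            *) PATH="$PATH:$dir" ;;
        esac
    done
    export PATH
}

path_append /usr/sbin /opt/VRTS/bin /opt/VRTSvxfs/sbin \
    /opt/VRTSdbed/bin /opt/VRTSdb2ed/bin /opt/VRTSsybed/bin /opt/VRTSob/bin
```

Placed in a login profile, this leaves PATH unchanged on hosts where the database packages are absent.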

Recovering plexes from DISABLED RECOVER state.

Problem

Recovering plexes from DISABLED RECOVER state.

Solution

Use this procedure to recover plexes from the DISABLED RECOVER state once connectivity has been restored to one or more disks.
# vxdisk list
T41_1 sliced - - online
T41_2 sliced - - online
- - data_dg15 data_dg failed was:T41_2
- - data_dg16 data_dg failed was:T41_1
# vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
T41_1 sliced - (data_dg) online
T41_2 sliced - (data_dg) online
v nas04b_v - DISABLED ACTIVE 943718400 SELECT - fsgen
pl nas04b_v-01 nas04b_v DISABLED NODEVICE 943718400 CONCAT - RW
sd data_dg16-01 nas04b_v-01 data_dg16 0 943718400 0 - NDEV
Reattach the disks:
# /usr/lib/vxvm/bin/vxreattach -r T41_2
# /usr/lib/vxvm/bin/vxreattach -r T41_1
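
When many disks are affected, the device names can be pulled from the "failed was:" marker in the vxdisk list output. The sketch below only prints the vxreattach commands so they can be reviewed first; the function name reattach_cmds is hypothetical, and the sample input is taken from the listing above.

```shell
#!/bin/sh
# Print (do not run) one vxreattach command per failed disk media record.
# Reads `vxdisk list` output on stdin; reattach_cmds is a hypothetical name.
reattach_cmds() {
    awk '
        /failed was:/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^was:/)        # field looks like was:<device>
                    print "/usr/lib/vxvm/bin/vxreattach -r " substr($i, 5)
        }'
}

# Sample input copied from the listing in this note.
reattach_cmds <<'EOF'
T41_1 sliced - - online
T41_2 sliced - - online
- - data_dg15 data_dg failed was:T41_2
- - data_dg16 data_dg failed was:T41_1
EOF
```

On a live system the input would come from "vxdisk list | reattach_cmds".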
Migrate from DISABLED RECOVER to ENABLED ACTIVE.
Symptom:
How to recover or start a volume that has a plex in the disabled and recover state
Solution:
The vxprint -ht output gives, among other things, the kernel state (column 4)
and the state (column 5) of a plex.
Consider the following case:
# vxprint -ht -g testdg
dg testdg default default 84000 970356463.1203.alu
dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -
v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED RECOVER 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA
From the above output, it can be seen that the volume test has plex
test-01 in the DISABLED and RECOVER state.
To recover the volume test, use the vxmend command. This operation
applies only to volumes, or to plexes associated with a volume.
This will manually reset or change the state of a plex or volume.
The following is the procedure to recover/start this volume:
1. Bring plex test-01 to the DISABLED and OFFLINE state using the
following command: vxmend -o force off <recover_plex>
For example,
# vxmend -g testdg -o force off test-01
The below output shows plex test-01 in the DISABLED and OFFLINE state:
# vxprint -ht -g testdg
dg testdg default default 84000 970356463.1203.alu
dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -
v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED OFFLINE 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA
2. Bring plex test-01 to the DISABLED and STALE state using the
following command: vxmend on <recover_plex>
For example,
# vxmend -g testdg on test-01
The below output shows plex test-01 in the DISABLED and STALE state:
# vxprint -ht -g testdg
dg testdg default default 84000 970356463.1203.alu
dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -
v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED STALE 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA
3. Bring plex test-01 to the DISABLED and CLEAN state using the
following command: vxmend fix clean <recover_plex>
For example,
# vxmend -g testdg fix clean test-01
The below output shows plex test-01 in the DISABLED and CLEAN state:
# vxprint -ht -g testdg
dg testdg default default 84000 970356463.1203.alu
dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -
v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED CLEAN 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA
4. Once plex test-01 is in the DISABLED/CLEAN state, the volume test can be
started with the following command: vxvol start <volume>
For example,
# vxvol -g testdg start test
It can be seen in the below output that the volume is now ENABLED and ACTIVE:
# vxprint -ht -g testdg
dg testdg default default 84000 970356463.1203.alu
dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -
v test - ENABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test ENABLED ACTIVE 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA
To summarize:
# vxmend -g testdg -o force off test-01
# vxmend -g testdg on test-01
# vxmend -g testdg fix clean test-01
# vxvol -g testdg start test
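
To find which plexes need this treatment, columns 4 and 5 of the vxprint -ht output can be filtered with awk. This is a sketch that assumes the column layout shown in the listings above; the function name recover_plexes is hypothetical.

```shell
#!/bin/sh
# List plexes whose KSTATE/STATE pair is DISABLED RECOVER.
# Reads `vxprint -ht` output on stdin and prints "<volume> <plex>" per match.
# recover_plexes is a hypothetical helper name.
recover_plexes() {
    awk '$1 == "pl" && $4 == "DISABLED" && $5 == "RECOVER" { print $3, $2 }'
}

# Sample input copied from the first listing in this note.
recover_plexes <<'EOF'
v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED RECOVER 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA
EOF
```

On a live system the input would come from "vxprint -ht -g testdg | recover_plexes".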

An introduction to VERITAS Volume Manager plex states and what they mean

Problem

An introduction to VERITAS Volume Manager plex states and what they mean

Solution

Empty Plex State
This is seen on a newly created volume that has not been initialized.

Clean Plex State
The plex contains a good copy of the volume data.

Note: A volume is not startable if one plex is in the CLEAN state and some plexes are in the ACTIVE state. Thus, several vxmend fix operations are normally used in conjunction to set all plexes in a volume to STALE and then to set one plex to CLEAN. A volume start operation will then enable the CLEAN plex and recover the STALE plexes by copying data from the one CLEAN plex.

Active Plex State
Volume is started and the plex fully participates in the normal volume I/O operation.

Stale Plex State
The plex does not have the complete current contents. If I/O errors occur on a plex, the kernel stops using and updating this plex, and the operation sets the plex to the STALE state.

OFFLINE Plex State
This happens when the plex is detached from the volume. Changes to the volume are not reflected in the plex while it is in the OFFLINE state.

TEMP Plex State
You get this state when you add a new mirror to a volume. The plex will be in this state while it is being associated or attached (sync process). A utility will set the plex state to TEMP at the start of an operation and to an appropriate state at the end of the operation.

TEMPRM Plex State
This resembles TEMP state except that at the completion of the operation, the TEMPRM plex is removed. If the system goes down for any reason, a TEMPRM plex state indicates the operation is incomplete and a subsequent vxvol start will disassociate plexes and remove the TEMPRM plex.

TEMPRMSD Plex State
This is used by vxassist when attaching a new plex. If the operation does not complete, the plex and subdisk are removed.

IOFAIL Plex State  
This is associated with persistent logging. On the detection of a failure of an ACTIVE plex, vxconfigd places that plex in the IOFAIL state so that it is disqualified from the recovery selection process at volume start time.

Refer to the Volume Manager Administrator's Guide for further information about plex states.
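
The state list above can be kept at hand as a small lookup for use in scripts. This sketch only restates the descriptions from this note; the function name plex_state_desc is hypothetical, and states not covered here (for example RECOVER or NODEVICE) fall through to the default branch.

```shell
#!/bin/sh
# Map a plex STATE value to the short description given in this note.
# plex_state_desc is a hypothetical helper name.
plex_state_desc() {
    case "$1" in
        EMPTY)    echo "newly created volume, not yet initialized" ;;
        CLEAN)    echo "good copy of the volume data" ;;
        ACTIVE)   echo "volume started, plex takes part in normal I/O" ;;
        STALE)    echo "plex does not have the complete current contents" ;;
        OFFLINE)  echo "plex detached, volume changes are not reflected" ;;
        TEMP)     echo "plex is being associated or attached" ;;
        TEMPRM)   echo "like TEMP, but the plex is removed when the operation completes" ;;
        TEMPRMSD) echo "used by vxassist during attach, plex and subdisk removed if incomplete" ;;
        IOFAIL)   echo "failure detected, excluded from recovery selection at volume start" ;;
        *)        echo "not described in this note"; return 1 ;;
    esac
}

plex_state_desc STALE
```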

vxvm:vxvol reports error when trying to start a volume

Problem

vxvm:vxvol reports error when trying to start a volume.

Error

vxvm:vxvol: ERROR: Volume has no CLEAN or non-volatile ACTIVE plexes

Solution

Description:
=========
Starting a volume reports the error above. The vxprint output shows that the plexes for the volume are in "DISABLED RECOVER" state.
Solution:
=======
The following commands must be run on a plex to change the state of the plex to "CLEAN". The volume can then be started, but an fsck may be required before mounting the file system.

# vxmend -o force off <plex>
# vxmend on <plex>
# vxmend fix clean <plex>
# vxvol start <volume>
# fsck -F vxfs /dev/vx/rdsk/<diskgroup>/<volume>
# mount -F vxfs /dev/vx/dsk/<diskgroup>/<volume> /mountpoint


Here is an example:
The disk group dg01 has 2 volumes, apps and home. Trying to start all the volumes reported the following error:
# vxvol -g dg01 startall
vxvm:vxvol: ERROR: Volume home has no CLEAN or non-volatile ACTIVE plexes

# vxprint -g dg01 -th     <== Showed the following
...
dg dg01 2 2 123000 1021305687.1295.obp1

dm appsdisk c0t1d0s2 sliced 11555 71112735 -
dm appsmirror c1t1d0s2 sliced 11555 71112735 -
dm homedisk c2t0d0s2 sliced 14135 35349424 -
dm homemirror c3t0d0s2 sliced 14135 35349424 -

v apps - ENABLED ACTIVE 70840320 SELECT - fsgen
pl apps-01 apps ENABLED ACTIVE 70841169 CONCAT - RW
sd appsdisk-01 apps-01 appsdisk 0 70841169 0 c0t1d0s2 ENA
pl apps-02 apps ENABLED ACTIVE 70841169 CONCAT - RW
sd appsmirror-01 apps-02 appsmirror 0 70841169 0 c1t1d0s2 ENA

v home - DISABLED ACTIVE 16896000 SELECT - fsgen
pl home-01 home DISABLED RECOVER 16897232 CONCAT - RW
sd homedisk-01 home-01 homedisk 0 16897232 0 c2t0d0 RLOC
pl home-02 home DISABLED RECOVER 16897232 CONCAT - RW
sd homemirror-01 home-02 homemirror 0 16897232 0 c3t0d0 ENA

The following commands need to be run on one of the plexes before trying to start the volume 'home':
# vxmend -o force off home-01
# vxmend on home-01
# vxmend fix clean home-01
The volume will then start successfully using the cleaned plex (the second plex, 'home-02', will automatically resync from plex 'home-01'):

# vxvol start home

Note:
It may be necessary to run fsck on the file system before mounting it:

# fsck -F vxfs /dev/vx/rdsk/<diskgroup>/<volume>
# mount -F vxfs /dev/vx/dsk/<diskgroup>/<volume> /mountpoint
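
The whole sequence can be generated for review before anything is run. The sketch below only prints the commands from this note with the names filled in; the function name print_recovery_plan and the mount point /home are assumptions made for the example.

```shell
#!/bin/sh
# Print the recovery sequence for one plex; nothing is executed.
# Arguments: disk group, volume, plex to mark CLEAN, mount point.
# print_recovery_plan is a hypothetical helper name.
print_recovery_plan() {
    if [ $# -ne 4 ]; then
        echo "usage: print_recovery_plan <diskgroup> <volume> <plex> <mountpoint>" >&2
        return 2
    fi
    dg=$1 vol=$2 plex=$3 mnt=$4
    cat <<EOF
vxmend -g $dg -o force off $plex
vxmend -g $dg on $plex
vxmend -g $dg fix clean $plex
vxvol -g $dg start $vol
fsck -F vxfs /dev/vx/rdsk/$dg/$vol
mount -F vxfs /dev/vx/dsk/$dg/$vol $mnt
EOF
}

# /home is an assumed mount point for the example volume.
print_recovery_plan dg01 home home-01 /home
```

Printing first makes it easy to confirm that the plex chosen for CLEAN is the one holding the good copy of the data.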

How to recover and start a VXVM volume where the volume is DISABLED ACTIVE and has a plex that is DISABLED RECOVER

Problem

How to recover and start a Veritas Volume Manager logical volume where the volume is DISABLED ACTIVE and has a plex that is DISABLED RECOVER

Solution

When a system encounters a problem with a volume or a plex, or if Veritas Volume Manager (VxVM) has any reason to believe that the data is not synchronized, VxVM changes the kernel state (KSTATE) and state (STATE) of the volume and its plexes accordingly. The plex state can be STALE, EMPTY, NODEVICE, and so on. A particular plex state does not necessarily mean that the data is good or bad; it represents VxVM's perception of the data in a plex.

The vxprint utility, run with the "-h" and "-t" switches (see the vxprint man page for these and all other switches), displays information from records in VxVM disk group configurations, including the KSTATE and STATE of each volume and plex, shown in columns 4 and 5 respectively of the output below. When the KSTATE and STATE fields display DISABLED ACTIVE for the volume and DISABLED RECOVER for the plex, recovery steps must be followed to bring the volume back to the ENABLED ACTIVE state so that it can be mounted and the file system made accessible again.

From the below output, it can be seen that the KSTATE and STATE for the volume test is DISABLED ACTIVE and its plex test-01 is DISABLED RECOVER.

# vxprint -ht -g testdg
 
DG NAME NCONFIG NLOG MINORS GROUP-ID    
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE  
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
V NAME RVG KSTATE STATE LENGTH USETYPE PREFPLEX RDPOL
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
               

                 
dg testdg default default 84000 970356463.1203.alu      
                 
dm testdg01 c1t4d0s2 sliced 2179 8920560 -    
dm testdg02 c1t6d0s2 sliced 2179 8920560 -    
                 
v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED RECOVER 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA
                 


Follow these steps to change KSTATE and STATE of a plex that is DISABLED RECOVER to ENABLED ACTIVE so the volume can be recovered / started and the file system mounted:

1. Change the plex test-01 to the DISABLED STALE state:
# vxmend -g <diskgroup> fix stale <plex_name>

For example:
# vxmend -g testdg fix stale test-01

This output shows the plex test-01 as DISABLED STALE:
# vxprint -ht -g testdg
 
DG NAME NCONFIG NLOG MINORS GROUP-ID      
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE    
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL  
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK  
V NAME RVG KSTATE STATE LENGTH USETYPE PREFPLEX RDPOL
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
                 
dg testdg default default 84000 970356463.1203.alu      
                 
dm testdg01 c1t4d0s2 sliced 2179 8920560 -    
dm testdg02 c1t6d0s2 sliced 2179 8920560 -    
                 
v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED STALE 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA


2. Change the plex test-01 to the DISABLED CLEAN state:
# vxmend -g <diskgroup> fix clean <plex_name>

For example:
# vxmend -g testdg fix clean test-01

This output shows the plex test-01 as DISABLED CLEAN:
# vxprint -ht -g testdg
 
DG NAME NCONFIG NLOG MINORS GROUP-ID      
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE    
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL  
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK  
V NAME RVG KSTATE STATE LENGTH USETYPE PREFPLEX RDPOL
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
                 
dg testdg default default 84000 970356463.1203.alu      
                 
dm testdg01 c1t4d0s2 sliced 2179 8920560 -    
dm testdg02 c1t6d0s2 sliced 2179 8920560 -    
                 
v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED CLEAN 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA


3. Start the volume test:
# vxvol -g <diskgroup> start <volume>

For example:
# vxvol -g testdg start test

This output shows that the volume test and its plex test-01 are both ENABLED ACTIVE:
# vxprint -ht -g testdg
 
DG NAME NCONFIG NLOG MINORS GROUP-ID      
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE    
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL  
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK  
V NAME RVG KSTATE STATE LENGTH USETYPE PREFPLEX RDPOL
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
                 
dg testdg default default 84000 970356463.1203.alu      
                 
dm testdg01 c1t4d0s2 sliced 2179 8920560 -    
dm testdg02 c1t6d0s2 sliced 2179 8920560 -    
                 
v test - ENABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test ENABLED ACTIVE 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA


4. Mount the volume to its associated mount point (refer to the /etc/vfstab file if the mount point location is not known) if the file system is a Veritas File System (VxFS) file system:
# mount -F vxfs /dev/vx/dsk/<diskgroup>/<volume> /<mount_point>

For example:
# mount -F vxfs /dev/vx/dsk/testdg/test /testvol

Note: An error may be generated stating that the file system needs to be checked for consistency. If this occurs, run the VxFS-specific fsck utility (/usr/lib/fs/vxfs/fsck). By default it replays the intent log instead of performing a full structural file system check, which is usually sufficient to set the file system to CLEAN and allow the volume to be mounted.
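
Before mounting, the result of the volume start can be checked from the vxprint -ht output rather than by eye. The sketch below assumes the column layout shown above; the function name volume_is_active is hypothetical.

```shell
#!/bin/sh
# Succeed only if the named volume and all of its plexes are ENABLED ACTIVE.
# Reads `vxprint -ht` output on stdin; volume_is_active is a hypothetical name.
volume_is_active() {
    awk -v vol="$1" '
        $1 == "v"  && $2 == vol { seen = 1; if ($4 != "ENABLED" || $5 != "ACTIVE") bad = 1 }
        $1 == "pl" && $3 == vol { if ($4 != "ENABLED" || $5 != "ACTIVE") bad = 1 }
        END { if (seen && !bad) exit 0; exit 1 }'
}

# Sample input copied from the final listing in this note.
if volume_is_active test <<'EOF'
v test - ENABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test ENABLED ACTIVE 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
EOF
then
    echo "volume test is ready to mount"
fi
```

The function also fails when the volume record is not found at all, so a mistyped volume name is not reported as healthy.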

Procedure to create a Dual bootable rootdisk and broken mirror (OS sliced disk) on Linux

Problem

Procedure to create a Dual bootable rootdisk and broken mirror (OS sliced disk) on Linux

Solution

Procedure to create a Dual bootable rootdisk and broken mirror (OS sliced disk) on Linux.
Note: The purpose of the root disk encapsulation and mirroring process is to maintain redundancy in case of a disk failure. We do not test manual unencapsulation of a root mirror disk to make it bootable on disk slices for backup purposes. The procedure below allows such an operation. It is up to the administrator to test and verify the procedure.
*Verify what the current encapsulated bootdisk is.
 # /etc/vx/bin/vxgetrootdisk
 sda
 
**Next, run the grub utility to check which hard disks contain menu.lst. This will be important when modifying the grub menu.lst.

# echo "find /boot/grub/menu.lst" | grub
 
grub> find /boot/grub/menu.lst
 
(hd0,0)
 
(hd1,0) <<<< This is the mirror.
 
* Current vxprint after mirroring process is completed.
 
# vxprint -g rootdg -ht
dg rootdg default default 0 1240923974.32.sprs1950a0-27
 
dm rootdisk sda auto 65535 143363997 -
 
dm rootmirr sdb auto 65535 143363997 -
 
sd meta-rootdisk-04 - rootdisk 65529072 63 METADATA sda ENA
 
sd meta-rootdisk-05 - rootdisk 69383333 63 METADATA sda ENA
 
sd meta-rootdisk-07 - rootdisk 69448932 63 METADATA sda ENA
 
sd mirmeta-rootdisk-04 - rootmirr 65529072 63 METADATA sdb ENA
 
sd mirmeta-rootdisk-05 - rootmirr 69383333 63 METADATA sdb ENA
 
sd mirmeta-rootdisk-07 - rootmirr 69448932 63 METADATA sdb ENA
 
sd mirrootdiskPriv - rootmirr 69383396 65536 PRIVATE sdb ENA
 
sd rootdiskPriv - rootdisk 69383396 65536 PRIVATE sda ENA
 
v datavol - ENABLED ACTIVE 73915002 ROUND - fsgen
 
pl datavol-01 datavol ENABLED ACTIVE 73915002 CONCAT - RW
 
sd rootdisk-03 datavol-01 rootdisk 69448995 73915002 0 sda ENA
 
pl mirdatavol-01 datavol ENABLED ACTIVE 73915002 CONCAT - RW
 
sd mirrootdisk-03 mirdatavol-01 rootmirr 69448995 73915002 0 sdb ENA
 
v rootvol - ENABLED ACTIVE 61432497 ROUND - root
 
pl mirrootvol-01 rootvol ENABLED ACTIVE 61432497 CONCAT - RW
 
sd mirrootdisk-02 mirrootvol-01 rootmirr 0 61432497 0 sdb ENA
 
pl rootvol-01 rootvol ENABLED ACTIVE 61432497 CONCAT - RW
 
sd rootdisk-02 rootvol-01 rootdisk 0 61432497 0 sda ENA
 
v swapvol - ENABLED ACTIVE 3854198 ROUND - swap
 
pl mirswapvol-01 swapvol ENABLED ACTIVE 3854198 CONCAT - RW
 
sd mirrootdisk-01 mirswapvol-01 rootmirr 65529135 3854198 0 sdb ENA
 
pl swapvol-01 swapvol ENABLED ACTIVE 3854198 CONCAT - RW
 
sd rootdisk-01 swapvol-01 rootdisk 65529135 3854198 0 sda ENA
 
*Note: Make sure to remove the correct plex/mirror. Removing the wrong plex will result in data loss and an unbootable disk.
 
*For this test example, remove the root mirror (sdb) plexes.
 
# vxplex -g rootdg -o rm dis mirswapvol-01
 
# vxplex -g rootdg -o rm dis mirrootvol-01
 
# vxplex -g rootdg -o rm dis mirdatavol-01
 
*Remove the meta objects:
 
# vxedit -g rootdg -rf rm mirmeta-rootdisk-04
 
# vxedit -g rootdg -rf rm mirmeta-rootdisk-05
 
# vxedit -g rootdg -rf rm mirmeta-rootdisk-07
 
# vxedit -g rootdg -rf rm mirrootdiskPriv
 
*After removal of mirrors and meta objects vxprint will look as follows.
 
dg rootdg default default 0 1240923974.32.sprs1950a0-27
 
dm rootdisk sda auto 65535 143363997 -
 
dm rootmirr sdb auto 65535 143363997 -
 
sd meta-rootdisk-04 - rootdisk 65529072 63 METADATA sda ENA
 
sd meta-rootdisk-05 - rootdisk 69383333 63 METADATA sda ENA
 
sd meta-rootdisk-07 - rootdisk 69448932 63 METADATA sda ENA
 
sd rootdiskPriv - rootdisk 69383396 65536 PRIVATE sda ENA
 
v datavol - ENABLED ACTIVE 73915002 ROUND - fsgen
 
pl datavol-01 datavol ENABLED ACTIVE 73915002 CONCAT - RW
 
sd rootdisk-03 datavol-01 rootdisk 69448995 73915002 0 sda ENA
 
v rootvol - ENABLED ACTIVE 61432497 ROUND - root
 
pl rootvol-01 rootvol ENABLED ACTIVE 61432497 CONCAT - RW
 
sd rootdisk-02 rootvol-01 rootdisk 0 61432497 0 sda ENA
 
v swapvol - ENABLED ACTIVE 3854198 ROUND - swap
 
pl swapvol-01 swapvol ENABLED ACTIVE 3854198 CONCAT - RW
 
sd rootdisk-01 swapvol-01 rootdisk 65529135 3854198 0 sda ENA
 
*Now remove the disk from the disk group. Note that after removal only one disk remains in disk group rootdg.
 
# vxdg -g rootdg rmdisk rootmirr
 
# vxdisk list|grep rootdg
 
sda auto:sliced rootdisk rootdg online
 
*Remove vxvm tags from disk.
 
# fdisk -ul /dev/sdb
 
Disk /dev/sdb: 73.4 GB, 73407820800 bytes
 
255 heads, 63 sectors/track, 8924 cylinders, total 143374650 sectors
 
Units = sectors of 1 * 512 = 512 bytes
 
Device Boot Start End Blocks Id System
 
/dev/sdb1 * 63 61432559 30716248+ 83 Linux
 
/dev/sdb2 63 143364059 71681998+ 7e Unknown <<<<<<Needs to be removed
 
/dev/sdb4 65529135 143364059 38917462+ 5 Extended
 
/dev/sdb5 65529198 69383395 1927099 82 Linux swap
 
/dev/sdb6 69383459 69448994 32768 7f Unknown <<<<<<Needs to be removed
 
/dev/sdb7 69449058 143364059 36957501 83 Linux
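
The partitions to delete can also be picked out of the fdisk listing by their Id column. This sketch assumes the fdisk -ul column layout shown above and matches the two Ids marked in that listing (7e and 7f); the function name vxvm_partitions is hypothetical.

```shell
#!/bin/sh
# Print the partitions whose Id is 7e or 7f, as marked in the listing above.
# Reads `fdisk -ul <disk>` output on stdin; vxvm_partitions is a hypothetical name.
vxvm_partitions() {
    awk '
        /^\/dev\// {
            id = ($2 == "*") ? $6 : $5    # the boot flag shifts the Id column by one
            if (id == "7e" || id == "7f") print $1, id
        }'
}

# Sample input copied from the listing in this note.
vxvm_partitions <<'EOF'
/dev/sdb1 * 63 61432559 30716248+ 83 Linux
/dev/sdb2 63 143364059 71681998+ 7e Unknown
/dev/sdb4 65529135 143364059 38917462+ 5 Extended
/dev/sdb5 65529198 69383395 1927099 82 Linux swap
/dev/sdb6 69383459 69448994 32768 7f Unknown
/dev/sdb7 69449058 143364059 36957501 83 Linux
EOF
```

Keep in mind that deleting a logical partition renumbers the ones after it, which is why the /data partition appears as /dev/sdb6 later in this procedure.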
 
* Use fdisk utility to remove private and public partitions belonging to VxVM.
 
# fdisk /dev/sdb
 
The number of cylinders for this disk is set to 8924.
 
There is nothing wrong with that, but this is larger than 1024, and could in certain setups cause problems with:
 
1) software that runs at boot time (e.g., old versions of LILO)
 
2) booting and partitioning software from other OSs
 
(e.g., DOS FDISK, OS/2 FDISK)
 
Command (m for help): p
 
Disk /dev/sdb: 73.4 GB, 73407820800 bytes
 
255 heads, 63 sectors/track, 8924 cylinders
 
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Device Boot Start End Blocks Id System
 
/dev/sdb1 * 1 3824 30716248+ 83 Linux
 
/dev/sdb2 1 8924 71681998+ 7e Unknown
 
/dev/sdb4 4080 8924 38917462+ 5 Extended
 
/dev/sdb5 4080 4319 1927099 82 Linux swap
 
/dev/sdb6 4319 4323 32768 7f Unknown
 
/dev/sdb7 4324 8924 36957501 83 Linux
 
Command (m for help): d
 
Partition number (1-7): 2
 
Command (m for help): d
 
Partition number (1-7): 6
 
Command (m for help): p
 
Disk /dev/sdb: 73.4 GB, 73407820800 bytes
 
255 heads, 63 sectors/track, 8924 cylinders
 
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Device Boot Start End Blocks Id System
 
/dev/sdb1 * 1 3824 30716248+ 83 Linux
 
/dev/sdb4 4080 8924 38917462+ 5 Extended
 
/dev/sdb5 4080 4319 1927099 82 Linux swap
 
/dev/sdb6 4324 8924 36957501 83 Linux
 
Command (m for help): w
 
The partition table has been altered!
 
Calling ioctl() to re-read partition table.
 
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
 
The kernel still uses the old table.
 
The new table will be used at the next reboot.
 
Syncing disks.
 
***Note: Because the device is busy in the kernel, a reboot is required to update to the new table.
 
*Note: Do not reboot yet. Continue with mount of mirrored disk.
 
**MODIFICATION TO BE DONE ON MIRRORED DISK SDB
 
# mount /dev/sdb1 /mnt
 
* Edit the contents of fstab with vi. It currently contains the VxVM root volume entries, which must be changed to reflect the /dev/sdb partitions.
 
* Current view of the /dev/sdb fstab:
 
# cat /mnt/etc/fstab
 
# This file is edited by fstab-sync - see 'man fstab-sync' for details
 
/dev/vx/dsk/bootdg/rootvol / ext3 defaults 1 1
 
none /dev/pts devpts gid=5,mode=620 0 0
 
none /dev/shm tmpfs defaults 0 0
 
none /proc proc defaults 0 0
 
none /sys sysfs defaults 0 0
 
/dev/vx/dsk/bootdg/swapvol swap swap defaults 0 0
 
/dev/vx/dsk/bootdg/datavol /data ext3 defaults 0 0
 
#NOTE: volume rootvol (/) encapsulated partition sda1
 
#NOTE: volume swapvol (swap) encapsulated partition sda5
 
#NOTE: volume datavol (/data) encapsulated partition sda7
 
/dev/hda /media/cdrecorder auto pamconsole,exec,noauto,managed 0 0
 
/dev/scd0 /media/cdrom auto pamconsole,exec,noauto,managed 0 0
 
*After modification of fstab.
 
# cat /mnt/etc/fstab
 
# This file is edited by fstab-sync - see 'man fstab-sync' for details
 
/dev/sdb1 / ext3 defaults 1 1
 
none /dev/pts devpts gid=5,mode=620 0 0
 
none /dev/shm tmpfs defaults 0 0
 
none /proc proc defaults 0 0
 
none /sys sysfs defaults 0 0
 
/dev/sdb5 swap swap defaults 0 0
 
/dev/sdb6 /data ext3 defaults 0 0
 
#NOTE: volume rootvol (/) encapsulated partition sda1
 
#NOTE: volume swapvol (swap) encapsulated partition sda5
 
#NOTE: volume datavol (/data) encapsulated partition sda7
 
/dev/hda /media/cdrecorder auto pamconsole,exec,noauto,managed 0 0
 
/dev/scd0 /media/cdrom auto pamconsole,exec,noauto,managed 0 0
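
The same fstab change can be sketched with sed instead of a manual edit. The volume-to-partition mapping below is the one from this example (rootvol to sdb1, swapvol to sdb5, datavol to sdb6) and must be adjusted for any other layout; the function name fstab_to_slices is hypothetical.

```shell
#!/bin/sh
# Rewrite the VxVM volume paths of this example to plain sdb partitions.
# Reads an fstab on stdin and writes the result to stdout.
# fstab_to_slices is a hypothetical helper name.
fstab_to_slices() {
    sed -e 's|^/dev/vx/dsk/bootdg/rootvol\([[:space:]]\)|/dev/sdb1\1|' \
        -e 's|^/dev/vx/dsk/bootdg/swapvol\([[:space:]]\)|/dev/sdb5\1|' \
        -e 's|^/dev/vx/dsk/bootdg/datavol\([[:space:]]\)|/dev/sdb6\1|'
}

# Sample input copied from the fstab shown in this note.
fstab_to_slices <<'EOF'
/dev/vx/dsk/bootdg/rootvol / ext3 defaults 1 1
none /proc proc defaults 0 0
/dev/vx/dsk/bootdg/swapvol swap swap defaults 0 0
/dev/vx/dsk/bootdg/datavol /data ext3 defaults 0 0
EOF
```

Writing to stdout keeps the original file untouched until the result has been reviewed.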
 
*Next, modify the grub menu.lst on sdb. Sample after modification:

# vi /mnt/boot/grub/menu.lst

# cat /mnt/boot/grub/menu.lst
 
# grub.conf generated by anaconda
 
#
 
# Note that you do not have to rerun grub after making changes to this file
 
# NOTICE: You do not have a /boot partition. This means that
 
# all kernel and initrd paths are relative to /, eg.
 
# root (hd0,0)
 
# kernel /boot/vmlinuz-version ro root=/dev/sda1
 
# initrd /boot/initrd-version.img
 
#boot=/dev/sda
 
#default=0
 
timeout=5
 
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
 
hiddenmenu
 
title Red Hat Enterprise Linux AS (2.6.9-42.ELsmp)
 
root (hd1,0)
 
kernel /boot/vmlinuz-2.6.9-42.ELsmp ro root=/dev/sdb1 elevator=deadline rhgb quiet
 
initrd /boot/initrd-2.6.9-42.ELsmp.img
 
title Red Hat Enterprise Linux AS-up (2.6.9-42.EL)
 
root (hd1,0)
 
kernel /boot/vmlinuz-2.6.9-42.EL ro root=/dev/sdb1 rhgb quiet
 
initrd /boot/initrd-2.6.9-42.EL.img
 
*Next, modify the grub menu.lst file on sda so that the system can boot from sdb.

**Note: This must be done from the sda point of view, as sda is the first disk on the SCSI chain.

# vi /boot/grub/menu.lst
 
*Current view of root disk menu.lst.
 
# cat /boot/grub/menu.lst
 
# grub.conf generated by anaconda
 
#
 
# Note that you do not have to rerun grub after making changes to this file
 
# NOTICE: You do not have a /boot partition. This means that
 
# all kernel and initrd paths are relative to /, eg.
 
# root (hd0,0)
 
# kernel /boot/vmlinuz-version ro root=/dev/sda1
 
# initrd /boot/initrd-version.img
 
#boot=/dev/sda
 
#default=0
 
timeout=5
 
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
 
hiddenmenu
 
#vxvm_root_default_START ( do not remove)
 
# Default menu entry number has been set to vxvm_root.
 
# - the vxvm_root default entry number is: 2
 
# - the original default entry number is: 0
 
# - the selected default entry number is: 0
 
# - the original grub configuration is in: /boot/grub/menu.lst.b4vxvm
 
default=2
 
#vxvm_root_default_END ( do not remove)
 
title Red Hat Enterprise Linux AS (2.6.9-42.ELsmp)
 
root (hd0,0)
 
kernel /boot/vmlinuz-2.6.9-42.ELsmp ro root=/dev/sda1 elevator=deadline rhgb quiet
 
initrd /boot/initrd-2.6.9-42.ELsmp.img
 
title Red Hat Enterprise Linux AS-up (2.6.9-42.EL)
 
root (hd0,0)
 
kernel /boot/vmlinuz-2.6.9-42.EL ro root=/dev/sda1 rhgb quiet
 
initrd /boot/initrd-2.6.9-42.EL.img
 
#vxvm_root_START ( do not remove)
 
title vxvm_root
 
root (hd0,0)
 
kernel /boot/vmlinuz-2.6.9-42.ELsmp root=c700 ro elevator=deadline rhgb quiet
 
initrd /boot/VxVM_initrd.img
 
#vxvm_root_END ( do not remove)
 
*Modified menu.lst as follows:

*Note: It is important to set the default entry; otherwise, select the proper title to boot from at the boot screen.
 
# cat /boot/grub/menu.lst
 
# grub.conf generated by anaconda
 
#
 
# Note that you do not have to rerun grub after making changes to this file
 
# NOTICE: You do not have a /boot partition. This means that
 
# all kernel and initrd paths are relative to /, eg.
 
# root (hd0,0)
 
# kernel /boot/vmlinuz-version ro root=/dev/sda1
 
# initrd /boot/initrd-version.img
 
#boot=/dev/sda
 
#default=0
 
timeout=5
 
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
 
hiddenmenu
 
#vxvm_root_default_START ( do not remove)
 
# Default menu entry number has been set to vxvm_root.
 
# - the vxvm_root default entry number is: 2
 
# - the original default entry number is: 0
 
# - the selected default entry number is: 0
 
# - the original grub configuration is in: /boot/grub/menu.lst.b4vxvm
 
default=2
 
#vxvm_root_default_END ( do not remove)
 
title Red Hat Enterprise Linux AS (2.6.9-42.ELsmp)
 
root (hd0,0)
 
kernel /boot/vmlinuz-2.6.9-42.ELsmp ro root=/dev/sda1 elevator=deadline rhgb quiet
 
initrd /boot/initrd-2.6.9-42.ELsmp.img
 
title Red Hat Enterprise Linux AS-up (2.6.9-42.EL)
 
root (hd0,0)
 
kernel /boot/vmlinuz-2.6.9-42.EL ro root=/dev/sda1 rhgb quiet
 
initrd /boot/initrd-2.6.9-42.EL.img
 
#Dual boot disk for sdb device
 
title Red Hat Enterprise Linux AS bootdisk SDB(2.6.9-42.ELsmp)
 
root (hd1,0)
 
kernel /boot/vmlinuz-2.6.9-42.ELsmp ro root=/dev/sdb1 elevator=deadline
 
initrd /boot/initrd-2.6.9-42.ELsmp.img
 
#vxvm_root_START ( do not remove)
 
title vxvm_root
 
root (hd0,0)
 
kernel /boot/vmlinuz-2.6.9-42.ELsmp root=c700 ro elevator=deadline rhgb quiet
 
initrd /boot/VxVM_initrd.img
 
#vxvm_root_END ( do not remove)
 
*Finally, the system is now ready for reboot.

Note: The disk that gets booted depends on the default boot entry set in menu.lst. If the mirrored disk is chosen, the system boots from sliced partitions that are not under VxVM control; otherwise, as previously noted, select the correct title from the boot menu when prompted.

Hardware failure for both root disk and root mirror

Problem

Both disks in the rootdg were marked as failing. The rootvol could not be mirrored to another internal disk due to bad blocks, and an 'fsck' of the rootvol did not work either.

Error

VxVM ERROR V-5-2-440 Unexpected error:
VxVM vxckdiskrm ERROR V-5-1-10127 disassociating disk-media rootdisk: Record is associated.
Later, after the failed disks were removed and flasharchive was used to configure the new devices, the encapsulation attempt failed:
Enter disk name for c1t2d0 [,q,?] (default: rootdg01) rootdisk
A new disk group rootdg will be created and the disk device c1t2d0 will be encapsulated and added to the disk group with the disk name rootdisk.
Enter desired private region length [ ,q,?] (default: 2048)
VxVM ERROR V-5-2-338 The encapsulation operation failed with the following error:
VxVM vxencap ERROR V-5-2-213 It is not possible to encapsulate c1t3d0, for the following reason:
<VxVM vxslicer ERROR V-5-1-565 Disk contains overlapping partitions.> 

Cause

Both the rootdisk and rootmirror were failing and needed to be replaced.

Solution


After unencapsulating the rootdg, the customer booted from CD and used flasharchive to configure one of the unused internal disks.
He then ran into issues with the encapsulation, which pointed to the need to clean the device tree. The 'vxdisk -e list' command showed that the native device names on the right did not match the names on the left side. This confirmed the need for a device tree cleanup; for some reason the reconfiguration reboot had not cleaned this up.
Device tree cleanup:
# mv /etc/vx/array.info /etc/vx/array.info.old
# mv /etc/vx/disk.info /etc/vx/disk.info.old
# rm /dev/vx/dmp/*
# rm /dev/vx/rdmp/*
# rm /dev/dsk/*
# rm /dev/rdsk/*
# devfsadm -Cv
# vxconfigd -k
# vxdctl enable
-After the cleanup, the disk access names matched the OS native names (checked with 'vxdisk -e list').
-Encapsulation was successful.
After this was done, the mirroring of the rootdisk to the new rootmirror device was successful.

Removing encapsulation/volume manager temporarily, for troubleshooting boot issues

Problem

Removing encapsulation/volume manager temporarily, for troubleshooting boot issues.

Solution

There are certain situations where Sun or another vendor may need you to un-encapsulate the root disk to allow further troubleshooting of the boot process.
In this situation it is likely that we cannot boot from the primary hard disk, so we will need some other boot media available (net, cd).

Boot from your alternate media, and mount the root slice (usually slice 0)

# mount /dev/dsk/c0t0d0s0 /mnt

Now we will need to modify several files. First we will create an empty file called "install-db" in the /etc/vx/reconfig.d/state.d directory of our mounted root slice.
This will keep volume manager from starting at boot.

# touch /mnt/etc/vx/reconfig.d/state.d/install-db

Next we will need to see whether we have the files "vfstab.prevm" and "system.prevm".
These files are created when volume manager is installed and the rootdisk is encapsulated. They are copies of the originals, taken before VM modifies them.

For the vfstab.prevm, we will want to check that the devices we are booting from are slices (/dev/dsk/c0t0d0s0...) and not volume manager paths (/dev/vx/dmp...).

Note: If you do not have a vfstab.prevm, then you will need to change all of the volume manager paths to OS device paths manually in the vfstab. If you feel comfortable with doing this, you can proceed.
Otherwise you should contact support for assistance.

It should look similar to the following:
#device         device          mount           FS      fsck    mount   mount
#to mount       to fsck         point           type    pass    at boot options
#
fd      -       /dev/fd fd      -       no      -
/proc   -       /proc   proc    -       no      -
/dev/dsk/c0t0d0s1       -       -       swap    -       no      -
/dev/dsk/c0t0d0s0       /dev/rdsk/c0t0d0s0      /       ufs     1       no      -
/devices        -       /devices        devfs   -       no      -
sharefs -       /etc/dfs/sharetab       sharefs -       no      -
ctfs    -       /system/contract        ctfs    -       no      -
objfs   -       /system/object  objfs   -       no      -
swap    -       /tmp    tmpfs   -       yes     -
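
The check for leftover volume manager paths can be sketched as a small shell function. This is an illustration under assumptions: check_vfstab_no_vx is a hypothetical name, and it simply treats any non-comment line containing /dev/vx/ as a volume manager entry.

```shell
#!/bin/sh
# Sketch: print non-comment vfstab entries that still reference VxVM devices.
# Returns 0 if the file is clean, 1 if any /dev/vx/ path was found.
check_vfstab_no_vx() {
    vfstab=$1
    if grep -v '^#' "$vfstab" | grep '/dev/vx/'; then
        return 1
    fi
    return 0
}
```

Possible usage from the alternate boot media: check_vfstab_no_vx /mnt/etc/vfstab.prevm. Any lines printed are the ones that would still need an OS device path.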
We will want to save the current vfstab so that we can roll back to it if we fix the boot issues (and they are not VM related).

# cp /mnt/etc/vfstab /mnt/etc/vfstab.bak

Now we will rename the vfstab.prevm to vfstab:

# mv /mnt/etc/vfstab.prevm /mnt/etc/vfstab

We will do the same to the system file. If you do not have a system.prevm, you can comment out the following lines with an asterisk:
rootdev:/pseudo/vxio@0:0
set vxio:vol_rootdev_is_volume=1

They should look like this once commented out:

*rootdev:/pseudo/vxio@0:0
*set vxio:vol_rootdev_is_volume=1
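
If there is no system.prevm, the two lines can be commented out with sed instead of by hand. A minimal sketch, assuming the lines appear exactly as shown above; comment_out_vx_rootdev is a hypothetical helper name. It writes to a second file, so the original stays untouched until you decide to copy the result into place.

```shell
#!/bin/sh
# Sketch: write a copy of an /etc/system file with the two VxVM root-device
# lines commented out with a leading asterisk. Lines that are already
# commented are left alone because the patterns are anchored at line start.
comment_out_vx_rootdev() {
    src=$1
    dst=$2
    sed -e 's|^rootdev:/pseudo/vxio@0:0|*&|' \
        -e 's|^set vxio:vol_rootdev_is_volume=1|*&|' "$src" > "$dst"
}
```

Possible usage: comment_out_vx_rootdev /mnt/etc/system /mnt/etc/system.novx, then review the new file before copying it over /mnt/etc/system (system.novx is just an example name).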

Make a backup of the system file, and rename system.prevm:

# cp /mnt/etc/system /mnt/etc/system.bak
# mv /mnt/etc/system.prevm /mnt/etc/system


Once the system.prevm is renamed to "system", we are ready to reboot without volume manager involvement.

In the event that the boot issues are resolved, we can restore the backups that we made of /etc/system and /etc/vfstab.
Then we can remove the /etc/vx/reconfig.d/state.d/install-db file, and reboot.

The system will boot encapsulated, as before, and start volume manager at boot time.
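
The rollback described above can be sketched as one function. It assumes the .bak copies were made with the cp commands shown earlier; restore_vx_boot_files is a hypothetical name, and the optional argument is the root directory to work under (/ on the running system, /mnt when booted from alternate media).

```shell
#!/bin/sh
# Sketch: undo the temporary un-encapsulation changes under a given root.
# Restores vfstab and system from their .bak copies, then removes install-db
# so that volume manager starts again at the next boot.
restore_vx_boot_files() {
    root=${1:-/}
    cp "$root/etc/vfstab.bak" "$root/etc/vfstab" || return 1
    cp "$root/etc/system.bak" "$root/etc/system" || return 1
    rm -f "$root/etc/vx/reconfig.d/state.d/install-db"
}
```

Possible usage: restore_vx_boot_files /mnt, followed by a reboot. The function stops before removing install-db if either backup is missing.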

EMC Symmetrix / DMX SRDF Setup


For this setup, let’s use two different hosts: our local host will have the R1 (Source) volumes and our remote host will have the R2 (Target) volumes.
A mix of R1 and R2 volumes can reside on the same Symmetrix. In short, you can configure SRDF between two Symmetrix machines so that each one acts as local for some volumes and as remote for others.

Step 1
Create SYMCLI Device Groups. Each group can have one or more Symmetrix devices specified in it.
SYMCLI device group information (name of the group, type, members, and any associations) is maintained in the SYMAPI database.
In the following we will create a device group that includes two SRDF volumes.
SRDF operations can be performed from the local host that has access to the source volumes or the remote host that has access to the target volumes. Therefore, both hosts should have device groups defined. 
Complete the following steps on both the local and remote hosts.
a) Identify the SRDF source and target volumes available to your assigned hosts. Execute the following commands on both the local and remote hosts.
# symrdf list pd (execute on both local and remote hosts)
or
# syminq
b) To view all the RDF volumes configured in the Symmetrix use the following
# symrdf list dev
c) Display a synopsis of the symdg command and reference it in the following steps.
# symdg -h
d) List all device groups that are currently defined.
# symdg list
e) On the local host, create a device group of the type of RDF1. On the remote host, create a device group of the type RDF2.
# symdg -type RDF1 create newsrcdg (on local host)
# symdg -type RDF2 create newtgtdg (on remote host)
f) Verify that your device group was added to the SYMAPI database on both the local and remote hosts.
# symdg list
g) Add your two devices to your device group using the symld command. Again use (-h) for a synopsis of the command syntax.
On local host:
# symld -h
# symld -g newsrcdg add dev ###
or
# symld -g newsrcdg add pd Physicaldrive#
On remote host:
# symld -g newtgtdg add dev ###
or
# symld -g newtgtdg add pd Physicaldrive#
h) Using the syminq command, identify a gatekeeper device. Determine whether it is currently defined in the SYMAPI database; if not, define it. Then associate it with your device group.
On local host:
# syminq
# symgate list (Check SYMAPI)
# symgate define pd Physicaldrive# (to define)
# symgate -g newsrcdg associate pd Physicaldrive# (to associate)
On remote host:
# syminq
# symgate list (Check SYMAPI)
# symgate define pd Physicaldrive# (to define)
# symgate -g newtgtdg associate pd Physicaldrive# (to associate)
i) Display your device groups. The output is verbose so pipe it to more.
On local host:
# symdg show newsrcdg | more
On remote host:
# symdg show newtgtdg | more
j) Display a synopsis of the symld command.
# symld -h
k) Rename DEV001 to NEWVOL1
On local host:
# symld -g newsrcdg rename DEV001 NEWVOL1
On remote host:
# symld -g newtgtdg rename DEV001 NEWVOL1
l) Display the device group on both the local and remote hosts.
On local host:
# symdg show newsrcdg | more
On remote host:
# symdg show newtgtdg | more
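
Because the local and remote hosts run the same commands with a different group type and name, steps e) and g) can be generated per host. The sketch below only prints the commands so they can be reviewed before anything is run. print_dg_commands is a hypothetical helper, and the device numbers in the usage example are made up.

```shell
#!/bin/sh
# Sketch: print (not run) the device group commands from steps e) and g).
# role is "local" (RDF1 group) or "remote" (RDF2 group); the remaining
# arguments are the Symmetrix device numbers to add to the group.
print_dg_commands() {
    role=$1
    group=$2
    shift 2
    case $role in
        local)  dgtype=RDF1 ;;
        remote) dgtype=RDF2 ;;
        *) echo "role must be local or remote" >&2; return 1 ;;
    esac
    echo "symdg -type $dgtype create $group"
    for dev in "$@"; do
        echo "symld -g $group add dev $dev"
    done
}
```

Possible usage: print_dg_commands local newsrcdg 0A1 0A2 on the local host and print_dg_commands remote newtgtdg 0A1 0A2 on the remote host, where 0A1 and 0A2 stand in for your own device numbers.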

Step 2
Use the SYMCLI to display the status of the SRDF volumes in your device group.
a) If on the local host, check the status of your SRDF volumes using the following command:
# symrdf -g newsrcdg query



Step 3
Set the default device group through the SYMCLI_DG environment variable.
# set SYMCLI_DG=newsrcdg (on the local host)
# set SYMCLI_DG=newtgtdg (on the remote host)
a) Check the SYMCLI environment.
# symcli -def (on both the local and remote hosts)
b) Test to see if the SYMCLI_DG environment variable is working properly by performing a “query” without specifying the device group.
# symrdf query (on both the local and remote hosts)
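
The "set NAME=value" form is Windows command prompt syntax. On a Unix host running the Bourne or Korn shell, the variable has to be assigned and exported so that the SYMCLI commands inherit it. A minimal sketch for the local host follows (use newtgtdg on the remote host). The child sh process is only a stand-in that shows the variable is inherited.

```shell
#!/bin/sh
# Bourne/Korn shell form: assign, then export so child processes see it.
# (In csh or tcsh the equivalent is: setenv SYMCLI_DG newsrcdg)
SYMCLI_DG=newsrcdg
export SYMCLI_DG

# A child shell stands in for a SYMCLI command to confirm inheritance.
sh -c 'echo "default device group: $SYMCLI_DG"'
```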

Step 4 
Changing Operational mode. The operational mode for a device or group of devices can be set dynamically with the symrdf set mode command.
a) On the local host, change the mode of operation for one of your SRDF volumes to enable semi-synchronous operations. Verify results and change back to synchronous mode.
# symrdf set mode semi NEWVOL1
# symrdf query
# symrdf set mode sync NEWVOL1
# symrdf query
b) Change mode of operation to enable adaptive copy-disk mode for all devices in the device group. Verify that the mode change occurred and then disable adaptive copy.
# symrdf set mode acp_disk
# symrdf query
# symrdf set mode acp_off
# symrdf query


Step 5
Check the communications link between the local and remote Symmetrix.
a) From the local host, verify that the remote Symmetrix is “alive”. If the host is attached to multiple Symmetrix arrays, you may have to specify the Symmetrix Serial Number (SSN) through the -sid option.
# symrdf ping [ -sid xx ] (xx=last two digits of the remote SSN)
b) From the local host, display the status of the Remote Link Directors.
# symcfg -RA all list
c) From the local host, display the activity on the Remote Link Directors.
# symstat -RA all -i 10 -c 2

Step 6 
Create a partition on each disk, format the partition and assign a filesystem to the partition. Add data on the R1 volumes defined in the newsrcdg device group. 

Step 7 
Suspend RDF Link and add data to filesystem. In this step we will suspend the SRDF link, add data to the filesystem and check for invalid tracks.
a) Check that the R1 and R2 volumes are fully synchronized.
# symrdf query
b) Suspend the link between the source and target volumes.
# symrdf suspend
c) Check link status.
# symrdf query
d) Add data to the filesystems.
e) Check for invalid tracks using the following command:
# symrdf query
f) Invalid tracks can also be displayed using the symdev show command. Execute the following command on one of the devices in your device group. Look at the Mirror set information.
On the local host:
# symdev show ###
g) From the local host, resume the link and monitor invalid tracks.
# symrdf resume
# symrdf query
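
Rather than re-running the query by hand while the invalid tracks drain, a check can be repeated on a timer. The helper below is a generic sketch, not part of SYMCLI: poll_until is a hypothetical name, and the command it repeats is whatever check you trust to return success once the pair is synchronized.

```shell
#!/bin/sh
# Sketch: re-run a check command until it succeeds or the attempts run out.
# usage: poll_until <max_attempts> <sleep_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if it never does.
poll_until() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Only sleep when another attempt is still to come.
        if [ "$attempt" -le "$max" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

One possible check command is symrdf -g newsrcdg verify -synchronized, for example poll_until 30 20 symrdf -g newsrcdg verify -synchronized. That assumes your SYMCLI release provides the verify action, so confirm with symrdf -h first.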