Migrating physical Linux systems from SVC to direct-attached Hitachi storage

This document describes how to migrate physical Linux systems from SVC to direct-attached Hitachi storage. The migration requires at least one reboot and takes about one hour if everything goes right. For now we only migrate the RHEL7 and RHEL6 machines. Because we did not receive any settings from Hitachi, we use the Linux multipath defaults; for now no extra configuration is needed in multipath.conf.

We have already moved some systems and ran into a few problems, so this document contains extra steps and checks. More will be added if we run into further problems.

Step-by-step guide

Before you start, make sure you can access the console (iLO), because we need to make changes in the BIOS. Also change the root password of the system (after making a backup of the original). This will save you a lot of time if you end up with a system that fails to boot.

Arrange downtime with the customer. Tell them it will take at least one hour, but that it can take longer if there is a problem (multiple reboots; it also depends on the amount of storage).

We can go back to the old disks, but we may not get the same disk IDs we had before.

Before we start, the storage team migrates the old disks online to the new disks and keeps them in sync. When the disks are in sync, storage mails us the new disk IDs.

  1. Connect to the iLO.
  2. Save the original root password and set a temporary one.

    reset root password
    # head -1 /etc/shadow > /root/root.pw
    # passwd

    When the migration is finished, restore the original password by replacing the first line of /etc/shadow with the line saved in /root/root.pw.
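
The save and restore from step 2 can be scripted so that the root entry is matched by name instead of relying on it being the first line. This is a minimal sketch, not part of the original procedure: the function names and the SHADOW and BACKUP variables are made up for this example, and it is worth trying on a copy of the shadow file first.

```shell
# Hypothetical helpers; SHADOW and BACKUP can be overridden to test on a copy.
SHADOW=${SHADOW:-/etc/shadow}
BACKUP=${BACKUP:-/root/root.pw}

# Save the root line of the shadow file (matched by name, not by position).
save_root_hash() {
    ( umask 077; grep '^root:' "$SHADOW" > "$BACKUP" )
}

# Put the saved root line back in place, leaving all other accounts untouched.
restore_root_hash() {
    [ -s "$BACKUP" ] || { echo "no saved root line in $BACKUP" >&2; return 1; }
    tmp=$(mktemp "$SHADOW.XXXXXX") || return 1
    # First file: remember the saved line. Second file: swap it in for root.
    awk -F: 'NR == FNR { saved = $0; next }
             $1 == "root" { print saved; next }
             { print }' "$BACKUP" "$SHADOW" > "$tmp" &&
        cat "$tmp" > "$SHADOW"   # writing through cat keeps owner and mode
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

Run save_root_hash before passwd and restore_root_hash as the last step of the migration.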

  3. Change the multipath config files, using the IDs you got in the mail from storage. The LUN IDs can be found in 3 files:
    /etc/multipath.conf
    /etc/multipath/bindings
    /etc/multipath/wwids
    In the mail you will find the IDs; it will look similar to the example below.

    Email storage lunid

    Volume Name   SVC ID                            HDS ID

    LSRV6226
    LSRV6226_000  600507680180867D30000000000004F4  60060e8007c5fb000030c5fb0000003a
    LSRV6226_001  600507680180867D30000000000004F5  60060e8007c5fb000030c5fb0000003b
    LSRV6226_002  600507680180867D3000000000000513  60060e8007c5fb000030c5fb0000003c

    If you look at the files, you will notice that the number storage gave us is one character short: the leading 3 is missing. Leave this first character in place (it is always a 3) and make sure you do not replace the alias names. Determine the ID of the boot disk and make sure you get that one right. It is usually mpatha, but it can be different. The boot disk is the one that holds /boot; it has at least 2 partitions: /boot and the PV for vg.root.
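
The rule about the leading 3 can be captured in a small helper. This is a sketch under assumptions: the function name to_wwid is made up, and lowercasing the ID is an assumption based on multipath normally reporting WWIDs in lowercase hex (the HDS IDs in the example mail are lowercase already).

```shell
# Hypothetical helper: turn a LUN ID from the storage mail into a multipath WWID.
to_wwid() {
    id=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case $id in
        ''|*[!0-9a-f]*) echo "not a hex LUN ID: $1" >&2; return 1 ;;
    esac
    # The IDs in the example mail are 32 hex digits long.
    [ "${#id}" -eq 32 ] || { echo "expected 32 hex digits: $1" >&2; return 1; }
    # multipath uses the same ID with a 3 in front of it.
    printf '3%s\n' "$id"
}

# HDS IDs taken from the example mail.
for hds in 60060e8007c5fb000030c5fb0000003a \
           60060e8007c5fb000030c5fb0000003b \
           60060e8007c5fb000030c5fb0000003c; do
    to_wwid "$hds"
done
```

The printed values are the strings that should end up behind mpatha, mpathb and mpathc in the bindings file.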

    example diff /etc/multipath/bindings

    bindings file before                                  bindings file after

    # alias wwid                                          # alias wwid
    #                                                     #
    mpatha 3600507680180867D30000000000004F4              mpatha 360060e8007c5fb000030c5fb0000003a
    mpathb 3600507680180867D30000000000004F5              mpathb 360060e8007c5fb000030c5fb0000003b
    mpathc 3600507680180867D3000000000000513              mpathc 360060e8007c5fb000030c5fb0000003c

    Do the same for the wwids file and for multipath.conf. On some RHEL7 systems there are no disk IDs in multipath.conf. That is not a problem: no blacklist is configured there, so all disks are visible by default. On RHEL6 there is a blacklist, so we need to change the blacklist_exceptions list.
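
Editing three files by hand is error-prone, so the replacement in step 3 could be done with sed. The following is only a sketch: swap_id and leftover_ids are made-up names, the .svc backup suffix is an assumption, and the I flag used for case-insensitive matching is a GNU sed extension.

```shell
# Hypothetical helper: replace one SVC ID by its HDS ID in the given files.
# Only the 32 hex digits are matched, so the leading 3 and the alias in front
# of the WWID stay untouched. A copy of each file is kept as FILE.svc.
swap_id() {
    old=$1
    new=$2
    shift 2
    for f in "$@"; do
        [ -f "$f" ] || continue
        [ -f "$f.svc" ] || cp -p "$f" "$f.svc"
        sed -i "s/$old/$new/Ig" "$f"
    done
}

# Print the lines that still mention the old ID (case-insensitive).
# Exit status 0 means nothing was left behind.
leftover_ids() {
    old=$1
    shift
    if grep -i -n -- "$old" "$@" /dev/null; then
        return 1
    fi
}
```

Call swap_id once per row of the mail, with the SVC ID, the HDS ID and the three files as arguments, and run leftover_ids with the SVC ID afterwards to confirm that no old ID remains.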

  4. After the change we need to create a new initramfs. The easiest way to do this is with dracut. Use the following extra options:

    dracut
    # dracut -f --add multipath --include /etc/multipath /etc/multipath
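
A quick way to see whether the multipath files ended up in the new image is to look at its file listing. The sketch below assumes the listing comes from lsinitrd (shipped with dracut); the function name has_multipath_files is made up, and the list of expected files is simply the three files from step 3.

```shell
# Hypothetical helper: reads an initramfs file listing on stdin and reports
# which of the three multipath files are missing. Exit status 0: all present.
has_multipath_files() (
    listing=$(cat)
    rc=0
    for f in etc/multipath.conf etc/multipath/bindings etc/multipath/wwids; do
        # Match the path at the end of a listing line, after a space or a slash.
        if ! printf '%s\n' "$listing" | grep "[ /]$f\$" >/dev/null; then
            echo "missing in initramfs: $f"
            rc=1
        fi
    done
    exit "$rc"
)
```

On the system itself the listing of the image for the running kernel would be piped into has_multipath_files. The content of the files still has to be checked as described in the next step.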
  5. We had problems on some RHEL7 machines (we are not sure whether this is a bug): dracut uses its own generated /etc/multipath.conf, and because you are still running on the SVC boot disk, the file created in the initramfs contains a blacklist and an exception list for the SVC disk. We need to change this manually: unpack the initramfs, change etc/multipath.conf, repack the image and copy it to the right place.
    For RHEL7, use the following commands to unpack and check the initramfs.

    unpack initramfs RHEL7
    # mkdir /tmp/initram
    # cd /tmp/initram
    # /usr/lib/dracut/skipcpio /boot/initramfs-`uname -r`.img |gunzip -c | cpio -i -d
     
    # vi etc/multipath.conf       
    blacklist
    {
           wwid ".*"
    }
     
    blacklist_exceptions {
           wwid "3600507680180867d30000000000004f4"           
    }

    If you see that the disks are blacklisted and the exception list contains an ID, we need to change it. If there are no entries, or the entries are already right, you are good: leave the file as it is and shut down the system. If you need to change it, repack the initramfs and replace the image in /boot.
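
The check of the exception list can be automated by comparing it with the bindings file that was already updated in step 3. This is a rough sketch: both function names are made up, and the awk parser only handles the simple layout shown above (one wwid per line, no nested sections).

```shell
# Print every WWID listed in the blacklist_exceptions section of a multipath.conf.
exception_wwids() {
    awk '
        /^[[:space:]]*blacklist_exceptions/ { in_exc = 1 }
        in_exc && /wwid/ { gsub(/"/, "", $2); print tolower($2) }
        in_exc && /}/    { in_exc = 0 }
    ' "$1"
}

# Report exception WWIDs from the config ($1) that are not in the bindings
# file ($2). No output means the exception list matches the new disks.
unknown_exceptions() {
    exception_wwids "$1" | while read -r w; do
        grep -q -i -F -- "$w" "$2" || echo "not in bindings: $w"
    done
}
```

From inside /tmp/initram this would be called with etc/multipath.conf and /etc/multipath/bindings as arguments. Any line of output is an old ID that still has to be changed before repacking.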

    To repack the initramfs:

    repack initramfs RHEL7
    # find . 2>/dev/null | cpio --quiet -c -o | gzip -9 -c >"../new_initrd.img"
    # cp /tmp/new_initrd.img /boot/initramfs-`uname -r`.img


    For RHEL5/6 the multipath config is copied from the system into the initramfs when dracut runs, but an extra check never hurts. Unpacking the initramfs on RHEL5/6 is slightly different.
    Use the following commands.

    unpack initramfs RHEL5/6
    # mkdir /tmp/initram
    # cd /tmp/initram
    # zcat /boot/initramfs-`uname -r`.img| cpio -idmv
    # cat etc/multipath.conf
    # cat etc/multipath/bindings

    Most likely everything in the initramfs is okay and there is no need to repack the image. In case you had to make changes, use the following commands.

    repack initramfs RHEL5/6
    # find . | cpio -o -c | gzip -9 -c >"../new_initrd.img"
    # cp /tmp/new_initrd.img /boot/initramfs-`uname -r`.img
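
Both repack procedures overwrite the image in /boot directly. As an extra precaution, which is a suggestion and not part of the original procedure, the copy step could test the new image and keep the previous one. The function name install_initramfs and the .bak suffix are made up for this sketch.

```shell
# Hypothetical helper: install a repacked image, keeping the previous one.
install_initramfs() {
    new=$1
    target=$2
    # The repacked image is a plain gzip file, so a gzip test catches a
    # truncated or empty result before it replaces the working image.
    gzip -t "$new" || { echo "$new is not a valid gzip file" >&2; return 1; }
    # Keep the first original only; do not overwrite an existing backup.
    if [ ! -f "$target.bak" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi
    cp "$new" "$target"
}
```

It takes the new image (/tmp/new_initrd.img) and the image in /boot as arguments. Check the free space in /boot first, because the backup is as large as the image itself.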
  6. Shut down the system, tell storage that you did, and let them do their magic.

    If you want to do something useful with your time while waiting, get some coffee for the department. I like my coffee black.

  7. When storage is ready, they tell you to start the system. Go to the console, watch the BIOS startup and enter the fibre BIOS setup. This is different per system. (For BL gen9 systems, x86 has to change the boot LUN; for all other systems we can do it ourselves.) Somewhere in the startup sequence you will see the fibre setup and have to press ctrl-e or ctrl-a to enter the BIOS. On one system, I think the BL gen7, you first need to press space to see additional startup info, so watch closely. Once you are in the fibre BIOS, the screens look different between all the versions and brands.

    There should be some entries for the boot LUN setup.

    On most systems you will see two boot IDs filled in for each fibre adapter. That is because the SVC has 4 ports connected to the zone; with Hitachi it is only 2. So only fill in the first entry and clear the info from the second entry. It should be obvious in the BIOS setup.

    When the boot LUNs are filled in, leave the BIOS by pressing esc until you are asked whether to save and leave; press y.

    Remark

    I noticed that some DL gen9 machines do not reboot at this point. Wait a few seconds; if the machine does not reset by itself, help it by pressing the reset button.


    The machine should now restart from the Hitachi LUNs.

      

  8. You should see all mpath devices, and all LVs as well as /boot should be on an mpath device.

    check system
    df -h /boot
    Filesystem           Size  Used Avail Use% Mounted on
    /dev/mapper/mpatha1  240M  165M   59M  74% /boot
    sudo multipath -ll
    mpathc (360060e8007c5fb000030c5fb0000003c) dm-5 HITACHI ,OPEN-V
    size=1.0G features='0' hwhandler='0' wp=rw
    `-+- policy='service-time 0' prio=1 status=active
      |- 2:0:0:2  sdr 65:16 active ready running
      `- 1:0:0:2  sdo 8:224 active ready running
    mpathb (360060e8007c5fb000030c5fb0000003b) dm-6 HITACHI ,OPEN-V
    size=199G features='0' hwhandler='0' wp=rw
    `-+- policy='service-time 0' prio=1 status=active
      |- 1:0:0:1  sdn 8:208 active ready running
      `- 2:0:0:1  sdq 65:0  active ready running
    mpatha (360060e8007c5fb000030c5fb0000003a) dm-0 HITACHI ,OPEN-V
    size=30G features='0' hwhandler='0' wp=rw
    `-+- policy='service-time 0' prio=1 status=active
      |- 1:0:0:0  sdm 8:192 active ready running
      `- 2:0:0:0  sdp 8:240 active ready running
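
The two checks above can be scripted by filtering the command output. This is a sketch under assumptions: the function names are made up, the parsing relies on the multipath -ll layout shown above (a dm-N field on each map header line), and boot_on_mpath expects the one-line-per-filesystem format of df -P.

```shell
# Reads `multipath -ll` output on stdin and prints the header line of every
# map that is not served by a HITACHI array. Exit status 0: all maps are Hitachi.
non_hitachi_maps() {
    awk '/ dm-[0-9]+ / && !/HITACHI/ { print; bad = 1 } END { exit bad + 0 }'
}

# Reads `df -P /boot` output on stdin.
# Exit status 0 when /boot is mounted from a /dev/mapper/mpath* device.
boot_on_mpath() {
    awk 'NR == 2 { ok = ($1 ~ /^\/dev\/mapper\/mpath/) } END { exit !ok }'
}
```

The output of multipath -ll is piped into non_hitachi_maps and the output of df -P /boot into boot_on_mpath. Any line printed by the first function is a map that is still on the old storage.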
  9. Restore the original root password.

Help, the system is not booting

If the system is not booting, it can be all sorts of things. Good luck :)

If the system starts but the boot disk is not on an mpath device, try running dracut again and reboot.

dracut
# dracut -f --add multipath --include /etc/multipath /etc/multipath
# shutdown -r now

If the machine does not boot and hangs with a panic, you probably forgot to check and change etc/multipath.conf in the initramfs. Boot into rescue mode (third option in the GRUB menu), rerun the dracut command and do an extra reboot.
