Monday, November 26, 2012

Adding a larger drive to a software RAID array - mdadm and lvm

The MythTV box I have in the lounge previously had two storage hard drives, in a RAID 1 configuration to prevent data loss in case of a drive failure. The drives were a 3TB Hitachi and a 2TB Samsung. I figured the Samsung drive was getting on a bit, and it was time to install a replacement. It might as well be a 3TB model too, to take advantage of the space on the 3TB Hitachi that was sitting unused.

A 3TB Western Digital Red drive was picked up. I chose this as it is designed for use in a NAS environment: always on. It also has low power consumption and a good warranty. I considered a Seagate Barracuda 3TB - they were cheap, performance would be better than the Red, but they are only designed for desktop use, around 8 hours a day powered on. Warranty was pretty short as well.

Removing and replacing the old drive

The drives were configured in a software RAID 1 array, using mdadm, with lvm on top of that. This makes the array portable, and not dependent on a particular hardware controller.
The commands here were adapted from the excellent instructions at howtoforge.com.
Fortunately I had enough space on another PC that I was able to back up the contents of the array before starting any of this.
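In case it helps, that sort of backup can be done over the network with rsync; something along these lines would work (the destination host and path here are just placeholders, adjust to suit):

rsync -a --progress /var/lib/mythtv/ otherpc:/path/to/backup/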
To remove the old drive, which on this machine was /dev/sdc, the following command was issued to mark the drive as failed, in the array /dev/md0:

sudo mdadm --manage /dev/md0 --fail /dev/sdc

The next step is to remove the drive from the array:

sudo mdadm --manage /dev/md0 --remove /dev/sdc

Then, the system could be shut down and the drive removed and replaced with the new one. After powering the system back up, the following command adds the new drive to the array:

sudo mdadm --manage /dev/md0 --add /dev/sdc

The array will then start synchronising the data, copying it to the new drive, which could take a few hours. Note that no partitioning was done on the disk, as I am just using the whole drive in the array.

While the sync is in progress, you can check how it is progressing via:

cat /proc/mdstat

It will show a percentage of completion as well as an estimated time remaining. Once it is done, the array is ready for use! I left the array like this for a day or so, just to make sure everything was working alright.
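If you want the progress display to refresh itself, something like this does the trick:

watch cat /proc/mdstat

And if you have smartmontools installed, it doesn't hurt to glance at the new drive's SMART data while you wait, for example:

sudo smartctl -a /dev/sdc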

Expanding the array to fill the space available - the mdadm part

Once the synchronisation had completed, the size of the array was still only 2TB, since that is the largest a RAID 1 array can be when it consists of a 3TB and a 2TB drive. I needed to tell mdadm to expand the array to fill the available space. More information on this can be found here.

This is where things got complicated for me. It has to do with the superblock format version used in the array. More detail can be found on this page of the Linux RAID wiki.

To sum up, the array I had was created with the version 0.90 superblock. The version was found by entering

sudo mdadm --detail /dev/md0
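If you only want the superblock version rather than the full output, grepping for it works too:

sudo mdadm --detail /dev/md0 | grep Version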

The potential problem was that if I grew the array to larger than 2TB, it might not work. As quoted from the wiki link above:

The version-0.90 superblock limits the number of component devices within an array to 28, and limits each component device to a maximum size of 2TB on kernel version [earlier than] 3.1 and 4TB on kernel version 3.1 [or later].

Now, Mythbuntu 12.04 runs the 3.2 kernel, so according to that it should support up to 4TB. But I wasn't 100% sure, and couldn't find any other references confirming it. I decided the safest approach was to convert the array to a later superblock version that doesn't have that size limitation. Besides, it would save time in the future if I ever repeat this with a drive larger than 4TB.
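If you want to check which kernel your own system is running:

uname -r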

Following the suggestion of the wiki, I decided to update to a version 1.0 superblock, as it stores the superblock in the same place as version 0.90 (near the end of the device), so the data layout on the drives is unchanged.

Note: if you are trying this yourself and the array already has a version 1.0 or later superblock, then the command to grow it is simply the one below (you probably don't want to do this with a 0.90 superblock and a size larger than 2TB):

sudo mdadm --grow /dev/md0 --size=max

Since I was going to change the superblock version, I had to stop the array and recreate it with the newer version.

Once again, to check the details of the array at the moment:

sudo mdadm --detail /dev/md0

Now, since the array is in use by MythTV, I thought it safest to stop the program:

sudo service mythtv-backend stop

Also, I unmounted where the array was mounted:

sudo umount /var/lib/mythtv
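If the unmount complains that the target is busy, something like the following can show which processes still have files open there:

sudo fuser -vm /var/lib/mythtv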

Since the data is in an LVM volume on top of the array, I deactivated that as well (the volume group is named raid1 in this instance):

sudo lvchange -a n raid1

The array is now ready to be stopped:

sudo mdadm --stop /dev/md0

Now it can be re-created, specifying the metadata (superblock) version, the RAID level, and the number and names of the drives used:

sudo mdadm --create /dev/md0 --metadata=1.0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

The array will now start resynchronising. This took a number of hours for me, as there were around 770GB of recordings there. The RAID wiki link included --assume-clean in the above command, which would have skipped the resync. I elected to leave it out, for safety's sake.


Progress can be monitored with:

cat /proc/mdstat


The LVM volumes can be reactivated:

sudo lvchange -a y raid1

and the unmounted volumes can be re-mounted:

sudo mount -a

Check if they are all there with the

mount

command. The mythtv service can also be restarted:

sudo service mythtv-backend start

When the array is recreated, the UUID value of the array will be different. You can get the new value with:

sudo mdadm --detail /dev/md0

Edit the /etc/mdadm/mdadm.conf file, and change the UUID value in it to the new value. This will enable the array to be found on next boot.
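Rather than copying the UUID by hand, you can get mdadm to print a ready-made ARRAY line to paste into mdadm.conf:

sudo mdadm --detail --scan

Just replace the existing ARRAY line for /dev/md0 with the one this prints.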

Another thing to do before rebooting is to run

sudo update-initramfs -u

I didn't do this at first, and after rebooting, the array showed up named /dev/md127 rather than /dev/md0. Running the above command and rebooting again fixed it for me.

Expanding the array to fill the space available - the lvm part

Quite a long-winded process, isn't it? Running the LVM command that shows the physical volumes:

sudo pvdisplay

showed the physical volume on the array was still 1.82TiB (2TB). It needed to be extended. The following command resizes the physical volume to fill the available space:

sudo pvresize -v /dev/md0

To check the results, again run:

sudo pvdisplay

Now, running:

sudo vgdisplay

gave the following results for me:

  --- Volume group ---
  VG Name               raid1
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  5
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               2.73 TiB
  PE Size               4.00 MiB
  Total PE              715397
  Alloc PE / Size       466125 / 1.78 TiB
  Free  PE / Size       249272 / 973.72 GiB
  VG UUID               gvfheX-ifvl-yW9h-v4L2-eyzs-95fe-sng2oN

Running:

sudo lvdisplay

gives the following result:

--- Logical volume ---
  LV Name                /dev/raid1/tv
  VG Name                raid1
  LV UUID                Dokbch-ZJkg-QmRW-d9vR-wfM8-BFxb-3Z0krs
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                1.70 TiB
  Current LE             445645
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:0

I have a couple of smaller logical volumes in this volume group as well, which I have not shown. That's why there is a bit of a difference between the Alloc PE value in the volume group and the Current LE value in the logical volume. As the output above shows, the volume group raid1 has 249272 physical extents (PE) free, and the logical volume /dev/raid1/tv currently uses 445645 extents. To use all the space, I made the new size 249272+445645, which is 694917 extents.
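If you don't trust your mental arithmetic, the shell can check the sum for you:

echo $((249272 + 445645))

which prints 694917.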

The command to resize a logical volume is lvresize. Logical.

sudo lvresize -l 694917 /dev/raid1/tv

Alternatively, if you want to avoid the maths, you can use:

sudo lvresize -l +100%FREE /dev/raid1/tv

That command just tells LVM to use 100% of the free space. I didn't try it myself (I only found it after running the previous command).

Now, after that has been run, to check the results, enter:

sudo lvdisplay

The results:

--- Logical volume ---
  LV Name                /dev/raid1/tv
  VG Name                raid1
  LV UUID                Dokbch-ZJkg-QmRW-d9vR-wfM8-BFxb-3Z0krs
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                2.65 TiB
  Current LE             694917
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:0

and

sudo vgdisplay

gives:

 --- Volume group ---
  VG Name               raid1
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  6
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               2.73 TiB
  PE Size               4.00 MiB
  Total PE              715397
  Alloc PE / Size       715397 / 2.73 TiB
  Free  PE / Size       0 / 0   
  VG UUID               gvfheX-ifvl-yW9h-v4L2-eyzs-95fe-sng2oN

No free space shown; the lvm volume group is using the whole mdadm array, which in turn is using the whole of the two disks.

The final step for me was to grow the filesystem on the logical volume. I had formatted it with XFS, as it handles large video files well. XFS allows growing a mounted filesystem, so the command used was:

sudo xfs_growfs -d /var/lib/mythtv
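A quick way to confirm the extra space is visible is to check the size of the mounted filesystem:

df -h /var/lib/mythtv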

Finally, it is complete!
