Software RAID in Linux

Several occasions have arisen where a client requested software RAID-1 between two IDE drives in their server.

The servers in question had no hardware RAID capability, and the clients considered the added redundancy worth some cost in disk I/O performance.

Below is a simple tutorial for setting up software RAID in Linux using mdadm. The specific examples are from Debian, but the procedure applies to any Linux distribution.

Linux Software RAID-1

1. Verify that you are working with two identical hard drives:

      # cat /proc/ide/hd?/model
      ST320413A
      ST320413A
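A matching model string does not guarantee a matching capacity, so comparing the sizes the kernel reports is a cheap extra check. The helper below (`part_blocks` is my own name, not a standard tool) reads a /proc/partitions-style file so it can be tested offline; on the real system, pass /proc/partitions.

```shell
# Print the "#blocks" column for a given device name from a
# /proc/partitions-style file (columns: major minor #blocks name).
part_blocks() {   # usage: part_blocks <device-name> <partitions-file>
    awk -v dev="$1" '$4 == dev { print $3 }' "$2"
}

# On the real system:
# [ "$(part_blocks hda /proc/partitions)" = "$(part_blocks hdc /proc/partitions)" ] \
#     && echo "drive sizes match"
```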

2. Copy the partition table over to the second drive:

      dd if=/dev/hda of=/dev/hdc bs=1024k count=10

Edit the partition table of the second drive so that every partition except #3 (the Sun ‘Whole disk’ entry) has type ‘fd’ (Linux raid autodetect).

      # fdisk -l /dev/hda /dev/hdc

      Disk /dev/hda (Sun disk label): 16 heads, 63 sectors, 38792 cylinders
      Units = cylinders of 1008 * 512 bytes

         Device Flag    Start       End    Blocks   Id  System
      /dev/hda1             0       248    124992   83  Linux native
      /dev/hda2           248      2232    999936   83  Linux native
      /dev/hda3             0     38792  19551168    5  Whole disk
      /dev/hda4          2232     10169   4000248   83  Linux native
      /dev/hda5         10169     12153    999936   83  Linux native
      /dev/hda6         12153     18105   2999808   82  Linux swap
      /dev/hda7         18105     28026   5000184   83  Linux native
      /dev/hda8         28026     38792   5426064   83  Linux native

      Disk /dev/hdc (Sun disk label): 16 heads, 63 sectors, 38792 cylinders
      Units = cylinders of 1008 * 512 bytes

         Device Flag    Start       End    Blocks   Id  System
      /dev/hdc1             0       248    124992   fd  Linux raid autodetect
      /dev/hdc2           248      2232    999936   fd  Linux raid autodetect
      /dev/hdc3             0     38792  19551168    5  Whole disk
      /dev/hdc4          2232     10169   4000248   fd  Linux raid autodetect
      /dev/hdc5         10169     12153    999936   fd  Linux raid autodetect
      /dev/hdc6         12153     18105   2999808   fd  Linux raid autodetect
      /dev/hdc7         18105     28026   5000184   fd  Linux raid autodetect
      /dev/hdc8         28026     38792   5426064   fd  Linux raid autodetect
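An alternative sketch to the dd-plus-hand-editing approach: dump the table with sfdisk, rewrite the partition types, and write it to the second drive. The `retag_raid` helper is my own name; the sed expression assumes the classic sfdisk dump format (lines containing `Id=83`), retags the Linux (83) and swap (82) partitions as ‘fd’, and leaves partition 3 (the ‘Whole disk’ entry, Id= 5) alone.

```shell
# Rewrite partition types in an sfdisk dump: 83/82 -> fd, but skip
# any line describing partition 3 (the Sun whole-disk entry).
retag_raid() {
    sed '/3 :/!s/Id=8[23]/Id=fd/'
}

# On the real system:
# sfdisk -d /dev/hda | retag_raid | sfdisk /dev/hdc
```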

3. Install mdadm to manage the arrays.

      apt-get install mdadm

It will ask a series of questions whose answers depend on your needs. The key one here: answer “Yes, automatically start RAID arrays”.

4. Load the RAID1 module:

      modprobe raid1

5. Create the RAID1 volumes. Note that we’re setting one half of each mirror as “missing”: the matching /dev/hda partition is still in use by the running system, so we’ll add it later.

      mdadm --create -n 2 -l 1 /dev/md0 /dev/hdc1 missing
      mdadm --create -n 2 -l 1 /dev/md1 /dev/hdc2 missing
      mdadm --create -n 2 -l 1 /dev/md2 /dev/hdc4 missing
      mdadm --create -n 2 -l 1 /dev/md3 /dev/hdc5 missing
      mdadm --create -n 2 -l 1 /dev/md4 /dev/hdc6 missing
      mdadm --create -n 2 -l 1 /dev/md5 /dev/hdc7 missing
      mdadm --create -n 2 -l 1 /dev/md6 /dev/hdc8 missing
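After the create commands, each half-mirror should appear in /proc/mdstat with one missing member, shown as “[U_]” (or “[_U]”). A small helper to count such degraded arrays; the name `degraded_count` is my own. It reads an mdstat-style file so it can be tested offline.

```shell
# Count lines reporting a mirror with a missing member.
degraded_count() {
    grep -Ec '\[U_\]|\[_U\]' "$1"
}

# degraded_count /proc/mdstat   # expect 7 at this point in the tutorial
```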

6. Make the filesystems:

      mke2fs -j /dev/md0
      mke2fs -j /dev/md1
      mke2fs -j /dev/md2
      mke2fs -j /dev/md3
      mkswap /dev/md4
      mke2fs -j /dev/md5
      mke2fs -j /dev/md6

7. Install the dump package:

      apt-get install dump

8. Mount the new volumes, dump & restore from the running copies:

      mount /dev/md1 /mnt
      cd /mnt
      dump 0f - / | restore rf -
      rm restoresymtable

      mount /dev/md0 /mnt/boot
      cd /mnt/boot
      dump 0f - /boot | restore rf -
      rm restoresymtable

      mount /dev/md2 /mnt/usr
      cd /mnt/usr
      dump 0f - /usr | restore rf -
      rm restoresymtable

      mount /dev/md3 /mnt/tmp
      cd /mnt/tmp
      dump 0f - /tmp | restore rf -
      rm restoresymtable

      mount /dev/md5 /mnt/var
      cd /mnt/var
      dump 0f - /var | restore rf -
      rm restoresymtable

      mount /dev/md6 /mnt/export
      cd /mnt/export
      dump 0f - /export | restore rf -
      rm restoresymtable
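The six dump/restore blocks above follow one pattern, so they can be driven from a list. A sketch matching this tutorial’s layout; `staging_dir` is my own helper name, and the actual loop is commented out because it writes to the md devices. /dev/md1 (the new root) is listed first so /mnt is populated before the others mount under it.

```shell
# Device:filesystem pairs, in mount order (root first).
pairs="/dev/md1:/ /dev/md0:/boot /dev/md2:/usr /dev/md3:/tmp /dev/md5:/var /dev/md6:/export"

staging_dir() {   # map a live filesystem path to its mount point under /mnt
    printf '/mnt%s' "${1%/}"      # "/" -> /mnt, "/usr" -> /mnt/usr
}

# On the real system:
# for pair in $pairs; do
#     dev=${pair%%:*}; fs=${pair##*:}
#     mount "$dev" "$(staging_dir "$fs")"
#     ( cd "$(staging_dir "$fs")" && dump 0f - "$fs" | restore rf - && rm restoresymtable )
# done
```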

9. Set up the chroot environment:

      mount -t proc none /mnt/proc

      chroot /mnt /bin/bash

10. Edit /boot/silo.conf, and change the root line to:

      root=/dev/md1

11. Edit /etc/fstab, and point the filesystems at the MD devices:

      # /etc/fstab: static file system information.
      #
      # <file system> <mount point>   <type>  <options>       <dump>  <pass>
      proc            /proc           proc    defaults        0       0
      /dev/md1        /               ext3    defaults,errors=remount-ro 0       1
      /dev/md0        /boot           ext3    defaults        0       2
      /dev/md6        /export         ext3    defaults        0       2
      /dev/md3        /tmp            ext3    defaults        0       2
      /dev/md2        /usr            ext3    defaults        0       2
      /dev/md5        /var            ext3    defaults        0       2
      /dev/md4        none            swap    sw              0       0
      /dev/fd0        /media/floppy0  auto    rw,user,noauto  0       0

12. Save the MD information to /etc/mdadm/mdadm.conf:

      echo DEVICE partitions >> /etc/mdadm/mdadm.conf

      mdadm -D -s >> /etc/mdadm/mdadm.conf

13. Rebuild the initrd (to add the RAID modules, and boot/root RAID startup information):

     mkinitramfs -o /boot/initrd.img-`uname -r` `uname -r`

14. Leave the chroot environment:

      exit

15. Unmount /boot. klogd holds the System.map file open, so it must be killed before /boot will unmount.

      pkill klogd
      # wait a few seconds
      umount /boot

16. Add /dev/hda1 to /dev/md0, the /boot mirror:

      mdadm --add /dev/md0 /dev/hda1
      watch cat /proc/mdstat

Wait until the mirror is complete. CTRL-C to exit watch.
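Instead of watching /proc/mdstat by hand, you can poll it. The helper below (`resync_active` is my own name) reads an mdstat-style file so it can be tested offline; on the real system, pass /proc/mdstat.

```shell
# True while any md array in the given mdstat-style file is
# resyncing or recovering.
resync_active() {
    grep -Eq 'resync|recovery' "$1"
}

# On the real system:
# while resync_active /proc/mdstat; do sleep 10; done
# echo "mirror complete"
```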

17. Mount the mirrored /boot:

      umount /mnt/boot
      mount /dev/md0 /boot

18. Stamp the boot loader onto both disks, and reboot:

      silo -C /boot/silo.conf && reboot

19. Assuming it booted up correctly, verify that you’re running on the mirrored copies:

      df -h

If so, add the remaining partitions into their respective mirrors (/dev/hda1 already joined /dev/md0 in step 16):

      mdadm --add /dev/md1 /dev/hda2
      mdadm --add /dev/md2 /dev/hda4
      mdadm --add /dev/md3 /dev/hda5
      mdadm --add /dev/md4 /dev/hda6
      mdadm --add /dev/md5 /dev/hda7
      mdadm --add /dev/md6 /dev/hda8
      watch cat /proc/mdstat

And wait until the mirrors are done building.

20. Edit /etc/mdadm/mdadm.conf and remove the old ARRAY lines, then append the refreshed information:

      mdadm -D -s >> /etc/mdadm/mdadm.conf
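Step 20 can be done without hand-editing: drop the stale ARRAY lines, then append the current scan output. A sketch; `strip_arrays` is my own helper name, and the real commands are commented out because they touch /etc.

```shell
# Print an mdadm.conf-style file minus its ARRAY lines.
strip_arrays() {
    grep -v '^ARRAY' "$1"
}

# On the real system:
# strip_arrays /etc/mdadm/mdadm.conf > /etc/mdadm/mdadm.conf.new
# mdadm -D -s >> /etc/mdadm/mdadm.conf.new
# mv /etc/mdadm/mdadm.conf.new /etc/mdadm/mdadm.conf
```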

21. Rebuild the initrd one more time; the image from step 13 recorded only one half of each mirror for root and swap.

      mkinitramfs -o /boot/initrd.img-`uname -r` `uname -r`

22. Reboot one more time for good measure. You now have software RAID1.

Testing the software RAID and simulating a drive failure

mdadm can mark a member of an array as faulty, letting you simulate a drive failure without unplugging anything.

Just run the command:

      mdadm --manage --set-faulty /dev/md1 /dev/hdc2

You should see something like the first line below in your system log; something like the second line will appear only if you have spare disks configured.

      kernel: raid1: Disk failure on hdc2, disabling device.
      kernel: md1: resyncing spare disk hdb7 to replace failed disk

Checking /proc/mdstat will show the degraded array. If a spare disk was available, reconstruction should already have started.

Check the details with:

      mdadm --detail /dev/md1

Now you’ve seen how it goes when a device fails. Let’s fix things up.

First, remove the failed disk from the array:

      mdadm /dev/md1 -r /dev/hdc2

/dev/md1 is now running with one device missing: a degraded mirror, or a system in the middle of reconstruction if a spare took over. Wait until any recovery finishes before setting things back to normal.

Re-establish /dev/hdc2 in the array:

      mdadm /dev/md1 -a /dev/hdc2

As the disk resyncs, it becomes an active member of /dev/md1 again; if the array is already complete, it is marked as a spare disk instead.
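To confirm the array has returned to normal, check the State line of the mdadm --detail output. The `array_state` helper below is my own name; it parses detail-style output on stdin so it can be tested offline.

```shell
# Extract the value of the "State :" line from mdadm --detail output.
array_state() {
    sed -n 's/^ *State : *//p'
}

# On the real system:
# mdadm --detail /dev/md1 | array_state   # "clean" once fully recovered
```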

Checking for errors and alerting

E-mail alerting of mdadm errors can be set up in two ways:

1. Using the command line directly

2. Using the /etc/mdadm/mdadm.conf file to specify an e-mail address

NOTE: e-mails are only sent when the following events occur:

Fail, FailSpare, DegradedArray, and TestMessage

Specifying an e-mail address using the mdadm command line

Using the command line simply involves including the e-mail address in the command:

      mdadm --monitor --scan --daemonize --mail=jdoe@somemail.com

The command can be put in a boot-time script such as /etc/init.d/boot.local so that it runs every time the system starts.

Verify that mdadm is running by typing the following in a terminal window:

      ps aux | grep mdadm
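Specifying an e-mail address in the configuration file

The second method is not shown above; a sketch (the address is a placeholder):

```
# /etc/mdadm/mdadm.conf
MAILADDR jdoe@somemail.com
```

With a MAILADDR line in place, whatever mdadm monitor is running (Debian’s init script, or the --monitor command above without --mail) sends its alerts to that address. Running mdadm --monitor --scan --oneshot --test generates a TestMessage event, which is a convenient way to verify delivery.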