On several occasions a client has asked for software RAID-1 between two IDE drives in their server.
The servers in question had no hardware RAID capability, and the extra redundancy was worth a modest hit to disk I/O performance.
Below is a simple tutorial for setting up software RAID in Linux using mdadm. The specific examples are from Debian, but the steps can be applied to any Linux distribution.
Linux Software RAID-1
1. Verify that you are working with two identical hard drives:
# cat /proc/ide/hd?/model
ST320413A
ST320413A
2. Copy the partition table over to the second drive:
dd if=/dev/hda of=/dev/hdc bs=1024k count=10
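Depending on the kernel, you may need to nudge it into re-reading the copied label before the new partitions show up on /dev/hdc (re-running fdisk -l /dev/hdc usually works too); one way, assuming your util-linux provides blockdev, is:
blockdev --rereadpt /dev/hdc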
Then edit the partition table of the second drive (fdisk /dev/hdc) so that every partition except #3 (the Sun “Whole disk” entry) has type ‘fd’ (Linux raid autodetect).
# fdisk -l /dev/hda /dev/hdc
Disk /dev/hda (Sun disk label): 16 heads, 63 sectors, 38792 cylinders
Units = cylinders of 1008 * 512 bytes
Device Flag Start End Blocks Id System
/dev/hda1 0 248 124992 83 Linux native
/dev/hda2 248 2232 999936 83 Linux native
/dev/hda3 0 38792 19551168 5 Whole disk
/dev/hda4 2232 10169 4000248 83 Linux native
/dev/hda5 10169 12153 999936 83 Linux native
/dev/hda6 12153 18105 2999808 82 Linux swap
/dev/hda7 18105 28026 5000184 83 Linux native
/dev/hda8 28026 38792 5426064 83 Linux native
Disk /dev/hdc (Sun disk label): 16 heads, 63 sectors, 38792 cylinders
Units = cylinders of 1008 * 512 bytes
Device Flag Start End Blocks Id System
/dev/hdc1 0 248 124992 fd Linux raid autodetect
/dev/hdc2 248 2232 999936 fd Linux raid autodetect
/dev/hdc3 0 38792 19551168 5 Whole disk
/dev/hdc4 2232 10169 4000248 fd Linux raid autodetect
/dev/hdc5 10169 12153 999936 fd Linux raid autodetect
/dev/hdc6 12153 18105 2999808 fd Linux raid autodetect
/dev/hdc7 18105 28026 5000184 fd Linux raid autodetect
/dev/hdc8 28026 38792 5426064 fd Linux raid autodetect
3. Install mdadm to manage the arrays.
apt-get install mdadm
The installer will ask a series of questions that depend on your needs. One key answer: “Yes, automatically start RAID arrays”.
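If you ever want to revisit those answers, you can re-run the package configuration later:
dpkg-reconfigure mdadm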
4. Load the RAID1 module:
modprobe raid1
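To confirm the module loaded, check that a raid1 “personality” is registered; with no arrays yet, the output should look something like:
cat /proc/mdstat
Personalities : [raid1]
unused devices: &lt;none&gt;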
5. Create the RAID1 volumes. Note that each array is created with its second member marked “missing”: the partitions on /dev/hda are still in use by the running system, so we’ll add them to the mirrors later.
mdadm --create -n 2 -l 1 /dev/md0 /dev/hdc1 missing
mdadm --create -n 2 -l 1 /dev/md1 /dev/hdc2 missing
mdadm --create -n 2 -l 1 /dev/md2 /dev/hdc4 missing
mdadm --create -n 2 -l 1 /dev/md3 /dev/hdc5 missing
mdadm --create -n 2 -l 1 /dev/md4 /dev/hdc6 missing
mdadm --create -n 2 -l 1 /dev/md5 /dev/hdc7 missing
mdadm --create -n 2 -l 1 /dev/md6 /dev/hdc8 missing
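At this point each array should be running with only one of its two members (shown as [2/1] in /proc/mdstat). A quick sanity check:
cat /proc/mdstat
mdadm --detail /dev/md0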
6. Make the filesystems:
mke2fs -j /dev/md0
mke2fs -j /dev/md1
mke2fs -j /dev/md2
mke2fs -j /dev/md3
mkswap /dev/md4
mke2fs -j /dev/md5
mke2fs -j /dev/md6
7. Install the dump package:
apt-get install dump
8. Mount the new volumes, dump & restore from the running copies:
mount /dev/md1 /mnt
cd /mnt
dump 0f - / | restore rf -
rm restoresymtable
mount /dev/md0 /mnt/boot
cd /mnt/boot
dump 0f - /boot | restore rf -
rm restoresymtable
mount /dev/md2 /mnt/usr
cd /mnt/usr
dump 0f - /usr | restore rf -
rm restoresymtable
mount /dev/md3 /mnt/tmp
cd /mnt/tmp
dump 0f - /tmp | restore rf -
rm restoresymtable
mount /dev/md5 /mnt/var
cd /mnt/var
dump 0f - /var | restore rf -
rm restoresymtable
mount /dev/md6 /mnt/export
cd /mnt/export
dump 0f - /export | restore rf -
rm restoresymtable
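Before going further, it is worth a quick check that the copies look sensible, e.g. comparing used space on the originals and on the new volumes (paths as mounted above):
df -h / /boot /usr /tmp /var /export
df -h /mnt /mnt/boot /mnt/usr /mnt/tmp /mnt/var /mnt/export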
9. Set up the chroot environment:
mount -t proc none /mnt/proc
chroot /mnt /bin/bash
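If device nodes such as /dev/md* turn out to be missing inside the chroot (common on udev-based systems), exit the chroot, bind-mount /dev, and enter it again:
mount --bind /dev /mnt/dev
chroot /mnt /bin/bash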
10. Edit /boot/silo.conf (SILO is the SPARC boot loader), and change the root line to:
root=/dev/md1
11. Edit /etc/fstab and point the filesystems at the MD devices:
# /etc/fstab: static file system information.
#
#
proc /proc proc defaults 0 0
/dev/md1 / ext3 defaults,errors=remount-ro 0 1
/dev/md0 /boot ext3 defaults 0 2
/dev/md6 /export ext3 defaults 0 2
/dev/md3 /tmp ext3 defaults 0 2
/dev/md2 /usr ext3 defaults 0 2
/dev/md5 /var ext3 defaults 0 2
/dev/md4 none swap sw 0 0
/dev/hdc /media/cdrom0 iso9660 ro,user,noauto 0 0
/dev/fd0 /media/floppy0 auto rw,user,noauto 0 0
12. Save the MD information to /etc/mdadm/mdadm.conf:
echo DEVICE partitions >> /etc/mdadm/mdadm.conf
mdadm -D -s >> /etc/mdadm/mdadm.conf
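The appended lines should look roughly like the following, one ARRAY line per volume (the UUIDs here are placeholders; yours will differ):
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
(and so on for md2 through md6)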
13. Rebuild the initrd (to add the RAID modules, and boot/root RAID startup information):
mkinitramfs -o /boot/initrd.img-`uname -r` `uname -r`
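To double-check that the raid1 module actually made it into the new initrd, you can list its contents; the initramfs is a gzipped cpio archive, so something like this should work:
zcat /boot/initrd.img-`uname -r` | cpio -t | grep raid1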
14. Leave the chroot environment:
exit
15. Unmount /boot. klogd holds the System.map file open, so it has to be killed before /boot will unmount.
pkill klogd
# wait a few seconds
umount /boot
16. Add /dev/hda1 to /dev/md0 (the /boot mirror):
mdadm --add /dev/md0 /dev/hda1
watch cat /proc/mdstat
Wait until the mirror is complete. CTRL-C to exit watch.
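While the mirror is rebuilding, /proc/mdstat shows a progress bar along these lines (the numbers here are only illustrative):
md0 : active raid1 hda1[1] hdc1[0]
      124928 blocks [2/1] [U_]
      [=======>.............]  recovery = 38.2% (47744/124928) finish=0.3min speed=4340K/sec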
17. Mount the mirrored /boot:
umount /mnt/boot
mount /dev/md0 /boot
18. Stamp the boot loader onto both disks, and reboot:
silo -C /boot/silo.conf && reboot
19. Assuming it booted up correctly, verify that we’re running on the mirrored copies:
df -h
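Every filesystem should now be living on an md device rather than directly on /dev/hda; a quick way to confirm:
mount | grep /dev/md
cat /proc/mdstat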
If so, add the remaining partitions into their respective mirrors (/dev/hda1 was already added to /dev/md0 in step 16):
mdadm --add /dev/md1 /dev/hda2
mdadm --add /dev/md2 /dev/hda4
mdadm --add /dev/md3 /dev/hda5
mdadm --add /dev/md4 /dev/hda6
mdadm --add /dev/md5 /dev/hda7
mdadm --add /dev/md6 /dev/hda8
watch cat /proc/mdstat
Wait until the mirrors finish rebuilding.
20. Edit /etc/mdadm/mdadm.conf, remove the old ARRAY lines for the RAID volumes, and regenerate them:
mdadm -D -s >> /etc/mdadm/mdadm.conf
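If you would rather do step 20 non-interactively, something like the following achieves the same result (assuming the old ARRAY lines are the only ones you want to replace, and that your sed supports -i):
sed -i '/^ARRAY /d' /etc/mdadm/mdadm.conf
mdadm -D -s >> /etc/mdadm/mdadm.conf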
21. Rebuild the initrd one more time. The previous time only included one half of each mirror for root and swap.
mkinitramfs -o /boot/initrd.img-`uname -r` `uname -r`
22. Reboot one more time for good measure. You now have software RAID1.
Testing the software RAID & simulating a drive failure
mdadm can mark a device in an array as faulty (the --fail / --set-faulty option), which lets you simulate a drive failure without physically unplugging anything. (The sd* device names below are generic examples; substitute your own, e.g. /dev/hdc2 on the system above.)
Just run:
mdadm --manage --set-faulty /dev/md1 /dev/sdc2
You should see something like the first line below in your system log; something like the second line will appear only if you have spare disks configured.
kernel: raid1: Disk failure on sdc2, disabling device.
kernel: md1: resyncing spare disk sdb7 to replace failed disk
Checking /proc/mdstat will show the degraded array. If a spare disk was available, reconstruction should have started.
Check the details with:
mdadm --detail /dev/md1
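In the detail output, the array state and device counts should reflect the failure, along these lines (trimmed; values illustrative):
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0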
Now you’ve seen what happens when a device fails; let’s fix things up.
First, remove the failed disk from the array:
mdadm /dev/md1 -r /dev/sdc2
At this point /dev/md1 has lost a device: it is either running degraded or, if a spare took over, in the middle of reconstruction. If a rebuild is in progress, wait for it to finish before putting things back to normal.
Then re-add /dev/sdc2 to the array:
mdadm /dev/md1 -a /dev/sdc2
As the disk returns to the array, it will become an active member of /dev/md1 if one is needed; otherwise it will be marked as a spare disk.
Checking for errors and alerting
E-mail alerting of mdadm errors can be set up in two ways:
1. Using a command line directly
2. Using the /etc/mdadm/mdadm.conf file to specify an e-mail address
NOTE: e-mails are only sent when the following events occur:
Fail, FailSpare, DegradedArray, and TestMessage
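For the second approach, the configuration file accepts a MAILADDR line; adding something like the following to /etc/mdadm/mdadm.conf has the same effect as the --mail option shown below, and the monitoring daemon will mail alerts to that address:
# example address; use your own
MAILADDR jdoe@somemail.com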
Specifying an e-mail address using the mdadm command line
Using the command line simply involves including the e-mail address in the monitor command. The following shows the mdadm command and how to have it start every time the system boots.
mdadm --monitor --scan --daemonize --mail=jdoe@somemail.com
The command can be put in a startup script (e.g. /etc/init.d/boot.local, or /etc/rc.local on Debian) so that it runs every time the system is started.
Verify that mdadm is running by typing the following in a terminal window:
ps aux | grep mdadm
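To confirm that alert mail is actually delivered, mdadm can generate a TestMessage event for each array and exit (the address below is an example; omit --mail if you configured MAILADDR in mdadm.conf):
mdadm --monitor --scan --oneshot --test --mail=jdoe@somemail.com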