Advanced Data and Storage (RAID)

Background

  • For this project we are going to play with RAID. So what is RAID? Well, it's a good bug killer, but it is also an acronym for “Redundant Array of Independent Disks”. RAID allows multiple disk drives to be combined into one logical unit. The data can be stored across the drives in several different ways, and these ways are known as RAID levels. The different levels have different benefits, including redundancy and increased performance. I am going to virtualize 3 common RAID levels: RAID 0, RAID 1, and RAID 5.
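
As a rough guide to how much usable space each of these levels gives you, the usual rules of thumb can be sketched in a few lines of shell. The function name raid_usable_mb is hypothetical, and the sketch assumes every disk in the array is the same size and ignores metadata and filesystem overhead.

```shell
#!/bin/sh
# Hypothetical helper: usable capacity in MB for N equal-sized disks.
# Assumes identical disks; ignores md metadata and filesystem overhead.
raid_usable_mb() {
    level=$1; disks=$2; size_mb=$3
    case "$level" in
        0) echo $(( disks * size_mb )) ;;        # striping: all space is usable
        1) echo "$size_mb" ;;                    # mirroring: one disk's worth
        5) echo $(( (disks - 1) * size_mb )) ;;  # one disk's worth goes to parity
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

raid_usable_mb 0 2 1024   # prints 2048
raid_usable_mb 1 2 1024   # prints 1024
raid_usable_mb 5 3 1024   # prints 2048
```

These match the disk counts and 1 gig disk size used in the rest of this project.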

Prerequisites

  • So the first thing we are going to need is our virtual server. I am on virtual server 5 and am going to use virtual machine 19.
  • We are going to virtualize our disk drives, so the first thing we want to do is log into the VM server. I am going to run: ssh root@vmserver05.student.lab
  • Once you are logged in, edit the /xen/conf/vm19.cfg file.
  • Under the section titled Disk Devices you should see two entries, one for disk.img and one for swap.img. We are going to add our virtual disks here. The largest array in this project (RAID 5) only needs 3 disks, but we will add four right now, since having an extra is always a good idea in the real world.
  • So add the following lines:
    • 'file:/xen/domains/vm19/disk3.img,xvda3,w',
    • 'file:/xen/domains/vm19/disk4.img,xvda4,w',
    • 'file:/xen/domains/vm19/disk5.img,xvda5,w',
    • 'file:/xen/domains/vm19/disk6.img,xvda6,w',
  • Save your changes and exit
  • Now change directory to /xen/domains/vm19 (remember this will be different depending on which VM you are using).
  • Once you are in the right place you can do an ls and you should see two files: disk.img and swap.img. We now need to create the image files for the virtual disks we just added to the config.
  • For this project I want to make each virtual disk one gigabyte.
  • The next steps are going to take a moment. Using the dd command you are going to create 4 files of 1024 megabytes each, filled with zeros. (You are creating 4 empty virtual disks that can each store up to a gig of data.) Do the following:
    • dd if=/dev/zero of=disk3.img bs=1M count=1024
    • dd if=/dev/zero of=disk4.img bs=1M count=1024
    • dd if=/dev/zero of=disk5.img bs=1M count=1024
    • dd if=/dev/zero of=disk6.img bs=1M count=1024
  • Now that we have our virtual disks, we need a utility to manage our RAID devices.
  • Keep in mind that you want this utility on the virtual machine, not the server, so make sure you start up your VM and then log into that.
  • Once you are on your virtual machine, do this:
    • aptitude update
    • aptitude upgrade
    • aptitude search mdadm
    • aptitude install mdadm (assuming the search found the mdadm package; otherwise use whatever package name the search showed)
  • Once all of this is done, restart your VM and log back in.
  • So all prerequisites are done: we have virtualized 4 disks for our RAID project and have the RAID utility installed, so let's start out with RAID 0.
  • As a side note, all of this will work even when not virtualized; you would simply be using the device name instead of the file name and would not
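
The four dd commands in the steps above can also be run as a loop on the VM server. This is only a minimal sketch, not part of the original steps: the function name make_disk_images and its two arguments (target directory and size in MB) are assumptions made for illustration, and it skips any image that already exists rather than overwriting it.

```shell
#!/bin/sh
# Hypothetical helper: create disk3.img .. disk6.img in a directory.
# $1 = directory (e.g. /xen/domains/vm19), $2 = size of each image in MB.
make_disk_images() {
    dir=$1; size_mb=$2
    for n in 3 4 5 6; do
        img="$dir/disk$n.img"
        if [ -e "$img" ]; then
            echo "skipping $img (already exists)" >&2
            continue
        fi
        # Same dd invocation as in the steps above, with the count parameterized.
        dd if=/dev/zero of="$img" bs=1M count="$size_mb" 2>/dev/null || return 1
    done
}

# Example (run on the VM server): make_disk_images /xen/domains/vm19 1024
```

Skipping existing files is a deliberate choice here, so that running the loop a second time does not zero out a disk that is already part of an array.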

RAID 0

  • Make sure you have rebooted your VM and are logged back in. You do NOT do this from the server.
  • RAID 0 needs at least two disks. It has no redundancy but increases performance because the blocks are striped, which means data is evenly distributed across the disks. With two disks, this allows data to theoretically be stored twice as fast. (Check the resource links to see a diagram of this.)
  • To create a RAID 0 array, format it, and mount it, do this:
    • mdadm --create /dev/md0 --level=0 --raid-disks=2 /dev/xvda3 /dev/xvda4 (we only need 2 disks for RAID 0)
    • mkfs.ext3 /dev/md0
    • mount /dev/md0 /mnt
  • This can take a while to complete. You can check the progress by doing the following:
    • cat /proc/mdstat
  • Once everything is complete you will have created your first RAID array. Since this is a level 0 array, all you have done is take two virtual drives of one gig each and combine them so they can be accessed as a single 2 gig drive.
  • One way to test and play with it is to do a “df -h”; if you look at the /dev/md0 line you should see it as approximately 2 gigs.
  • You can also try to copy a file larger than 1 gig to the drive. I personally used dd to create a 1.5 gig file and it worked!
  • So in conclusion for RAID 0: it allows multiple disks to be tied together as one logical unit and allows for faster writing of data, since two (or more) devices can be written to at a time.
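
To picture what striping means, here is a small sketch that prints which disk each chunk of data would land on in a two-disk RAID 0. It is a simplified round-robin model for illustration only. The function name stripe_layout is hypothetical, and the disk numbers are just positions in the array, not device names.

```shell
#!/bin/sh
# Simplified model: chunk i of the array goes to disk (i mod number_of_disks).
# $1 = number of disks, $2 = number of chunks to show.
stripe_layout() {
    disks=$1; chunks=$2
    i=0
    while [ "$i" -lt "$chunks" ]; do
        echo "chunk $i -> disk $(( i % disks ))"
        i=$(( i + 1 ))
    done
}

stripe_layout 2 4
# chunk 0 -> disk 0
# chunk 1 -> disk 1
# chunk 2 -> disk 0
# chunk 3 -> disk 1
```

Because consecutive chunks alternate between the disks, both disks can be kept busy at once, which is where the speedup comes from.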

RAID 1

  • RAID 1 also requires a minimum of 2 disks, but it does have redundancy and does not increase write performance, as it does not stripe. Everything that is written to the first disk is also written to every other disk in the array. This does not increase your overall drive size; it just gives you a mirrored copy as backup.
  • Before we create a new array we need to unmount and deactivate the last one (an array cannot be stopped while it is mounted) using: umount /mnt and then mdadm --misc -S /dev/md0
  • To create a RAID 1 array do the following:
    • mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/xvda3 /dev/xvda4
    • mkfs.ext3 /dev/md0
    • mount /dev/md0 /mnt
  • This can take a while to complete. You can check the progress by doing the following:
    • cat /proc/mdstat
  • Once everything is complete you will have created a level 1 array. You have taken two virtual drives of one gig each and mirrored them.
  • One way to test and play with it is to do a “df -h”; if you look at the /dev/md0 line you should see it as approximately 1 gig. (There are two 1 gig drives, but they are mirrored, so they show up as a single 1 gig drive.)
  • You can also write a couple of files to /mnt and then overwrite one of the disk images used for our virtual disks. I used dd to overwrite disk3.img.
    • By doing this we have simulated a hard drive failure. Since this is virtualized, reboot the VM and you will notice you can't mount your array and can't access the files you copied. Well, what do we do? This is where RAID shows its power.
      • First we must tell mdadm that we have a faulty drive:
        • mdadm --manage --set-faulty /dev/md0 /dev/xvda3
      • Now we simply add a new drive and it will automatically rebuild! (I used disk5.img)
        • mdadm --manage /dev/md0 -a /dev/xvda5
        • You can type: watch cat /proc/mdstat and watch the progress as it rebuilds. It takes some time because all of the data from the undamaged drive is copied to the replacement drive.
  • So in conclusion for RAID 1: it allows multiple disks to be tied together and mirrored, thus allowing for a complete restore if one of the disks in the array is damaged. Using mdadm, it is as simple as telling it to replace the disk, and the restore will happen.
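
When an array has lost a disk, /proc/mdstat shows it in the status map on the line under the array name: [UU] means both members are up, while an underscore such as [_U] marks a missing member. The sketch below checks for that. The function name md_is_degraded is hypothetical, and the sample text is a made-up illustration of the format, not output captured from this project.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the named array looks degraded.
# $1 = array name (e.g. md0); the mdstat text is read from stdin.
md_is_degraded() {
    awk -v md="$1" '
        $1 == md { found = 1; next }       # header line for this array
        found && /\[[U_]+\]/ {             # status line carrying the [UU] map
            if ($NF ~ /_/) bad = 1         # an underscore marks a missing member
            exit
        }
        END { exit (bad ? 0 : 1) }
    '
}

# Made-up sample in the mdstat layout, with the first member missing.
sample='md0 : active raid1 xvda4[1] xvda3[0]
      1048512 blocks [2/1] [_U]'

if printf '%s\n' "$sample" | md_is_degraded md0; then
    echo "md0 is degraded"
fi
# On the VM itself you would run: md_is_degraded md0 < /proc/mdstat
```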

RAID 5

  • RAID 5 needs at least 3 disks. It has striping for faster performance and parity for redundancy. Parity is extra information computed from the data blocks in each stripe (an XOR of them) that lets you rebuild the contents of one lost disk. This is an easy way to get the benefits of both performance and redundancy.
  • Before we create a new array we need to unmount and deactivate the last one (an array cannot be stopped while it is mounted) using: umount /mnt and then mdadm --misc -S /dev/md0
  • To create a RAID 5 array do the following:
    • mdadm --create /dev/md0 --level=5 --raid-disks=3 /dev/xvda3 /dev/xvda4 /dev/xvda5
    • mkfs.ext3 /dev/md0
    • mount /dev/md0 /mnt
  • This can take a while to complete. You can check the progress by doing the following:
    • cat /proc/mdstat
  • You have now built a level 5 RAID array. If you do a df -h you will see that /dev/md0 is listed as a 2 gig drive. This is because the data is striped across the disks while one disk's worth of space (spread across all three disks) holds the parity that allows for recovery, leaving two disks' worth of usable space. To test and play with it you can simply do the same things you did for levels 0 and 1.
  • So in conclusion, RAID 5 is an efficient way to get both performance and redundancy benefits with only 3 drives.
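
The way parity lets you recover lost data can be shown with plain XOR arithmetic. This is a toy sketch on single byte values, not how mdadm is invoked: the function name xor_parity and the two example values are made up for illustration.

```shell
#!/bin/sh
# Parity of two data values is their bitwise XOR.
xor_parity() {
    echo $(( $1 ^ $2 ))
}

d1=165   # byte stored on the first disk (example value)
d2=60    # byte stored on the second disk (example value)

p=$(xor_parity "$d1" "$d2")
echo "parity: $p"                                # prints: parity: 153

# If the disk holding d1 is lost, XOR the survivors to get it back.
echo "recovered d1: $(xor_parity "$p" "$d2")"    # prints: recovered d1: 165
```

The same trick works whichever one of the three values is lost, which is why a RAID 5 array can survive the failure of any single disk.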

Conclusions

  • RAID is quite interesting and fun to play with. You can use it for faster data storage and for redundancy, which is always a nice thing. I just want to give an honorable mention to RAID level 10, which is a combination of 0 and 1. You need 4 disks, which are striped AND mirrored. It generally writes and rebuilds faster than level 5 but requires more disks for the same usable space. Check out the resources section for nice diagrams of these levels.
  • A little shout-out to Matt, as he helped me a lot with this and in the process learned a thing or two as well.

Resources

user/kkrauss1/portfolio/raid.txt · Last modified: 2012/04/05 16:35 by kkrauss1