(OLDER) <- More Stuff -> (NEWER) (NEWEST)
Kerio Reseller
Printer Friendly Version

Rebuilding failed Linux software RAID


Recently I had a hard drive fail. It was part of a Linux software RAID 1 (mirrored drives), so we lost no data, and just needed to replace hardware. However, the raid does requires rebuilding. A hardware array would usually automatically rebuild upon drive replacement, but this needed some help.

When you look at a "normal" array, you see something like this:


Hate these ads?



# cat /proc/mdstat
Personalities : [raid1] 
read_ahead 1024 sectors
md2 : active raid1 hda3[1] hdb3[0]
      262016 blocks [2/2] [UU]


      
md1 : active raid1 hda2[1] hdb2[0]
      119684160 blocks [2/2] [UU]


      
md0 : active raid1 hda1[1] hdb1[0]
      102208 blocks [2/2] [UU]







      
unused devices: <none>


That's the normal state - what you want it to look like. When a drive has failed and been replaced, it looks like this:



Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1  hda1[1]
      102208 blocks [2/1] [_U]



md2 : active raid1 hda3[1]
      262016 blocks [2/1] [_U]



md1 : active raid1 hda2[1]
      119684160 blocks [2/1] [_U]
unused devices: <none>


Notice that it doesn't list the failed drive parts, and that an underscore appears beside each U. This shows that only one drive is active in these arrays - we have no mirror.

Another command that will show us the state of the raid drives is "mdadm"



# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.00
  Creation Time : Thu Aug 21 12:22:43 2003
     Raid Level : raid1
     Array Size : 102208 (99.81 MiB 104.66 MB)
    Device Size : 102208 (99.81 MiB 104.66 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent



    Update Time : Fri Oct 15 06:25:45 2004
          State : dirty, no-errors
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0




    Number   Major   Minor   RaidDevice State
       0       0        0        0      faulty removed
       1       3        1        1      active sync   /dev/hda1
           UUID : f9401842:995dc86c:b4102b57:f2996278


As this shows, we presently only have one drive in the array.

Although I already knew that /dev/hdb was the other part of the raid array, you can look at /etc/raidtab to see how the raid was defined:



raiddev             /dev/md1
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda2
    raid-disk     0
    device          /dev/hdb2
    raid-disk     1
raiddev             /dev/md0
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda1
    raid-disk     0
    device          /dev/hdb1
    raid-disk     1
raiddev             /dev/md2
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda3
    raid-disk     0
    device          /dev/hdb3
    raid-disk     1


To get the mirrored drives working properly again, we need to run fdisk to see what partitions are on the working drive:



# fdisk /dev/hda



Command (m for help): p



Disk /dev/hda: 255 heads, 63 sectors, 14946 cylinders
Units = cylinders of 16065 * 512 bytes



   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        13    104391   fd  Linux raid autodetect
/dev/hda2            14     14913 119684250   fd  Linux raid autodetect
/dev/hda3         14914     14946    265072+  fd  Linux raid autodetect


Duplicate that on /dev/hdb. Use "n" to create the parttions, and "t" to change their type to "fd" to match. Once this is done, use "raidhotadd":



# raidhotadd /dev/md0 /dev/hdb1
# raidhotadd /dev/md1 /dev/hdb2
# raidhotadd /dev/md2 /dev/hdb3


The rebuilding can be seen in /proc/mdstat:



# cat /proc/mdstat
Personalities : [raid1] 
read_ahead 1024 sectors
md0 : active raid1 hdb1[0] hda1[1]
      102208 blocks [2/2] [UU]


      
md2 : active raid1 hda3[1]
      262016 blocks [2/1] [_U]


      
md1 : active raid1 hdb2[2] hda2[1]
      119684160 blocks [2/1] [_U]
      [>....................]  recovery =  0.2% (250108/119684160) finish=198.8min speed=10004K/sec
unused devices: <none>


The md0, a small array, has already completed rebuilding (UU), while md1 has only begun. After it finishes, it will show:



#  mdadm -D /dev/md1
/dev/md1:
        Version : 00.90.00
  Creation Time : Thu Aug 21 12:21:21 2003
     Raid Level : raid1
     Array Size : 119684160 (114.13 GiB 122.55 GB)
    Device Size : 119684160 (114.13 GiB 122.55 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent



    Update Time : Fri Oct 15 13:19:11 2004
          State : dirty, no-errors
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0




    Number   Major   Minor   RaidDevice State
       0       3       66        0      active sync   /dev/hdb2
       1       3        2        1      active sync   /dev/hda2
           UUID : ede70f08:0fdf752d:b408d85a:ada8922b


I was a little surprised that this process wasn't entirely automatic. There's no reason it couldn't be. This is an older Linux install; I don't know if more modern versions will just automatically rebuild.



Comments /Linux/rebuildraid.html


Fri Jun 24 14:58:48 2005: Subject:   TonyLawrence
Actuall, hda is slightly larger:



$ expr 255 \* 63 \* 1245
20000925
$ expr 19846 \* 63 \* 16
20004768
$

That's just not going to work unless part of a is not used..

Add your comments

Enter your email address for automatic notification of new posts here
(be sure to whitelist 'feedburner.com' if you use spam filtering)

Or use any RSS reader

Delivered by FeedBurner


M3IP inc.

Views for this page
Today This Week This Month This Year  Overall
16168477,987 49,742

Have you tried Searching this site?

Unix/Linux/Mac OS X support by phone, email or on-site: Support Rates

This is a Unix/Linux resource website. It contains technical articles about Unix, Linux and general computing related subjects, opinion, news, help files, how-to's, tutorials and more. We appreciate comments and article submissions.

Publishing your articles here

pavatar.jpg
More:
       - Disks/Filesystems
       - Hardware
       - Kernel/Internals




Unix/Linux Consultants

Your ad here - $24.00 yearly!

http://www.breakthru.com.au SCO (Openserver and Unixware), Unix, Solaris and Linux Consulting services including: Secure Networking Solutions; Linux based Firewalls; Backup Solutions; Secure Home to Office Network Setup; Phone, Remote and On-Site Support available - Satisfaction Guaranteed!


http://www.cleverminds.net Need expert advice? Want a second opinion? CleverMinds is a one-stop-shop for a wide range of technology solutions. We support Unix, Linux, SCO as well as CMS, ecom, blogs, podcasts, search engines consulting and more. Contact us at web2.0@cleverminds.net 0r (617) 894-1282


http://bcstechnology.net Full service Linux & UNIX systems integrator; Windows to UNIX/Linux Client-Server Specialist; Secure E-Mail & Website Hosting; Thoroughbred Software Developer; Custom Industrial Automation; Hardware & Electronics Experts; In Business Since 1985.




card_image






Coming Attractions

My Favorites

Change Congress

Related Posts