APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Rebuilding failed Linux software RAID

Some material is very old and may be incorrect today

© October 2004 Tony Lawrence

Recently I had a hard drive fail. It was part of a Linux software RAID 1 (mirrored drives), so we lost no data and just needed to replace the hardware. However, the RAID does require rebuilding. A hardware array would usually rebuild automatically upon drive replacement, but this needed some help.

When you look at a "normal" array, you see something like this:

# cat /proc/mdstat
Personalities : [raid1] 
read_ahead 1024 sectors
md2 : active raid1 hda3[1] hdb3[0]
      262016 blocks [2/2] [UU]
md1 : active raid1 hda2[1] hdb2[0]
      119684160 blocks [2/2] [UU]
md0 : active raid1 hda1[1] hdb1[0]
      102208 blocks [2/2] [UU]
unused devices: <none>

That's the normal state - what you want it to look like. When a drive has failed and been replaced, it looks like this:

Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1  hda1[1]
      102208 blocks [2/1] [_U]

md2 : active raid1 hda3[1]
      262016 blocks [2/1] [_U]

md1 : active raid1 hda2[1]
      119684160 blocks [2/1] [_U]
unused devices: <none>

Notice that it doesn't list the failed drive's partitions, and that an underscore appears beside each U. This shows that only one drive is active in these arrays - we have no mirror.
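If you'd rather not eyeball the output, the underscore test can be scripted. What follows is only a sketch: the function name degraded_arrays is made up for illustration, and it assumes the status field ([UU], [_U]) is the last thing on each "blocks" line, as in the listings above.

```shell
#!/bin/sh
# List md arrays whose status field (e.g. [_U]) shows a missing member.
# Reads /proc/mdstat by default; pass a file name to check saved output instead.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember which array we are in
        /blocks/ && name != "" {
            status = $NF                     # last field, e.g. [UU] or [_U]
            if (status ~ /^\[[U_]+\]$/ && status ~ /_/)
                print name                   # an underscore means a missing member
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Run against the degraded listing above, degraded_arrays would name md0, md2 and md1; against the healthy listing it prints nothing.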

Another command that will show us the state of the raid drives is "mdadm":

# mdadm -D /dev/md0
        Version : 00.90.00
  Creation Time : Thu Aug 21 12:22:43 2003
     Raid Level : raid1
     Array Size : 102208 (99.81 MiB 104.66 MB)
    Device Size : 102208 (99.81 MiB 104.66 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Oct 15 06:25:45 2004
          State : dirty, no-errors
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

    Number   Major   Minor   RaidDevice State
       0       0        0        0      faulty removed
       1       3        1        1      active sync   /dev/hda1
           UUID : f9401842:995dc86c:b4102b57:f2996278

As this shows, we presently only have one drive in the array.

Although I already knew that /dev/hdb was the other part of the raid array, you can look at /etc/raidtab to see how the raid was defined:

raiddev             /dev/md1
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda2
    raid-disk     0
    device          /dev/hdb2
    raid-disk     1
raiddev             /dev/md0
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda1
    raid-disk     0
    device          /dev/hdb1
    raid-disk     1
raiddev             /dev/md2
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda3
    raid-disk     0
    device          /dev/hdb3
    raid-disk     1
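If all you want from a raidtab like this is which partitions belong to which array, a small awk wrapper will pull it out. This is just a sketch: the function name raidtab_members is hypothetical, and it assumes the simple raiddev/device layout shown above.

```shell
#!/bin/sh
# Print "array member" pairs from a raidtab file.
# Reads /etc/raidtab by default; pass another file name to inspect a copy.
raidtab_members() {
    awk '
        $1 == "raiddev" { md = $2 }          # start of a new array definition
        $1 == "device"  { print md, $2 }     # a member partition of that array
    ' "${1:-/etc/raidtab}"
}
```

For the file above, the first two lines of output would be "/dev/md1 /dev/hda2" and "/dev/md1 /dev/hdb2".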

To get the mirrored drives working properly again, we need to run fdisk to see what partitions are on the working drive:

# fdisk /dev/hda

Command (m for help): p

Disk /dev/hda: 255 heads, 63 sectors, 14946 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        13    104391   fd  Linux raid autodetect
/dev/hda2            14     14913 119684250   fd  Linux raid autodetect
/dev/hda3         14914     14946    265072+  fd  Linux raid autodetect

Duplicate that on /dev/hdb. Use "n" to create the partitions, and "t" to change their type to "fd" to match. Once this is done, use "raidhotadd":

# raidhotadd /dev/md0 /dev/hdb1
# raidhotadd /dev/md1 /dev/hdb2
# raidhotadd /dev/md2 /dev/hdb3
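The partitioning and re-adding steps above can also be written down as a dry run before you touch anything. The sketch below only prints commands; it does not execute them. It is not what I ran: the function name rebuild_plan and its array:partition argument format are invented for illustration, and it swaps the manual fdisk work for "sfdisk -d", which dumps a partition table in a form that sfdisk can read back in.

```shell
#!/bin/sh
# Print (do not run) the commands to copy the partition layout from the
# healthy disk to its replacement and re-add each new partition to its array.
# Arguments: healthy disk, new disk, then array:partition-number pairs.
rebuild_plan() {
    good=$1
    new=$2
    shift 2
    echo "sfdisk -d $good | sfdisk $new"     # clone the partition table
    for pair in "$@"; do
        md=${pair%%:*}                       # array name, e.g. md0
        part=${pair##*:}                     # partition number, e.g. 1
        echo "raidhotadd /dev/$md $new$part"
    done
}
```

For this machine the call would be rebuild_plan /dev/hda /dev/hdb md0:1 md1:2 md2:3. Read the printed commands carefully before running them as root: writing a partition table to the wrong disk destroys what is on it, and the new disk has to be at least as large as the old one.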

The rebuilding can be seen in /proc/mdstat:

# cat /proc/mdstat
Personalities : [raid1] 
read_ahead 1024 sectors
md0 : active raid1 hdb1[0] hda1[1]
      102208 blocks [2/2] [UU]
md2 : active raid1 hda3[1]
      262016 blocks [2/1] [_U]
md1 : active raid1 hdb2[2] hda2[1]
      119684160 blocks [2/1] [_U]
      [>....................]  recovery =  0.2% (250108/119684160) finish=198.8min speed=10004K/sec
unused devices: <none>

md0, a small array, has already completed rebuilding (UU), while md1 has only begun. After it finishes, it will show:

#  mdadm -D /dev/md1
        Version : 00.90.00
  Creation Time : Thu Aug 21 12:21:21 2003
     Raid Level : raid1
     Array Size : 119684160 (114.13 GiB 122.55 GB)
    Device Size : 119684160 (114.13 GiB 122.55 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Oct 15 13:19:11 2004
          State : dirty, no-errors
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

    Number   Major   Minor   RaidDevice State
       0       3       66        0      active sync   /dev/hdb2
       1       3        2        1      active sync   /dev/hda2
           UUID : ede70f08:0fdf752d:b408d85a:ada8922b
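A rebuild of this size takes hours, so it can be handy to pull just the progress figure out of /proc/mdstat. Here is a sketch, assuming the "recovery = N%" line format shown earlier; the function name rebuild_progress is made up for illustration.

```shell
#!/bin/sh
# Print "array percent" for every array that is currently rebuilding.
# Reads /proc/mdstat by default; pass a file name to check saved output instead.
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember which array we are in
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i    # the field ending in % is the progress
        }
    ' "${1:-/proc/mdstat}"
}
```

On the in-progress listing above this would print "md1 0.2%". Running it under something like "watch" would give a rolling view, if that tool is installed.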

I was a little surprised that this process wasn't entirely automatic. There's no reason it couldn't be. This is an older Linux install; I don't know if more modern versions will just automatically rebuild.


---October 24, 2004

The tool already exists.

Look at the "sgraidmon" piece. If a mirror fails, it will automatically rebuild your partitions and resync the raid after a new disk is inserted.

---October 24, 2004

That rather specifically mentions SCSI - what about IDE RAID?


---October 24, 2004

I've got a setup with the first partition on each disk as swap. Setting them to the same priority, they will effectively stream, while the rest of the partitions are mirrored.

That setup is very efficient, but it has two drawbacks:

If one disk dies, the machine will go down, since the streamed swap will suddenly be full of holes (a quick reboot would be all that would be needed to make the swap "whole" on one disk, so I don't consider that a big problem).

I would have to use a manual method of reentering a new disk anyway, since I do not think the makers of the raid-auto-tool (if it exists) would read fstab and raidtab, and set up the swap for me. No biggie - I can live with that.

If you want to know more about how I've set it up, balancing between highest security and speed while offering some uptime in the process, you can read all about it on: http://nalle.no/newnalle.php

There you can also get the reason why you should not use 'dd' to copy the drives, and a short description of the troubles with booting from the RAID. The Linux RAID HOWTO is an extremely good help too - well written and easy to understand (and even if you don't understand it, you'll get far just by following the recipe).
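For readers wondering what equal-priority swap looks like in practice, the sketch below prints fstab entries using the standard pri= swap option. The function name swap_fstab_lines is hypothetical, and so are the device names in the usage line, since the comment above doesn't name them.

```shell
#!/bin/sh
# Print one fstab swap entry per device, all with the same priority.
# Arguments: priority, then one or more swap partitions.
swap_fstab_lines() {
    prio=$1
    shift
    for dev in "$@"; do
        printf '%s none swap sw,pri=%s 0 0\n' "$dev" "$prio"
    done
}
```

Something like swap_fstab_lines 1 /dev/hda1 /dev/hdb1 prints two lines that could be pasted into /etc/fstab. With equal priorities the kernel spreads swap pages across both partitions, which is also why losing one disk takes the swap down with it.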


---October 25, 2004
I don't know whether this is useful, but I found that resyncing the RAID on my machine was very slow because it used only a fraction of the I/O bandwidth. After I modified the values in /proc/sys/dev/raid/speed_limit_min and /proc/sys/dev/raid/speed_limit_max, the resync was much quicker. Since my machine is a single-user workstation, the I/O hit on other applications on the machine was acceptable.
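A small helper for looking at and changing those two files might look like the sketch below. The file names are the ones given in the comment; the function name raid_speed_limits, the optional directory argument (there only so it can be tried against a scratch copy), and any numbers you pass in are assumptions. The values are in kilobytes per second.

```shell
#!/bin/sh
# Show, and optionally set, the md resync speed limits.
# Arguments: new minimum, new maximum (either may be empty to leave it alone),
# and an optional directory holding the two files.
raid_speed_limits() {
    dir=${3:-/proc/sys/dev/raid}
    if [ -n "${1:-}" ]; then echo "$1" > "$dir/speed_limit_min"; fi
    if [ -n "${2:-}" ]; then echo "$2" > "$dir/speed_limit_max"; fi
    echo "min=$(cat "$dir/speed_limit_min") max=$(cat "$dir/speed_limit_max")"
}
```

Calling raid_speed_limits with no arguments just reports the current settings; raid_speed_limits 50000 would raise the minimum, needs root, and lasts only until the next reboot. The figure 50000 is purely an example.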


---October 25, 2004

What if I'm not so lucky as to have a linux raid, but have been left behind in the SGI IRIX 3rd party scsi raid land? Anyone have good tips?


---November 14, 2004

"raidhotadd" is not on FC3, however MDADM can do it. For adding drives back to the array, use:

mdadm [raid-array] --add [drive-to-add], e.g.,

mdadm /dev/md0 --add /dev/sdb1
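A script that has to run on both older and newer systems can check which tool is installed. The sketch below only prints the command it would use; the function name hot_add_command is hypothetical, and the mdadm long option is spelled with two dashes (--add).

```shell
#!/bin/sh
# Print the command that would add a partition back into an array, preferring
# raidhotadd when it is installed and falling back to mdadm --add otherwise.
# Arguments: array device, member partition.
hot_add_command() {
    array=$1
    member=$2
    if command -v raidhotadd >/dev/null 2>&1; then
        echo "raidhotadd $array $member"
    else
        echo "mdadm $array --add $member"
    fi
}
```

For example, hot_add_command /dev/md0 /dev/sdb1 prints one of the two forms depending on what is on the machine.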


---December 31, 2004

I removed hdb (100GB IDE drive) from my system and installed a 300GB IDE drive.

I followed the above steps to rebuild my IDE RAID1 mirror, but the first partition on hdb is not bootable, like it is on hda. Once I can get the partition to be bootable, I plan to expand the disk to 300GB and then install the other 300GB drive and let the system rebuild it.

How can I get the first partition to be bootable? I am using Mitel SME server built on RH7.3.

Thanks - TC

---December 31, 2004

You probably need to run grub to write boot tracks (I think that version uses grub, right?).

Questions should go to the Forum, not comments.


Fri Jun 24 14:58:48 2005: 702   TonyLawrence

Actually, hda is slightly larger:

$ expr 255 \* 63 \* 1245
$ expr 19846 \* 63 \* 16

That's just not going to work unless part of a is not used..

Sun Aug 31 10:23:21 2008: 4518   anonymous

As a relatively new Linux user, I have found the least-hassle way to recover a RAID 1 is the GUI Webmin tool (www.webmin.com). On my home server I disconnected my primary disk to test that my RAID worked properly. Rebuilding it was as simple as selecting it on the Webmin management page and adding it back in. Monitoring of the rebuild was still performed by cat /proc/mdstat.

Fri Feb 26 21:14:55 2010: 8149   malarie


I have a problem where it says /dev/md1 can't be mounted. This is my swap partition on my RAID 1. I can safely boot with a live CD and run mdadm --assemble --scan, but md1 is not activated. It says the partition table is invalid.

Is there a way to fix this? (3rd line)

$ sudo mdadm --assemble --scan
mdadm: /dev/md0 has been started with 2 drives.
mdadm: no devices found for /dev/md1

$ sudo cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      234372160 blocks [2/2] [UU]

unused devices: <none>
$ cat /etc/mdadm/mdadm.conf
# mdadm.conf
# Please refer to mdadm.conf(5) for information about this file.

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=743cc23b:d8a07722:52074cff:8a3b0676
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=c2b63389:53ec60d6:790b3fad:e5cacb62

# This file was auto-generated on Fri, 26 Feb 2010 18:53:25 +0000

Sat Feb 27 00:00:53 2010: 8150   TonyLawrence


Swap isn't a file system. It doesn't get mounted.

Mon Apr 5 14:13:21 2010: 8370   ravindran


Yes, it was not automatic in my case either (Ubuntu 9.10). I had to hot re-add the so-called faulty volume to start the rebuild process.

Mon May 10 10:23:38 2010: 8571   Alex


I'm looking for some help recovering data off a disk that was in a raid1 array. The other disk has died and I'm told I need to use Linux to rebuild the array. I'm a complete noob with Linux though.

Can anyone point me in the direction of a forum that could help?

Mon May 10 11:09:33 2010: 8573   TonyLawrence


The instructions are right here.

If you can't follow that, I don't think a forum is going to help - the fact that you are asking how to find a forum indicates this all might be too far beyond you (hint: did you try to Google "linux forum"?).

My suspicion is that you need paid assistance. There are hundreds of Linux consultants listed here under the Consultants link - I think you need someone.

Sat Nov 6 23:06:46 2010: 9103   TerryAntonio


Thanks so much for this. After successfully dodging raid for 20 years I thought I should "ave a go" with an MSI P45 neo2 board with inbuilt raid (fakeraid) and CentOS 5.
All went as advertised in setting up, but then as fate would have it, within a couple of months one of the disks started to fail.
So I trotted down and bought myself a brand new 500GB SATA2 drive and whacked it in. The BIOS immediately dropped me into the rebuild shop, great, but then the problem started: I got this cryptic message "The OS will rebuild drive", so I restarted and nothing happened.
After 3 days of googling I found this. Up to then I thought I was the only person in the universe who had not been told how to rebuild a raid that does not automatically rebuild, or for that matter that a raid should auto rebuild using this motherboard.
So armed with this enlightenment I will ave another go and see if I can get it going again.
Thanks again

Mon Nov 8 02:37:57 2010: 9104   TerryAntonio


Back to the drawing board. No wonder I never tried raid for 20 years.
From what I can glean after googling for another few days, I have LVM on raid1. Since I had no idea what I was doing when I set this up, I guess CentOS did this for me. I also figured out that mdadm is for software raid and none of the commands work in my situation.
Looks like I have some serious reading ahead of me to figure out how all this works, and since it's vendor specific as far as the hardware raid is concerned, there is virtually no information.
Seems to me this is a bit useless: if you do get a hardware failure every couple of years, you are a lot better off restoring from backup than having to go back to school to learn how this all works.

Mon Nov 8 12:40:55 2010: 9105   TonyLawrence


When you start up the machine, you will have the opportunity to visit the raid BIOS. I think you have been there already. In there, you should be able to figure out what make/model you have and Google for specific help.

Wed Oct 26 13:21:28 2011: 10074   kuntergunt


I just used Linux software raid for the first time on an Ubuntu 10.04 server. The configuration was easy using the expert mode install. The installation process asked whether booting from degraded disks should be enabled or not.
After the install finished, I shut down and removed one disk. After that the array was in degraded mode. Shutting down again, putting back the removed disk, and booting up showed me that the array was rebuilding.

State : active, degraded, recovering

Very easy!

