Tips on Hard Drive Problems
Some material is very old and may be incorrect today
© February 2011 Anthony Lawrence
Back in the 1980's, I could count on earning a few hundred dollars every month from hard disk failures and other problems (slow performance, lost files and so on). That may be a slight exaggeration, but crashing disk drives and performance complaints were much more common then than now.
Today, I can't even remember the last time I had to replace a disk drive; it may have been several years back. The MTBF (Mean Time Between Failures) for disks is now many years on average. Good engineering and many years of experience to draw from have made drive failure a very rare event.
Performance has also become much less of an issue - today's systems and the software that drives them are usually not performance bound. There are always exceptions, of course, and specialized applications may benefit from specific tuning, but for most of us, "out of the box" is just fine.
However, problems do still happen.
Grinding noise in computer
Are you hearing noises from your computer? It is probably NOT the disk. More often the noise comes from a cooling fan bearing. If you are a little technically savvy, you can pinpoint the source of odd noises by temporarily unplugging the drive or fans (after powering everything off, of course, and perhaps even unplugging from the wall for more safety).
Do you need to replace a noisy fan? You probably should, because overheating computers will run more slowly and heat can cause component failures. On the other hand, I have seen very noisy fans run on for years and years.
System crashes frequently
Try shutting the system off completely. If it is not a laptop, unplug it from the wall as well. Now go get a cup of hot coffee or tea. You don't have to drink it - just wait until it has cooled off, then plug everything back in and try again. The point is just to let the system cool off completely. Don't rush; let it cool.
If this works, your problem may be heat. That could be from a failed fan, some airflow obstruction or a buildup of dust. Unless you can correct the cause, it's just going to come back.
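If you happen to be running Linux, one rough way to confirm a heat problem is to read the temperatures the kernel exposes under /sys/class/thermal. The sketch below assumes that directory exists and that each thermal_zone*/temp file holds millidegrees Celsius; the function names and the 80 C warning limit are my own illustrative choices, not a vendor specification.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for each readable thermal zone.

    Assumes the Linux sysfs layout, where each thermal_zone*/temp file
    contains an integer number of millidegrees Celsius.
    """
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone missing, unreadable, or not a number: skip it
        readings[zone.name] = millidegrees / 1000.0
    return readings


def hot_zones(readings, limit_c=80.0):
    """Return only the zones at or above limit_c (an arbitrary example limit)."""
    return {name: temp for name, temp in readings.items() if temp >= limit_c}
```

Calling hot_zones(read_thermal_zones()) before and after a cool-down gives you something better than a guess. Sensible limits vary by machine, so treat the number as a starting point only.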
I had a customer lose three servers one very hot summer weekend because she shut off the air conditioning to save money. The servers were in a tiny room with a closed door, and the offices were on the top floor of a building with a flat asphalt roof. The sun beat down and the temperature rose and rose.
For a very temporary fix, you can try setting up a floor fan to blow directly onto the computer, with or without its covers open. This can sometimes keep a system running while you plan a more permanent solution.
For laptops, you can buy cooling stands. I use this Antec cooler but there are many others to choose from.
Disk won't boot
Try the full power off and cool down for this also. If that doesn't work, next check your computer BIOS to determine if it is "seeing" the drive. Sometimes simply resetting the BIOS to defaults can fix problems like this, but of course that can also lose other necessary configuration changes.
If the drive will boot after cooling down, with luck it can stay running long enough for you to back up data. In the old days, I put hard drives in a refrigerator for a few hours to cool them! Cranking up the A/C and pointing external fans at the machine might buy you a few critical minutes.
I'd check internal cabling and power connections before resetting the BIOS. This is much harder to do on a laptop, but not impossible if you have a little mechanical skill. If you are comfortable working with the covers off and have enough knowledge and experience to feel safe, you may be able to feel whether or not the hard drive has powered on. You can also judge if it feels unusually hot - extreme heat can indicate a failed drive.
It's possible that the power connector or data cable is bad. You can try swapping in the power cable from something that you know has power (like your CD or DVD drive). Data cables are cheap and easily available. Watch out for bent pins!
For IDE drives (most common in desktop and home computers) the master/slave selection is important. If disconnecting everything else on the same cable makes it work, you probably have a conflict.
SATA drives have no jumper settings, though you may need to update your computer's BIOS if it can't see the installed drives.
For SCSI drives, the interrupt assigned to the controller might change if you did something as simple as adding a new internal device or switching to a USB mouse from an older PS/2 style. If the operating system driver doesn't know that the SCSI disk controller can change its interrupt, it can fail to boot. SCSI ID conflicts can also cause failures, as can improper SCSI termination - both too much and too little! Termination issues are more apt to cause flaky performance than complete failure, but you do need to understand termination if you have SCSI drives.
A controller failure can look just like a drive failure. If you have another machine with the same configuration, swapping the drive to that is often a quick way to determine the real cause.
RAID failures can be simple (replace a failed drive and rebuild the RAID) or difficult (a failing controller has scrambled the RAID).
If a drive needs replacing, the rebuild may be automatic or may need to be initiated by you. As RAID can be hardware or software, your procedure will vary. Here is an example of rebuilding a Linux software RAID.
If performance is an issue, more RAM (for buffer cache) is an easy solution. RAID systems are another way to increase drive throughput.
Monitoring Drive health
One reason hard drives last longer than they used to is that modern drives can remap failing sectors to new, spare sectors. This usually happens automatically, without your knowledge.
Many of today's hard drives support S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology). This can be very helpful in determining whether your disk might be getting close to failing.
You can find software that will display this S.M.A.R.T information if your operating system doesn't already do that for you.
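One such tool is smartctl from the smartmontools package. As a rough sketch, assuming smartctl is installed and that "smartctl -A" prints the usual ten-column attribute table, the following pulls out the raw values and flags a few attributes related to sector remapping. The function names and the choice of watched attributes are mine, and not every drive reports these names.

```python
import subprocess

# Attributes often associated with remapped or unreadable sectors.
# This selection is an illustrative assumption, not an official list.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Map attribute name -> raw value from 'smartctl -A' style output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                continue  # raw value is not a plain integer; skip it
    return attrs


def worrying_attributes(attrs):
    """Return the watched attributes whose raw value is above zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}


def read_smart(device):
    """Run smartctl on a device such as /dev/sda (usually needs root)."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)
```

A nonzero count here does not mean the drive is about to die, but a count that keeps climbing is a good reason to make sure your backups are current.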
If the drive simply won't boot, it could be that only the boot sectors are damaged or missing. Adding the drive as a secondary in a working machine might let you access it to copy important data. Remember master/slave settings or SCSI ID settings!
If all else fails, data recovery firms can often do amazing work recovering most or all of your data. If you are using Linux or Windows, be sure the firm you choose has experience with those filesystems!
Hard drive problems are rare today. Often you will have replaced your computer long before the hard drive causes you any problem at all. Just keep these tips in mind in case it ever does happen to you.
The best protection for your data is a good backup strategy. Don't neglect that!
Got something to add? Send me email.
More Articles by Anthony Lawrence © 2011-03-22 Anthony Lawrence