From: "Theodore Ts'o"
To: Francesco Peeters
Cc: ext3-users
Subject: Re: Harddisk gone bad
Date: Mon, 11 Nov 2002 16:43:20 -0500

On Sun, Nov 10, 2002 at 08:55:26PM +0100, Francesco Peeters wrote:
> Hi all,
>
> I know this is the EXT3 list, and my problem is with an EXT2
> filesys, but I cannot seem to find a more suitable list on this
> server, and I have seen a lot of knowledge go by on this list and in
> the archives, so I thought I'd give it a try anyway...
>
> Here goes nothing:
>
> I am in a terrible problem: My data disk on my Linux server has gone
> bad, with approx 18 GB of data on it, and I never got round to
> installing a backup system! :-( (I know: very stupid!) I never
> noticed anything before, but I went on vacation, and after returning
> I simply turned the box on again, and now I have this problem!!!
>
> It gave an error on a short read (attempt to read block from
> filesystem resulted in short read while trying to open
> /dev/hdc1. Could this be a zero-length partition?) and I ran e2fsck
> -cc on it, which seems to have fixed that, however the following
> inode sweep gives so many 'bad blocks in inode XXXXX', that I am
> afraid that I'll be left with an empty disk once the check is
> done...

I suppose one of these days someone really should write a "hard disk
catastrophe" HOWTO.....

When you have a lot of precious data on a disk that hasn't been
backed up, the very *first* thing you should do is curse yourself for
being twenty different kinds of fool for not having a backup system.
Get that out of your system, so you don't make any further
mistakes.....

Next, get yourself a backup hard drive which is at least as big as
the disk which is in trouble, and do a full disk-to-disk copy of the
disk that's in trouble:

	dd if=/dev/hdc of=/dev/hdd bs=1k conv=sync,noerror

Do this right away, because if the problem was due to hardware
failure, you want to grab a snapshot before the disk gets any worse.

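The disk-to-disk copy can be wrapped in a small shell function so that
the options are not retyped under stress. This is only a sketch:
clone_disk is a hypothetical helper name, not part of any package, and
the guard against identical arguments is an added precaution. The dd
options are the same ones described above: noerror tells dd to keep
going after a read error, and sync pads each short or failed block
with zeros so that everything after it stays at the same offset.

```shell
# clone_disk SRC DST (hypothetical helper): raw copy of SRC onto DST
# that continues past read errors instead of aborting.
clone_disk() {
    if [ "$#" -ne 2 ]; then
        echo "usage: clone_disk SRC DST" >&2
        return 2
    fi
    src=$1
    dst=$2
    # Refuse the obvious mistake of naming the same device twice.
    if [ "$src" = "$dst" ]; then
        echo "clone_disk: source and destination must differ" >&2
        return 1
    fi
    # noerror: continue after a failed read.
    # sync:    pad short or failed blocks with zeros up to bs.
    dd if="$src" of="$dst" bs=1k conv=noerror,sync
}
```

With the device names used in this message, the call would be
"clone_disk /dev/hdc /dev/hdd". dd overwrites the destination without
asking, so check both names twice before pressing Enter.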
For experimental purposes, if you're not sure what you're doing, it's
useful to get another spare disk, and make a second-generation copy
from your first primary backup. That way, you can experiment on the
second-generation copy, and if one recovery technique doesn't work,
you can try again with a different technique, and not have to worry
about making any irrecoverable mistakes.

The first thing I would try at this point is an "e2fsck -y" on the
second-generation backup. See what you can save when it's all done;
don't forget to check the lost+found directory in the root of the
filesystem. Sometimes files will end up there.

If that doesn't work, the next steps will require a lot more
expertise and special work. So I'd start with that, and see how much
you can recover from that.

> Now when I try to do e2fsck /dev/hdc1 I get 'a corruption was found
> in the superblock' When I try e2fsck -b 8193 /dev/hdc1 It claims it
> is not a valid superblock... The same for for instance 32679, a.s.o.

For a filesystem with 4k blocks, the backup superblock is at block
32768.

But please, make the full disk-to-disk backups first, and experiment
on the backups. That way, you don't need to worry about
panic-induced mistakes making the problem any worse.

						- Ted

P.S. For those people for whom backups are just too much effort,
*please* consider using the "e2image" program to snapshot and back up
critical filesystem metadata. It's not a replacement for doing full
data backups, but at least if you have an e2image dump, in the worst
case you'll be able to recover more files if a disk failure damages
your inode table. The problem is that without the inode table there
is no record of which blocks go with which files, which means that
recovering files becomes a very, very painful manual process. e2image
will create a backup copy of the inode table, which even if it is not
fully up-to-date, will be a help in trying to reconstruct data from a
filesystem after a disk failure.

Of course, the real answer is to do real backups.....
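
On the backup superblock numbers discussed above: the first backup
sits at the start of block group 1, so its block number follows from
the block size. The sketch below assumes the filesystem was created
with the default of 8 * blocksize blocks per group (one bitmap block's
worth); a filesystem made with a non-default group size will differ.
first_backup_superblock is a hypothetical helper name used only for
illustration.

```shell
# first_backup_superblock BLOCKSIZE (hypothetical helper): print the
# block number of the first backup superblock, ASSUMING the default
# of 8 * BLOCKSIZE blocks per group.
first_backup_superblock() {
    block_size=$1
    blocks_per_group=$((block_size * 8))
    # With 1k blocks the primary superblock occupies block 1, so the
    # first data block is 1; with larger blocks it is 0.
    if [ "$block_size" -eq 1024 ]; then
        first_data_block=1
    else
        first_data_block=0
    fi
    echo $((first_data_block + blocks_per_group))
}

first_backup_superblock 1024   # prints 8193
first_backup_superblock 4096   # prints 32768
```

So 8193 is only right for a filesystem with 1k blocks, and 32768 is
the one to hand to "e2fsck -b" for 4k blocks (on the backup copy, not
the original disk).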