01:43 pm, 10 Feb 06

disk recovery saga cont'd

Thanks for the hint to try dd_rescue! It turns out that only a fraction of a percent of the disk is bad. Depending on how you look at it, this is either very little (a small percentage) or more than I'd hoped (a bit over a megabyte). So now I have a dd image that's mostly good. The first error occurs around 2 GB in, which is hopefully well past any important filesystem data structures.

But here's where it gets extra-complicated. The disk was part of a FreeBSD vinum mirror, and vinum stores some junk at the beginning of the disk (looking at the image with less, I see some vinum commands). So how do I mount it?
mount -o ro,loop -t ufs driveimage /mnt produces ufs_read_super: bad magic number

Brad had earlier suggested just trying every offset within the first few megs, so I wrote a script to loop over mount commands, watched it run for a few minutes, and then left and came back to a locked computer.

So more recently I dug up the UFS superblock magic number and wrote a program that scans the first 10 MB or so of the disk for it. From those occurrences I eliminated the obviously wrong ones (by looking at other fields in the UFS superblock structure, like block size) and found that the superblock sits at an offset of around 200 KB. But passing the superblock offset as the loopback offset (you can put offset=n in the mount options) didn't work. However, when I restarted the script near that offset, the mount caused a kernel panic almost immediately, while starting a few KB past that offset causes it to keep failing as before.
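
For anyone who wants to try the same scan, here is a rough Python sketch of the idea. It is not the program I actually ran. It assumes the usual FreeBSD on-disk layout as I understand it: fs_magic at byte 1372 of the superblock, fs_bsize at byte 48, magic 0x00011954 for UFS1 and 0x19540119 for UFS2, and the primary superblock 8192 bytes (UFS1) or 65536 bytes (UFS2) into the filesystem. The function names, the little-endian default, and the block-size range are my own choices for illustration.

```python
import struct

# Assumed constants from the FreeBSD UFS layout; double-check against fs.h.
UFS1_MAGIC = 0x00011954
UFS2_MAGIC = 0x19540119
MAGIC_OFFSET = 1372   # position of fs_magic inside the superblock
BSIZE_OFFSET = 48     # position of fs_bsize inside the superblock
SBLOCK_OFFSET = {UFS1_MAGIC: 8192, UFS2_MAGIC: 65536}


def plausible_block_size(bsize):
    """Accept only power-of-two block sizes in an assumed sane range."""
    return 4096 <= bsize <= 65536 and bsize & (bsize - 1) == 0


def find_superblock_candidates(data, byte_order="<"):
    """Return sorted (superblock_offset, magic, block_size) tuples.

    Every occurrence of a magic number is treated as a possible superblock,
    then filtered by checking that the block size field looks reasonable.
    """
    hits = []
    for magic in (UFS1_MAGIC, UFS2_MAGIC):
        needle = struct.pack(byte_order + "I", magic)
        pos = data.find(needle)
        while pos != -1:
            sb = pos - MAGIC_OFFSET
            if sb >= 0:
                (bsize,) = struct.unpack_from(
                    byte_order + "i", data, sb + BSIZE_OFFSET)
                if plausible_block_size(bsize):
                    hits.append((sb, magic, bsize))
            pos = data.find(needle, pos + 1)
    return sorted(hits)


def filesystem_start(sb_offset, magic):
    """Guess where the filesystem begins, given a primary superblock hit."""
    return sb_offset - SBLOCK_OFFSET[magic]


def scan_image(path, limit=10 * 1024 * 1024):
    """Scan the first `limit` bytes of an image file and print candidates."""
    with open(path, "rb") as f:
        data = f.read(limit)
    for sb, magic, bsize in find_superblock_candidates(data):
        print(f"superblock at {sb}, magic {magic:#x}, bsize {bsize}, "
              f"filesystem would start at {filesystem_start(sb, magic)}")
```

Calling scan_image("driveimage") prints one line per surviving candidate. If the layout assumption holds, the number to hand to the loop device is the filesystem start rather than the superblock offset itself, which might explain why the mount only reacted at a nearby offset. A negative start just means that hit cannot be a primary superblock (it could be a backup copy or a coincidence).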

This makes me think that I found the right place! But it's no good to me if the kernel panics.
UFS-fs error (device loop0): ufs_read_inode: inode 2 has zero nlink
init_special_inode: bogus i_mode

So from here, I'm considering:
  • Dig up some sort of FreeBSD rescue CDROM and see if I can get it to do something more intelligent. (Though as I recall, back when this mess first happened the symptom was for the FreeBSD kernel to panic after I had read a bit of data.)
  • Dig into UFS (there's a "libufs" that just has the good bits) and try to write a program that's more cautious about the data it pulls.
Any thoughts / other hints?