Re: The mysterious case of the disappearing superblock ...


 



I would first look for the superblock magic Neil mentions.  With lost
PVs, filesystems and other data volumes, the usual cause is that
something like the partition start moved, so the magic is now either
outside the given partition or at the wrong offset inside it.  So you
may want to take one disk and scan a wide range to see if you can find
it.  If you find it on that disk, you have an idea where it may be on
the others.
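
A rough sketch of such a scan, in Python.  It checks every 512-byte
boundary for the header layout Neil gives below (magic 0xa92b4efc
little-endian, major version 1, feature map, then 4 zero bytes).  The
function names, the 512-byte step and the 1 MiB read size are my own
assumptions, not anything mdadm provides, and a hit is only a candidate
until mdadm agrees with it.  It only reads from the device.

```python
import struct

MD_MAGIC = 0xa92b4efc   # stored on disk as FC 4E 2B A9
SECTOR = 512            # assumed alignment of a candidate superblock


def looks_like_md_superblock(buf):
    """True if buf starts with magic, major version 1 and 4 zero pad bytes."""
    if len(buf) < 16:
        return False
    magic, major, _feature_map, pad0 = struct.unpack("<IIII", buf[:16])
    return magic == MD_MAGIC and major == 1 and pad0 == 0


def scan_for_md_magic(path, start=0, length=None, chunk=1 << 20):
    """Return byte offsets of candidate headers; start should be sector-aligned."""
    hits = []
    with open(path, "rb") as f:
        f.seek(start)
        pos = start
        remaining = length
        while remaining is None or remaining > 0:
            want = chunk if remaining is None else min(chunk, remaining)
            buf = f.read(want)
            if not buf:
                break
            # Only test sector boundaries inside this chunk.
            for i in range(0, len(buf) - 15, SECTOR):
                if looks_like_md_superblock(buf[i:i + 16]):
                    hits.append(pos + i)
            pos += len(buf)
            if remaining is not None:
                remaining -= len(buf)
    return hits
```

Something like scan_for_md_magic("/dev/sda") run as root, or run against
an image taken with dd, would list the candidate offsets on the whole
disk rather than on the partition.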

Since one of the members is sda4: is that the last partition on the
disk?  If it is not the last, are any other partitions missing?

I have never seen a disk disappear for no reason.  I have always been
able to find something pointing to what the human error was.  A lot of
being able to do that comes from the machines/teams I oversee having
weekly data collections, similar to sosreport, of the active kernel
tables and config files, so I can see that before the reboot the
partition layout was not what it is after boot.  The fix is then
usually as simple as restoring the partition table to match where it
was, and all is good.  Even without that data you can look for the
header magic and work out from it where that partition has to start.
I oversee a huge number of systems, with countless hands of varying
experience levels doing work on those 20k systems, so I have seen
pretty much every variation of this issue, and I have always been able
to find evidence of a root cause.
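
To make the arithmetic concrete: the 1.2 superblock sits 4K into the
member device, so if the magic turns up at some byte offset on the
whole disk, the partition should start 4096 bytes before it.  A small
Python sketch, assuming 512-byte sectors (the helper name and the
example offset are made up for illustration):

```python
SUPERBLOCK_OFFSET = 4096   # 1.2 metadata: superblock is 4K into the member
SECTOR = 512               # assumed logical sector size


def partition_start_sector(magic_offset, sector_size=SECTOR):
    """Start sector of the partition whose superblock magic is at magic_offset."""
    start = magic_offset - SUPERBLOCK_OFFSET
    if start < 0 or start % sector_size:
        raise ValueError("offset does not fit a 4K-in superblock on a sector boundary")
    return start // sector_size


# Magic found 1 MiB + 4 KiB into the disk means the partition starts at sector 2048.
print(partition_start_sector(1048576 + 4096))
```

Compare the result with what the current partition table says for that
partition; a mismatch there is the evidence to look for.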

On Fri, Jan 21, 2022 at 5:13 AM NeilBrown <neilb@xxxxxxx> wrote:
>
> On Wed, 19 Jan 2022, anthony wrote:
> > You all know the story of how the cobbler's children are the worst shod,
> > I expect :-) Well, the superblock to my raid (containing /home, etc) has
> > disappeared, and I don't have a backup ... (well I do but it's now well
> > out of date).
> >
> > So, a new hard drive is on order, for backup ...
> >
> > Firstly, given that superblocks seem to disappear every now and then,
> > does anybody have any ideas for something that might help us track it
> > down? The 1.2 superblock is 4K into the device I believe? So if I copy
> > the first 8K ( dd if=/dev/sda4 of=sda4.img bs=4K count=2 ) of each
> > partition, that might help provide any clues as to what's happened to
> > it? What am I looking for? What is the superblock supposed to look like?
>
> Yes, 4K offset.  Yes, that dd command will get what you want it to.
> It hardly matters what the superblock should look like, because it
> won't be there.  The thing you want to know is: what is there?
> i.e.  you see random bytes and need to guess what they mean, so you can
> guess where they came from.
> Best to post the "od -x" output and crowd-source.
>
> Are you sure the partition starts haven't changed? Was the array made of
> whole-devices or of partitions?
>
> If you want to find out if the superblock got moved, then maybe searching
> for the magic number is best.
> Look at the start of super1.c in mdadm.  The first 4 bytes of the
> superblock are 0xa92b4efc little-endian.  So: FC 4E 2B A9
> The next 4 bytes are 01 00 00 00 ( the major version)
> Then the feature map - possibly 0.  Then 4 zero bytes.
>
> If you see something that looks like that, it is worth trying to point
> mdadm at it.  Create a loop device over it with an appropriate
> offset, and ask mdadm --examine to look at it.
>
>
> >
> > Secondly, once I've backed up my partitions, I obviously need to do
> > --create --assume-clean ... The only snag is, the array has been
> > rebuilt, so I doubt my data offset is the default. The history of the
> > array is simple. It's pretty new, so it will have been created with the
> > latest mdadm, and was originally a mirror of sda4 and sdb4.
> >
> > A new drive was added and the array upgraded to raid-5, and I BELIEVE
> > the order is sdc4, sda4, sdb1 - sdb1 being the new drive that was added.
> >
> > Am I safe to assume that sdc4 and sda4 will have the same data offset?
> > What is it likely to be? And seeing as it was the last added am I safe
> > to assume that sdb1 is the last drive, so all I have to do is see which
> > way round the other two should be?
>
> I would suggest creating some sparse files the same size as the device,
> create loop devices over them, and creating the array in the sequence
> you remember doing it - using "--assume-clean" to avoid rebuilds that
> would make those sparse files less sparse.
> Then look at the metadata written and assume it will be similar to
> that which was written to your array.
>
> NeilBrown
>
>
> >
> > At least the silver lining behind this, is that having been forced to
> > recover my own array, I'll understand it much better helping other
> > people recover theirs!
> >
> > Cheers,
> > Wol
> >
> >


