Re: MD array keeps resyncing after rebooting

Hello Martin,

First of all, thanks a lot for your detailed analysis.

On Fri, Aug 2, 2013 at 8:19 PM, Martin Wilck <mwilck@xxxxxxxx> wrote:
> On 08/02/2013 02:46 PM, Francis Moreau wrote:
>
>> Please tell me if you can find something suspicious.
>
> One thing that I can see is that your BIOS seems to use the same time
> stamp everywhere. It is clear from the dump that the BIOS has changed
> the timestamp in the VD GUID, too. The timestamp used in the
> "before-bios" data is 2013-08-02 08:21:10, the timestamp after is
> 2013-08-02 14:27:27. Wonder if that fits?

Probably. I booted the system at 8:21, let the resync finish and did
the dump after. So it seems to fit.

>
> The spec says that the VD GUID consists of the vendor ID ("LSI     " in
> your case), controller type (the controller's PCI ID,
> 8086:1d60:0000:0000), the 4-byte time stamp, and a "random number"
> (0000 1450). Strangely, I also have an LSI fake RAID and it uses the
> same random number. It even generated the same number through several
> RAID creations. Seems to be a truly strange random number generator :-)
>
> All in all, this makes your controller's RAID GUIDs very predictable.
> But they change whenever the timestamp changes. It also explains why
> this controller can't have more than a single array.
>
> But that's not what you wanted to know, right?

Well that was very instructive, thanks.
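
For the archive, here is a rough Python sketch of how the 24-byte VD GUID
layout described above could be split into its four fields. The function
name, the big-endian default and the 1980-01-01 epoch for the time stamp
are assumptions on my side, not something confirmed in this thread, so
treat the decoded date with caution.

```python
import struct
from datetime import datetime, timedelta

# Assumption: DDF time stamps count seconds from 1980-01-01 00:00:00.
DDF_EPOCH = datetime(1980, 1, 1)

def parse_vd_guid(guid: bytes, byteorder: str = ">") -> dict:
    """Split a 24-byte VD GUID into vendor, controller type, time stamp, random.

    byteorder is a struct prefix (">" big-endian, "<" little-endian);
    which one applies to a given controller is an assumption here.
    """
    if len(guid) != 24:
        raise ValueError("expected a 24-byte VD GUID")
    vendor, ctrl, stamp, rand = struct.unpack(byteorder + "8s8sII", guid)
    return {
        "vendor": vendor.decode("ascii", errors="replace"),
        # Show the 8 controller-type bytes as four 16-bit hex groups.
        "controller": ":".join(ctrl[i:i + 2].hex() for i in range(0, 8, 2)),
        "timestamp": DDF_EPOCH + timedelta(seconds=stamp),
        "random": rand,
    }
```

Feeding it a GUID built from the values quoted above (vendor "LSI     ",
controller 8086:1d60:0000:0000, random number 0x1450) should give those
fields back, which makes it easy to see that only the time stamp moves
between two dumps.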

> Besides the time stamp,
> sequence number, and CRC32, there is actually no difference between the
> two dumps. The suspicious part is here, same in both dumps:
>
> 00000860  00 00 ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
>
> The first two bytes in this line are the state and init state, and the
> meaning is "Optimal, consistent, *not initialized*".
> And this is before *and* after BIOS started. Suspiciously, this doesn't
> match what you wrote before:
>
>> I checked during the shutdown process that the array is correctly
>> stopped since at that point I got:
>>
>> # mdadm -E /dev/sda | egrep "state"
>>         state[0] : Optimal, Consistent
>>    init state[0] : Fully Initialised
>
> This would correspond to "00 02", and it's what we should see after
> initialization. On my system the BIOS sets "00 01" (Optimal, consistent,
> Quick Init in progress) when it first creates an array, because the BIOS
> doesn't do a full initialization. But "not initialized" is weird. The
> mdadm DDF code won't set this by itself, AFAIK. Please make sure again
> that the "before" data matches what mdadm/mdmon wrote just after
> stopping during shutdown.
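
To make the byte values above easier to check against a dump, here is a
small Python sketch that decodes the first two bytes of a hexdump line
such as the one at offset 0x860. The three init state values (00, 01, 02)
are the ones you describe; the bit masks (low 3 bits for the state, 0x10
for "not consistent", low 2 bits for the init state) are my assumption
about the layout, and the function name is made up.

```python
# Init state values as described in this thread.
INIT_STATES = {
    0x00: "Not Initialised",
    0x01: "Quick Init in Progress",
    0x02: "Fully Initialised",
}

def decode_vd_state(hexdump_line: str) -> tuple:
    """Decode state and init state from a 'hexdump -C' style line.

    The first field is the offset; the next two are the state byte
    and the init state byte.
    """
    fields = hexdump_line.split()
    state = int(fields[1], 16)
    init = int(fields[2], 16)
    # Assumption: low 3 bits hold the state, 0 meaning "Optimal".
    state_txt = "Optimal" if state & 0x07 == 0 else f"state {state & 0x07:#x}"
    # Assumption: bit 0x10 flags an inconsistent virtual disk.
    consistent = "Not Consistent" if state & 0x10 else "Consistent"
    init_txt = INIT_STATES.get(init & 0x03, "unknown")
    return f"{state_txt}, {consistent}", init_txt
```

Run on the line quoted above, this should report "Optimal, Consistent"
with "Not Initialised", whereas a line starting with "00 02" would give
"Fully Initialised".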

I'm pretty sure that the "before" dump was taken just after stopping the
array. This is how I proceeded: during the shutdown, I stopped the system
right after "mdadm -S", then checked the state of the array with
"mdadm -E", which definitely showed it as initialized. Then I powered off
the machine, removed one disk and did the "before" dump from my laptop.
After that I booted the machine with the disk back in place until the grub
menu appeared. I then powered off the machine and followed the same
procedure as before to get the "after" dump.
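
In case someone wants to repeat the comparison, the two dumps can be
diffed byte by byte with something like the Python sketch below. The
function name and the file names in the comment are only placeholders,
not the ones I actually used.

```python
def differing_ranges(before: bytes, after: bytes) -> list:
    """Return (start, end) offset pairs, end exclusive, where the dumps differ.

    Only the common length of the two inputs is compared.
    """
    ranges = []
    start = None
    length = min(len(before), len(after))
    for off in range(length):
        if before[off] != after[off]:
            if start is None:
                start = off          # a new differing run begins here
        elif start is not None:
            ranges.append((start, off))
            start = None
    if start is not None:
        ranges.append((start, length))
    return ranges

# Hypothetical usage with two dump files:
#   with open("before.bin", "rb") as f1, open("after.bin", "rb") as f2:
#       for start, end in differing_ranges(f1.read(), f2.read()):
#           print(f"{start:#010x}-{end:#010x}")
```

This prints nothing about the meaning of each range; the offsets still
have to be matched by hand against the time stamp, sequence number and
CRC32 fields.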

Perhaps some data was still in a "cache" and had not been written to the
disk when the power-off/reboot happened?

One thing that may be worth noting is that the same disk array
configuration with the same system works fine on qemu.
-- 
Francis