Re: Persistent failures with simple md setup

On Wed, 10 Apr 2013 15:44:25 +0200 Hans-Peter Jansen <hpj@xxxxxxxxx> wrote:

> On Thursday, 21 March 2013 14:24:59 NeilBrown wrote:
> > 
> > Thanks! Unfortunately it is not as helpful as I hoped, but it does suggest
> > that "udevadm settle" does sometimes appear to misbehave even if there
> > aren't any problems with the md array.
> > 
> > Could I ask for one more?
> > Prefix the "udevadm settle " command with
> >   strace  -o /tmp/udevadm.trace -s 500
> > 
> > (making sure that you have 'strace' installed) and then post the
> > "/tmp/udevadm.trace" file.  Hopefully that will at least allow me to rule
> > out some possibilities.
> 
> After reducing the timeout to 40 secs, we're able to catch a failure today
> with two degraded arrays: md0 and md1.
> 
> Cheers,
> Pete

Thanks.

There is a similar bug report open at
  https://bugzilla.novell.com/show_bug.cgi?id=793954

I received a broadly similar trace there, and have been going through yours,
those, and the code without getting very far.

The traces show /run/udev/queue.bin getting bigger and bigger, then
shrinking down to '8':

  % zgrep 'fstat64(4,' /tmp/udevadm.trace.gz | sed 's/.*st_size=\([0-9]*\),.*/\1/'

This is expected.  Whenever the queue becomes empty, the file is reset
(normally 'add' and 'remove' records are simply appended).
It is supposed to also be shrunk when it is bigger than 4K and over half of
the file is wasted (due to add/remove pairs) but the calculation of wastage
is broken and I think this never happens.
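
To make the intended behaviour concrete, here is a toy Python model of the
rules described above.  It is only a sketch, not udev's code: the record
sizes are made up, the 8-byte header is an assumption based on the size the
file shrinks to, and QueueFileModel, peak_size and the compaction_works
switch are hypothetical names.  With compaction_works=False it mimics the
suspected bug, where the file only shrinks once the queue is empty.

```python
HEADER_SIZE = 8            # assumed: the '8' the file shrinks to is a fixed header
ADD_RECORD_SIZE = 40       # hypothetical size of an 'add' record
REMOVE_RECORD_SIZE = 8     # hypothetical size of a 'remove' record
COMPACT_THRESHOLD = 4096   # "bigger than 4K"


class QueueFileModel:
    """Toy model of an append-only queue file; sizes are in bytes."""

    def __init__(self, compaction_works=False):
        self.size = HEADER_SIZE
        self.wasted = 0          # bytes taken up by completed add/remove pairs
        self.pending = set()     # events added but not yet removed
        self.compaction_works = compaction_works

    def add(self, seqnum):
        # 'add' records are simply appended.
        self.pending.add(seqnum)
        self.size += ADD_RECORD_SIZE

    def remove(self, seqnum):
        # 'remove' records are appended too; the pair is now dead weight.
        self.pending.discard(seqnum)
        self.size += REMOVE_RECORD_SIZE
        self.wasted += ADD_RECORD_SIZE + REMOVE_RECORD_SIZE
        if not self.pending:
            # Queue is empty: the file is reset.
            self.size = HEADER_SIZE
            self.wasted = 0
        elif (self.compaction_works
              and self.size > COMPACT_THRESHOLD
              and self.wasted * 2 > self.size):
            # Intended shrink: rewrite the file with only the pending events.
            self.size = HEADER_SIZE + len(self.pending) * ADD_RECORD_SIZE
            self.wasted = 0


def peak_size(compaction_works, events=200):
    """Return (peak size, final size) while one event stays pending throughout."""
    q = QueueFileModel(compaction_works)
    q.add(0)                     # long-lived event keeps the queue non-empty
    peak = q.size
    for seq in range(1, events + 1):
        q.add(seq)
        peak = max(peak, q.size)
        q.remove(seq)
        peak = max(peak, q.size)
    q.remove(0)                  # queue finally empties, file is reset
    return peak, q.size
```

Comparing peak_size(False) with peak_size(True) shows how far the file can
grow while a single event stays queued if the shrink step never fires.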

Nevertheless there seems to be a correlation between the times when it fails
and the max size of the queue.bin file.  In the files from the bugzilla, on
the runs where it works, the queue.bin file never gets big.

But that could just be an accident.
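
For anyone who wants to compare peak sizes across several traces, the
zgrep/sed pipeline can be approximated in Python.  This is a rough sketch:
it assumes, as the command above does, that file descriptor 4 is queue.bin
and that strace prints the size as "st_size=N,"; queue_sizes and summarize
are hypothetical helper names.

```python
import gzip
import re

# Same idea as the sed expression: take the digits after st_size= on
# fstat64 calls against file descriptor 4.
SIZE_RE = re.compile(r"fstat64\(4,.*st_size=(\d+),")


def queue_sizes(lines):
    """Return the st_size values seen on fstat64(4, ...) lines, in order."""
    sizes = []
    for line in lines:
        match = SIZE_RE.search(line)
        if match:
            sizes.append(int(match.group(1)))
    return sizes


def summarize(path):
    """Report how many samples a trace has, plus its peak and last size."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as fh:
        sizes = queue_sizes(fh)
    return {
        "samples": len(sizes),
        "peak": max(sizes, default=0),
        "last": sizes[-1] if sizes else None,
    }
```

Running summarize("/tmp/udevadm.trace.gz") on a failing and a working trace
gives the two peak values to compare.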

In any case it is clear that udevadm is working correctly.  Which seems to
suggest that maybe 'udev' is messing with the queue.bin file.  Either that or
I'm misunderstanding some other details and am entirely on the wrong path.

But I've gone over the udev code again and again and I cannot fault it.

So I'm still stumped.

Thanks for your help though.
NeilBrown
