Re: Persistent failures with simple md setup

On Thursday, 21 March 2013 14:24:59 NeilBrown wrote:
> On Mon, 18 Mar 2013 12:20:57 +0100 Hans-Peter Jansen <hpj@xxxxxxxxx> wrote:
> > On Friday, 15 March 2013, 23:43:31 Hans-Peter Jansen wrote:
> > > On Wednesday, 13 March 2013, 11:52:35 you wrote:
> > > > hi,
> > > > 
> > > >  wonder if you could try one more test for me.
> > > >  With the other (echo BEFORE / echo AFTER etc) tracing still there,
> > > >  change the
> > > > 
> > > >    /sbin/udevadm settle --timeout=$MDADM_DEVICE_TIMEOUT
> > > > 
> > > >  to
> > > > 
> > > >    /sbin/udevadm --debug settle --timeout=$MDADM_DEVICE_TIMEOUT > /dev/kmsg 2>&1
> > > > 
> > > >  I found there is a case where "udevadm settle" can exit before the
> > > >  queue is empty.  It seems like a very unlikely scenario, but it seems
> > > >  clear that something "unlikely" is happening.
> > > > 
> > > >  I'm hoping to see
> > > >  
> > > >     timeout waiting for udev queue
> > > >  
> > > >  appear in the logs when this runs.
> > > >  
> > > >  If you could add that and post the 'dmesg' output if you ever get that
> > > >  message - or maybe even if you don't - that would be very helpful.
> > > 
> > > Did that right now and disabled the sleep 10 in front, which had
> > > provisionally "solved" this issue, BTW. Now waiting for another md issue
> > > to occur.
> > 
> > Here we go (not censored in any way, and I hope I got the "window" right).
> > md0 was affected this time.
> 
> Thanks! Unfortunately it is not as helpful as I hoped, but it does suggest
> that "udevadm settle" does sometimes appear to misbehave even if there
> aren't any problems with the md array.
> 
> Could I ask for one more?
> Prefix the "udevadm settle " command with
>   strace  -o /tmp/udevadm.trace -s 500
> 
> (making sure that you have 'strace' installed) and then post the
> "/tmp/udevadm.trace" file.  Hopefully that will at least allow me to rule
> out some possibilities.

After reducing the timeout to 40 secs, we were able to catch a failure today
with two degraded arrays: md0 and md1.
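
For reference, a minimal sketch of the strace-wrapped settle call as requested
above. The function name traced_settle and the fallback value of 40 for
MDADM_DEVICE_TIMEOUT are illustrative assumptions; the surrounding boot script
is not shown in this thread and may differ.

```shell
# Assumed fallback if the variable is not already set by the boot script.
MDADM_DEVICE_TIMEOUT="${MDADM_DEVICE_TIMEOUT:-40}"

# Hypothetical wrapper: runs "udevadm settle" under strace, writing the
# syscall trace to /tmp/udevadm.trace and printing up to 500 bytes of each
# string argument (-s 500), as suggested in the quoted mail.
traced_settle() {
    strace -o /tmp/udevadm.trace -s 500 \
        /sbin/udevadm settle --timeout="$MDADM_DEVICE_TIMEOUT"
}
```

Calling traced_settle in place of the original "udevadm settle" line should
leave the trace in /tmp/udevadm.trace after each run; it requires strace to be
installed.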

Cheers,
Pete

Attachment: udevadm.trace.gz
Description: GNU Zip compressed data

