udev: timeout on WAIT_FOR_SYSFS

Hello,

I'm having problems with a Red Hat 5 based machine whose udev code
predates the patch linked below. Sometimes not all devices have shown
up by the time they are needed.

The devices are on a SAN, behind a QLA2xxx HBA, on Hitachi storage.

Occasionally (not always), I get a message like this one during reboot:

udevd-event[24448]: wait_for_sysfs: waiting for '/sys/devices/pci0000:00/0000:00:06.0/0000:0d:00.1/host1/rport-1:0-0/target1:0:0/1:0:0:4/ioerr_cnt' failed

The target and LUN sometimes change, but the message is otherwise the
same. Veritas VxVM initializes soon afterwards, and whenever this
message has appeared it fails because not all devices are available.
Other times the boot completes as usual and all devices are present.
In a previous e-mail, Kay Sievers suggested I might be missing a
KERNEL=="[0-9]*:[0-9]*" filter for the SCSI udev rule, but I don't
think that's the case, as the issue is intermittent and happens only
once every "x" reboots.
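
To show what such a filter would select, here is a small Python sketch.
It assumes udev compares KERNEL against the device's kernel name using
shell-style glob matching, which Python's fnmatch approximates. The
function name is hypothetical; the sample names are the path components
from the log message above.

```python
from fnmatch import fnmatchcase

# Pattern from the suggested rule filter.
PATTERN = "[0-9]*:[0-9]*"


def matches_scsi_filter(kernel_name):
    """Return True if kernel_name matches the suggested KERNEL glob."""
    return fnmatchcase(kernel_name, PATTERN)


if __name__ == "__main__":
    # Components of the sysfs path from the log message.
    for name in ["host1", "rport-1:0-0", "target1:0:0", "1:0:0:4"]:
        print(name, matches_scsi_filter(name))
```

Under that assumption, only the last component (the host:channel:target:LUN
address) matches, so the filter would restrict the rule to the SCSI
device itself and skip the host, rport and target levels.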

I tried adding a call to "udevsettle" just before initializing VxVM,
but it doesn't help: whenever the problem that triggers that log
message occurs, the devices do not show up even after udevsettle. I
also tried "udevtrigger", but that didn't help either.
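
One way to tell whether the devices are really absent or merely late is
to look at sysfs directly, right before VxVM starts. A minimal Python
sketch, assuming the SCSI devices are listed by their H:C:T:L name under
/sys/bus/scsi/devices; the function name and the expected list are
hypothetical and would have to match the LUNs actually presented to the
host:

```python
import os


def missing_scsi_devices(expected, sysfs_dir="/sys/bus/scsi/devices"):
    """Return the expected device names that have no entry in sysfs_dir."""
    try:
        present = set(os.listdir(sysfs_dir))
    except OSError:
        # Directory unreadable or absent: treat every device as missing.
        present = set()
    return sorted(set(expected) - present)


if __name__ == "__main__":
    # Hypothetical list, for illustration only.
    expected = ["1:0:0:4"]
    missing = missing_scsi_devices(expected)
    if missing:
        print("not yet in sysfs:", ", ".join(missing))
    else:
        print("all expected devices present")
```

Logging the result on every boot would show whether a failed boot is
missing the sysfs entries themselves or only the device nodes that udev
should have created for them.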

I saw this patch, which looked interesting. It's from almost 4 years
ago, but the code it touches resembles the RHEL5 codebase that I'm
running:
http://git.kernel.org/?p=linux/hotplug/udev.git;a=commit;h=39ea7c6c67de69379b603196a0eff6f7ce2e469a

I'm seriously considering applying that patch to udevd, since it will
probably fix the problem. But as I can't reproduce the problem
reliably, I'd like to ask some questions first, to be more confident in
going ahead with that fix.

For instance, the patch's log message is somewhat vague: it says that
some SCSI disks take 6.5s to populate sysfs. Does anyone have details
on which kinds of disks cause that? Is it related to SAN disks? If it
was seen with the qla2xxx driver and/or a Hitachi SAN, even better, as
that would confirm the issue I'm having...

Also, can someone tell me what would cause a device to take a long time
to populate sysfs? Is it related to the load (as in "Load Average") of
the machine at the time the module is loaded? Could it be related to
some heavy scripts being called from udev rules?

And could someone please explain which two events the timeout is
measured between? Is it from the point udev takes the event from a
queue until the device shows up in sysfs? Does it depend on the driver
(in this case qla2xxx) or on the device itself? What can affect that
timing?
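
To make the timing question concrete, a wait of this kind can be
pictured as a polling loop of roughly the following shape. This is a
Python sketch of the general technique, not udev's actual C code; the
function name, the polling rate and the default timeout are assumptions
chosen for illustration.

```python
import os
import time


def wait_for_file(path, timeout_s=3.0, polls_per_second=20):
    """Poll until path exists; return True if it appeared before the timeout."""
    interval = 1.0 / polls_per_second
    attempts = max(1, round(timeout_s * polls_per_second))
    for _ in range(attempts):
        if os.path.exists(path):
            return True
        time.sleep(interval)
    return False


if __name__ == "__main__":
    # Hypothetical attribute path; substitute a real one from the log.
    print(wait_for_file("/sys/bus/scsi/devices/1:0:0:4/ioerr_cnt", timeout_s=1.0))
```

In a loop like this the clock starts when polling begins, not when the
kernel first announced the device, so anything that delays the
attribute relative to that moment eats into the timeout budget.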

Thank you very much in advance!

Filipe