On 12/18/2014 11:04 PM, Benjamin Marzinski wrote:
> On Wed, Dec 17, 2014 at 01:04:54PM +0100, Hannes Reinecke wrote:
>> On 12/16/2014 11:18 PM, Benjamin Marzinski wrote:
>>> On Tue, Dec 16, 2014 at 04:10:44PM -0600, Benjamin Marzinski wrote:
>>>> On Mon, Dec 15, 2014 at 10:31:44AM +0100, Hannes Reinecke wrote:
>> [ .. ]
>>>>> So during bootup it's anyone's guess who's first, multipath or udev.
>>>>> And depending on the timing either multipath will fail to set up
>>>>> the device-mapper device or udev will simply ignore the device.
>>>>> Neither of those is good, but the first is an absolute killer for
>>>>> modern systems which _rely_ on udev to configure devices.
>>>>>
>>>>> So how is this supposed to work?
>>>>> Why does udev ignore the entire event if it can't get the lock?
>>>>> Shouldn't it rather be retried?
>>>>> What is the supposed recovery here?
>>>>
>>>> Hannes, are you against the idea that Alexander mentioned in his first
>>>> email, of just locking a file in /var/lock? Multipathd doesn't create
>>>> devices in parallel. Multipath doesn't create files in parallel. We are
>>>> explicitly trying to avoid multipath and multipathd creating files at
>>>> the same time. So, we should only need a single file to lock, and
>>>> /run/lock should always be there.
>>>
>>> O.k. So if we want to keep our current nonblocking behavior, we'll need
>>> more lockfiles, either one per path or one per wwid. This still seems
>>> like a reasonable idea, if there is a good reason for systemd doing what
>>> it's doing.
>>>
>> The problem is as follows:
>>
>> When multipathd is running we simply _cannot_ guarantee that no udev
>> events are currently running. This currently hits us especially bad
>> during system startup when device probing is still running during
>> multipathd startup.
>> Multipathd will then enumerate all block devices to set up the
>> initial topology.
>> But in doing so it might trip over devices which are still processed
>> by udev (or, worse still, _not yet_ processed by udev).
>> (Yes, I know, libudev_enumerate should protect against this.
>> But it doesn't. )
>
> But we start waiting for events before the initial multipath device
> configuration, and don't process them until after that configuration
> is complete, so if there is ever a case where the initial configuration
> is accessing the device too early, aren't we guaranteed to get an event
> afterwards, assuming that udev doesn't drop it?
>
That was the initial idea. Only it doesn't do it currently :-)

>>
>> So it's anyone's guess what'll happen now; either multipath trips over
>> the lock from udev when calling 'lock_multipath' (and consequently
>> failing to set up the multipath device), or udev
>> tripping over the lock from multipath and ignoring the event,
>> leaving us with a non-functioning device.
>
> But my point above is that if we use a lockfile instead of locking the
> path device itself, there won't be any lock contention, and udev won't
> drop the events.
>
The underlying issue here is:
Why does multipath lock the devices _at all_?
If it were to protect against devices disappearing while doing the
ioctl, that's already proven not to work.
And for protecting against mounts a simple open(O_EXCL) would be
sufficient.

So whom are we fooling here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
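
For reference, the "one lockfile per wwid, nonblocking" idea quoted above
could look roughly like the sketch below. This is not multipath-tools code
(which is C); it is a Python illustration of the flock(2) pattern only. The
function names, the lock directory argument and the "multipath-<wwid>.lock"
file naming are all made up for the example. The point it shows is that the
lock is taken on a separate file, never on the path device node, so it
cannot collide with whatever udev locks on the device itself.

```python
import fcntl
import os


def try_lock_wwid(lock_dir, wwid):
    """Try to take a non-blocking exclusive lock on a per-WWID lockfile.

    Returns an open file descriptor on success, or None if someone else
    currently holds the lock. The file naming scheme is hypothetical.
    """
    path = os.path.join(lock_dir, "multipath-%s.lock" % wwid)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB keeps the current non-blocking behavior: fail at once
        # instead of waiting for the other holder to finish.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def unlock(fd):
    """Release the lock and close the lockfile descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

A caller would do something like fd = try_lock_wwid("/run/lock", wwid), set
up the map if fd is not None, and call unlock(fd) afterwards. Locks on
different wwids do not contend with each other, which is why one file per
wwid (or per path) is needed to stay nonblocking.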