Re: [PATCH] multipathd: avoid unnecessary path read-only reloads

Martin Wilck <martin.wilck@xxxxxxxx> · Tue, 23 Nov 2021 20:03:33 +0000

On Tue, 2021-11-23 at 09:57 -0600, Benjamin Marzinski wrote:
> On Tue, Nov 23, 2021 at 11:05:20AM +0000, Martin Wilck wrote:
> > 
> > I've never looked into the ro attribute processing closely. I just
> > did.
> > I'm unsure how a race would come to pass, in particular with your
> > patch
> > applied:
> > 
> >  1. path change uevent arrives
> >  2. ro attribute of path device has changed
> >  3. map reload occurs if 
> >     a) map was rw before (thus all paths, too) and the path changed
> > to
> > ro
> >     b) map was ro before and all paths have changed to rw
> >  4. kernel will call set_disk_ro() depending on the
> > DM_READONLY_FLAG;
> >     set_disk_ro() triggers an uevent for the block device if and
> > only
> >     if the ro flag changed
> >  5. we set mpp->dmi in __setup_multipath().
> > 
> > We hold the vecs lock between 3 and 5, so even if the uevent
> > arrived
> > before setup_multipath() was called, I don't see how it could race.
> > mpp->dmi as derived in 5 should reflect the state correctly.
> 
> I admit, I didn't find a definitive race. I was just worried about
> the
> possibility of the dmi being outdated. While there's always the
> possibility of the multipath device's RO state getting changed
> outside
> of multipathd (by a multipath call for example), this is not the
> place
> to deal with that. ev_add_map() would be. After looking at this, I'm
> o.k. with trusting the existing dmi, especially if we are updating it
> in
> ev_add_map().
>  
> > What we could do is remember the ro-state of the map in
> > dm_addmap(),
> > e.g. in a mpp->ro field. If map creation with ro=0 succeeded, we
> > can be
> > pretty certain that the map is in read-write state. Otherwise we'd
> > fallback to ro=1, and remember that state, too. We could verify
> > that
> > state once more against the dmi info in setup_multipath(). By doing
> > that we'd cover the time span between the dm ioctl and retrieving
> > the
> > dmi in setup_multipath(). That would IMHO be more consistent than
> > the
> > current use of the temporary force_readonly flag.
> 
> So the idea would be to never try reloading read-write when the map
> is
> marked as RO, until we get a path event updating the RO state? I do
> worry about cases where well fail to reload the map correctly then.
> Imagine that we have a map with mpp->ro=1 with one read-only path.
> The
> read-only path gets removed. If we just assume that the mpp->ro state
> is
> correct until with get a path_event changing the read-only state, we
> will won't reload read/write here.

I didn't think about this possibility. TBH, it sounds pretty
pathological to me.

>   The other option would be to check
> the path's RO state every time we reload, or at least whenever we're
> reloading to remove a path. That has the advantage that it doesn't
> produce a dm error message like a failed reload does, but I'm not
> sure
> if it's any less work. Or am I misunderstanding what you are
> suggesting
> here?

Hm, I guess not. I'm worried that that may slow down reload operations
quite a bit. Under normal circumstances, all paths will be r/w most of
the time (in my experience, read-only multipath storage is pretty
rare). The idea to re-check this only on path removal sounds ok-ish. We
could also track the ro attribute of paths, so that we don't have to
check sysfs every time....

But before things get overly complicated, trying to load r/w first and
checking -EROFS like we do now, and doing the sysfs check when we
receive uevents, is just as good, if not better.

> > I've been wondering whether we should use your logic during map
> > creation, too, and not even try to setup the map with ro=0 if we
> > have
> > paths in read-only state.
> 
> If sysfs says one of the paths is read-only, it seems reasonable to
> skip
> the read/write reload.

The only problem being that we'll have to check all paths in almost
every case.

Martin

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel