On Tue, Nov 23, 2021 at 11:05:20AM +0000, Martin Wilck wrote: > Hi Ben, > > some more thoughts about the ro handling. > > On Mon, 2021-11-22 at 20:39 +0100, Martin Wilck wrote: > > On Mon, 2021-11-22 at 11:43 -0600, Benjamin Marzinski wrote: > > > On Mon, Nov 22, 2021 at 04:48:06PM +0000, Martin Wilck wrote: > > > > On Mon, 2021-11-22 at 09:35 -0600, Benjamin Marzinski wrote: > > > > > On Fri, Nov 19, 2021 at 09:33:39PM +0000, Martin Wilck wrote: > > > > > > On Thu, 2021-11-18 at 16:47 -0600, Benjamin Marzinski wrote: > > > > > > > > > > > > > > +static bool > > > > > > > +needs_ro_update(struct multipath *mpp, int ro) > > > > > > > +{ > > > > > > > + struct pathgroup * pgp; > > > > > > > + struct path * pp; > > > > > > > + unsigned int i, j; > > > > > > > + struct dm_info *dmi = NULL; > > > > > > > + > > > > > > > + if (!mpp || ro < 0) > > > > > > > + return false; > > > > > > > + dm_get_info(mpp->alias, &dmi); > > > > > > > > > > > > Why can't you just use mpp->dmi here? > > > > > > > > > > Since that value is set when the dmi is originally created, I > > > > > didn't > > > > > want to not reload a map, if we simply haven't updated it yet > > > > > to > > > > > reflect > > > > > a change in the read-only value, like with do with > > > > > dm_is_suspended() > > > > > or dm_get_deferred_remove(), etc. I could make a > > > > > dm_get_read_only() > > > > > function and put it libmultipath/devmapper.c like the others, > > > > > if > > > > > you'd > > > > > rather. > > > > > > > > I had expected that this property wouldn't silently change under > > > > us. > > > > Actually, I do think that we should get an uevent if this > > > > happens. > > > > Not sure if we process it properly, though. > > > > > > I think we will update the dmi, > > This would need to be done on a change uevent for the dm device in > ev_add_map(), but AFAICS we don't. ev_add_map() is basically a noop if > the map is already known, unless wait_for_udev is 2. > > > > but I'm not sure that these uevents > > > won't race. The worry was that the device would switch to read-only > > > and > > > then back too quickly, and we would get this event and still see > > > the > > > device in read/write because we haven't processed the event which > > > would > > > update the multipath dmi. > > > > OK. I'm fine with the patch, perhaps explain these subtleties some > > more > > in the commit message for future reference. > > I've never looked into the ro attribute processing closely. I just did. > I'm unsure how a race would come to pass, in particular with your patch > applied: > > 1. path change uevent arrives > 2. ro attribute of path device has changed > 3. map reload occurs if > a) map was rw before (thus all paths, too) and the path changed to > ro > b) map was ro before and all paths have changed to rw > 4. kernel will call set_disk_ro() depending on the DM_READONLY_FLAG; > set_disk_ro() triggers an uevent for the block device if and only > if the ro flag changed > 5. we set mpp->dmi in __setup_multipath(). > > We hold the vecs lock between 3 and 5, so even if the uevent arrived > before setup_multipath() was called, I don't see how it could race. > mpp->dmi as derived in 5 should reflect the state correctly. I admit, I didn't find a definitive race. I was just worried about the possibility of the dmi being outdated. While there's always the possibility of the multipath device's RO state getting changed outside of multipathd (by a multipath call for example), this is not the place to deal with that. ev_add_map() would be. After looking at this, I'm o.k. with trusting the existing dmi, especially if we are updating it in ev_add_map(). > What we could do is remember the ro-state of the map in dm_addmap(), > e.g. in a mpp->ro field. If map creation with ro=0 succeeded, we can be > pretty certain that the map is in read-write state. Otherwise we'd > fallback to ro=1, and remember that state, too. We could verify that > state once more against the dmi info in setup_multipath(). By doing > that we'd cover the time span between the dm ioctl and retrieving the > dmi in setup_multipath(). That would IMHO be more consistent than the > current use of the temporary force_readonly flag. So the idea would be to never try reloading read-write when the map is marked as RO, until we get a path event updating the RO state? I do worry about cases where well fail to reload the map correctly then. Imagine that we have a map with mpp->ro=1 with one read-only path. The read-only path gets removed. If we just assume that the mpp->ro state is correct until with get a path_event changing the read-only state, we will won't reload read/write here. The other option would be to check the path's RO state every time we reload, or at least whenever we're reloading to remove a path. That has the advantage that it doesn't produce a dm error message like a failed reload does, but I'm not sure if it's any less work. Or am I misunderstanding what you are suggesting here? > I've been wondering whether we should use your logic during map > creation, too, and not even try to setup the map with ro=0 if we have > paths in read-only state. If sysfs says one of the paths is read-only, it seems reasonable to skip the read/write reload. -Ben > Regards > Martin > -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel