On Tue, 2021-11-23 at 09:57 -0600, Benjamin Marzinski wrote: > On Tue, Nov 23, 2021 at 11:05:20AM +0000, Martin Wilck wrote: > > > > I've never looked into the ro attribute processing closely. I just > > did. > > I'm unsure how a race would come to pass, in particular with your > > patch > > applied: > > > > 1. path change uevent arrives > > 2. ro attribute of path device has changed > > 3. map reload occurs if > > a) map was rw before (thus all paths, too) and the path changed > > to > > ro > > b) map was ro before and all paths have changed to rw > > 4. kernel will call set_disk_ro() depending on the > > DM_READONLY_FLAG; > > set_disk_ro() triggers an uevent for the block device if and > > only > > if the ro flag changed > > 5. we set mpp->dmi in __setup_multipath(). > > > > We hold the vecs lock between 3 and 5, so even if the uevent > > arrived > > before setup_multipath() was called, I don't see how it could race. > > mpp->dmi as derived in 5 should reflect the state correctly. > > I admit, I didn't find a definitive race. I was just worried about > the > possibility of the dmi being outdated. While there's always the > possibility of the multipath device's RO state getting changed > outside > of multipathd (by a multipath call for example), this is not the > place > to deal with that. ev_add_map() would be. After looking at this, I'm > o.k. with trusting the existing dmi, especially if we are updating it > in > ev_add_map(). > > > What we could do is remember the ro-state of the map in > > dm_addmap(), > > e.g. in a mpp->ro field. If map creation with ro=0 succeeded, we > > can be > > pretty certain that the map is in read-write state. Otherwise we'd > > fallback to ro=1, and remember that state, too. We could verify > > that > > state once more against the dmi info in setup_multipath(). By doing > > that we'd cover the time span between the dm ioctl and retrieving > > the > > dmi in setup_multipath(). That would IMHO be more consistent than > > the > > current use of the temporary force_readonly flag. > > So the idea would be to never try reloading read-write when the map > is > marked as RO, until we get a path event updating the RO state? I do > worry about cases where well fail to reload the map correctly then. > Imagine that we have a map with mpp->ro=1 with one read-only path. > The > read-only path gets removed. If we just assume that the mpp->ro state > is > correct until with get a path_event changing the read-only state, we > will won't reload read/write here. I didn't think about this possibility. TBH, it sounds pretty pathological to me. > The other option would be to check > the path's RO state every time we reload, or at least whenever we're > reloading to remove a path. That has the advantage that it doesn't > produce a dm error message like a failed reload does, but I'm not > sure > if it's any less work. Or am I misunderstanding what you are > suggesting > here? Hm, I guess not. I'm worried that that may slow down reload operations quite a bit. Under normal circumstances, all paths will be r/w most of the time (in my experience, read-only multipath storage is pretty rare). The idea to re-check this only on path removal sounds ok-ish. We could also track the ro attribute of paths, so that we don't have to check sysfs every time.... But before things get overly complicated, trying to load r/w first and checking -EROFS like we do now, and doing the sysfs check when we receive uevents, is just as good, if not better. > > I've been wondering whether we should use your logic during map > > creation, too, and not even try to setup the map with ro=0 if we > > have > > paths in read-only state. > > If sysfs says one of the paths is read-only, it seems reasonable to > skip > the read/write reload. The only problem being that we'll have to check all paths in almost every case. Martin -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel