On Tue, Aug 29, 2023 at 07:47:47AM +1000, Dave Chinner wrote: > On Mon, Aug 28, 2023 at 12:08:22PM -0700, Darrick J. Wong wrote: > > On Thu, Aug 24, 2023 at 09:04:17PM -0700, Darrick J. Wong wrote: > > > On Fri, Aug 25, 2023 at 11:07:56AM +1000, Dave Chinner wrote: > > > > On Thu, Aug 24, 2023 at 04:21:46PM -0700, Darrick J. Wong wrote: > > > > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > > > > > > > Don't allow log recovery to proceed on a readonly mount if the primary > > > > > superblock advertises unknown rocompat bits. We used to allow this, but > > > > > due to a misunderstanding between Dave and Darrick back in 2016, we > > > > > cannot do that anymore. The XFS_SB_FEAT_RO_COMPAT_RMAPBT feature (4.8) > > > > > protects RUI log items, and the REFLINK feature (4.9) protects CUI/BUI > > > > > log items, which is why we can't allow older kernels to recover them. > > > > > > > > Ok, this would work for kernels that don't know waht the > > > > REFLINK/RMAP features are, but upstream kernels will never fail to > > > > recover these items because these are known ROCOMPAT bits. > > > > > > > > The reason this problem exists is that we've accidentally > > > > conflated RO_COMPAT with LOG_INCOMPAT. If RUI/CUI/BUI creation had > > > > of set a log incompat bit whenever they are used (similar to the > > > > new ATTRI stuff setting log incompat bits), then older kernels > > > > would not have allow log recovery even if the reflink/rmap RO_COMPAT > > > > features were set and they didn't understand them. > > > > > > > > However, we can't do that on current kernels because then older > > > > kernels that understand reflink/rmap just fine would see an unknown > > > > log incompat bit and refuse to replay the log. So it comes back to > > > > how we handle unknown ROCOMPAT flags on older kernels, not current > > > > upstream kernels. > > > > > > > > i.e. this patch needs to be backported to kernels that don't know > > > > anything about RMAP/REFLINK to be useful to anyone. i.e. kernels > > > > older than 4.9 that don't know what rmap/reflink are. I suspect > > > > that there are very few supported kernels that old that this might > > > > get backported to. > > > > Seeing as the oldest LTS kernel now is 4.14, I agree with you, let's > > just forget this whole patch and try to remember not to hide LOG > > INCOMPAT features behind RO COMPAT flags again. :) > > > > If we do that, then all we need to do is change xfs_validate_sb_write > > not to complain about unknown rocompat features if the xfs is readonly, > > and remove all the code that temporarily clears the readonly state > > around the log mount calls. > > > > Log recovery then goes back to supporting recovery even in the presence > > of unknown rocompat bits, and patch 3 becomes unnecessary... > > *nod* > > > ...though this below is still true. > > > > > Hmm. The most annoying thing about LOG_INCOMPAT features is that > > > turning them on requires a synchronous write to the primary sb along > > > with a transaction to log the sb that is immediately forced to disk. > > > Every time the log cleans itself it clears the LOG_INCOMPAT features, > > > and then we have to do that /again/. > > > > > > Parent pointers, since they require log intent items to guarantee the > > > dirent and pptr update, cycle the logged xattr LOG_INCOMPAT feature on > > > and off repeatedly. A couple of weeks ago I decided to elide all that > > > LOG_INCOMPAT cycling if parent pointers are enabled, and fstests runtime > > > went from 4.9 hours back down to 4.4. (Parent pointers, for whatever > > > reason, got an INCOMPAT feature bit so it's ok). I was a little > > > surprised that xfs_log_clean ran that much, but there we go. > > Sure, but that's only a problem for operations that are a pure > log format change. With parent pointers, we have an INCOMPAT bit > because we have a new attr filter bit that older kernels will flag > as corruption, and that means we don't need a log incompat bit for > the attr logging. IOWs, if xfs_has_parent_pointers() is true, then > we don't need to set the ATTRI log incompat bit, ever, because the > parent pointer incompat feature bit implies ATTRI log items are in use and all > kernels that understand the PP incompat bit also understand the > ATTRI log items.... Aha, that's why parent pointers got an INCOMPAT flag, thanks for the reminder. > > > A different way to solve the cycling problem could be to start a timer > > > after the last caller drops l_incompat_xattrs and only clear the feature > > > bit after 5 minutes of idleness or unmount, instead of the next time the > > > log cleans itself. > > Well, we only clear it from the xfs_log_worker() if the log needs > covering, so the AIL and iclogs need to be empty before the worker > will clear the incompat bit. That's on a 30s timer already, so > perhaps all we need to do is extend the log covering timeout if > there are incompat log flags set.... Well for non-pptr LARP I don't care since it's a debugging flag, but I suppose for swapext we might want to consider something like that. Though I think I might want to preserve the "30 seconds until log commits" default behavior and not crank that up to 5 minutes. > > > Alternately, we drop this patch and declare an INCOMPAT_RMAPREFLINKV2 > > > feature that wraps up all the other broken bits that we've found since > > > 2016 (overly large log reservations, incorrect units in xattr block > > > reservation calculations, etc.) > > It's tempting, but let's try to put that off until we really need > to.... Ok. --D > -Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx