On Wed, Oct 26, 2022 at 11:46:03AM +1100, Dave Chinner wrote: > On Mon, Oct 24, 2022 at 02:33:37PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > If we're in the middle of a deferred refcount operation and decide to > > roll the transaction to avoid overflowing the transaction space, we need > > to check the new agbno/aglen parameters that we're about to record in > > the new intent. Specifically, we need to check that the new extent is > > completely within the filesystem, and that continuation does not put us > > into a different AG. > > Huh. Why would they not be withing the filesystem or AG, given that > the current transaction should only be operating on an extent within > the current filesystem/AG? Math errors. Callers of xfs_refcount_adjust_extents are supposed to split (and update) any refcount records that cross the start or end of the range that we're adjusting. This guarantees that _adjust_extents will stop at exactly the end of a refcount record. However, if a record in the middle of that range has a blockcount that is longer than *aglen at the point that we encounter the record, we'll increment agbno beyond the range that we were supposed to adjust. This happens either because we fuzzed the refcountbt deliberately, or ... because we actually did write the refcountbt record block but due to bugs on the vm host, the write was silently dropped and memory pressure caused the xfs_buf to get reclaimed and reloaded with stale contents. (That's an argument for making xfs_refcount_adjust_extents check that *aglen never underflows; I'll update the patch. Oh. I already wrote that patch, but forgot to add it to this series. Sigh.) If agbno is now large enough that the segmented xfs_fsblock_t points into a nonexistent AG, bad things will happen. > IIUC, this is code intended to catch the sort of refcount irec > startblock corruption that the series fixes, right? If so, shouldn't it be > first in the patch series, not last? <shrug> Based on reviewer feedback over the last few years I got in the habit of putting the actual bug fixes at the front of the series. Patches adding more layers of checking so that we can die gracefully ended up after that. --D > -Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx