On Thu, Dec 12, 2024 at 10:11:02PM -0800, Christoph Hellwig wrote:
> On Thu, Dec 12, 2024 at 05:00:35PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> >
> > Create a new space reservation scheme so that btree metadata for the
> > realtime volume can reserve space in the data device to avoid space
> > underruns.
>
> Can you explain this scheme a bit more here?

Back when we were testing the rmap and refcount btrees for the data
device, Dave and I observed occasional shutdowns when xfs_btree_split
would be called for either of those two btrees.  This happened when
certain operations (mostly writeback ioends) created new rmap or
refcount records, which would expand the size of the btree.  If there
were no free blocks available the allocation would fail and the split
would shut down the filesystem.

We thought about pre-reserving blocks for btree expansion at the time of
a write() call, but there wasn't any good way to attach the blocks to
the inode and keep them there all the way to ioend processing.  Unlike
delalloc reservations which have that indlen mechanism, there's no way
to do that for mapped extents; and indlen blocks are given back during
the delalloc -> unwritten transition.

Therefore, we chose to reserve sufficient blocks for rmap/refcount btree
expansion at mount time.  This is what the XFS_AG_RESV_* flags provide;
any expansion of those two btrees can come from the pre-reserved space.

This patch brings that pre-reservation ability to inode-rooted btrees so
that the rt rmap and refcount btrees can also save room for future
expansion.

How about I put a somewhat massaged version of this into the commit log?

--D
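
For illustration only, the mount-time reservation idea described above can be sketched as a toy model in userspace C. This is not the kernel code: struct toy_fs and the toy_* functions are hypothetical names invented for this sketch, and falling back to the free pool once the reservation is used up is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in XFS. */
struct toy_fs {
	uint64_t free_blocks;	/* blocks available to ordinary allocations */
	uint64_t resv_blocks;	/* blocks set aside at mount for btree growth */
};

/* At "mount" time, move blocks out of the free pool into the reservation. */
static bool toy_mount_reserve(struct toy_fs *fs, uint64_t want)
{
	if (fs->free_blocks < want)
		return false;
	fs->free_blocks -= want;
	fs->resv_blocks += want;
	return true;
}

/* Ordinary data allocations may only consume the free pool. */
static bool toy_alloc_data(struct toy_fs *fs, uint64_t nblocks)
{
	if (fs->free_blocks < nblocks)
		return false;
	fs->free_blocks -= nblocks;
	return true;
}

/*
 * A btree split needs one new block.  Take it from the reservation
 * first; fall back to the free pool (an assumption of this toy model).
 * A false return stands in for the failed allocation that would shut
 * down the filesystem.
 */
static bool toy_alloc_btree(struct toy_fs *fs)
{
	if (fs->resv_blocks > 0) {
		fs->resv_blocks--;
		return true;
	}
	if (fs->free_blocks > 0) {
		fs->free_blocks--;
		return true;
	}
	return false;
}
```

In this model, a 100-block filesystem that reserves 4 blocks at mount can hand out at most 96 blocks to data, so a later btree split still finds a block even when the free pool is empty. Without the reservation step, the same sequence would leave the split with nothing to allocate.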