On Tue, Nov 13, 2018 at 11:28:59AM -0800, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@xxxxxx>
>
> The realtime summary is a two-dimensional array on disk, effectively:
>
> u32 rsum[log2(number of realtime extents) + 1][number of blocks in the bitmap]
>
> rsum[log][bbno] is the number of extents of size 2**log which start in
> bitmap block bbno.
>
> xfs_rtallocate_extent_near() uses xfs_rtany_summary() to check whether
> rsum[log][bbno] != 0 for any log level. However, the summary array is
> stored in row-major order (i.e., like an array in C), so all of these
> entries are not adjacent, but rather spread across the entire summary
> file. In the worst case (a full bitmap block), xfs_rtany_summary() has
> to check every level.
>
> This means that on a moderately-used realtime device, an allocation will
> waste a lot of time finding, reading, and releasing buffers for the
> realtime summary. In particular, one of our storage services (which runs
> on servers with 8 very slow CPUs and 15 8 TB XFS realtime filesystems)
> spends almost 5% of its CPU cycles in xfs_rtbuf_get() and
> xfs_trans_brelse() called from xfs_rtany_summary().
>
> One solution would be to also store the summary with the dimensions
> swapped. However, this would require a disk format change to a very old
> component of XFS.
>
> Instead, we can cache the minimum size which contains any extents. We do
> so lazily; rather than guaranteeing that the cache contains the precise
> minimum, it always contains a loose lower bound which we tighten when we
> read or update a summary block. This only uses a few kilobytes of memory
> and is already serialized via the realtime bitmap and summary inode
> locks, so the cost is minimal. With this change, the same workload only
> spends 0.2% of its CPU cycles in the realtime allocator.
>
> Signed-off-by: Omar Sandoval <osandov@xxxxxx>
> ---
> Based on Linus' master branch.
>
> Changes from v2:
> - Allow the cache allocation to fail, in which case we just don't use it
>
> Changes from v1:
> - Clarify comment in xfs_rtmount_inodes().
> - Use kmem_* instead of kvmalloc/kvfree
>
>  fs/xfs/libxfs/xfs_rtbitmap.c |  6 ++++++
>  fs/xfs/xfs_mount.h           |  7 +++++++
>  fs/xfs/xfs_rtalloc.c         | 25 +++++++++++++++++++++----
>  3 files changed, 34 insertions(+), 4 deletions(-)

Ping.
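
For anyone skimming the thread, the row-major layout and the lazy lower-bound cache described in the commit message can be modeled in userspace roughly as below. This is only a sketch under stated assumptions, not the code from the patch: struct rsum_model, rsum_index(), rsum_modify() and rsum_any() are made-up names, the summary lives in plain memory rather than in buffers, and the model assumes at most 255 levels so the bound fits in a u8.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical in-memory model; none of these names come from the patch. */
struct rsum_model {
	unsigned int levels;	/* log2(number of realtime extents) + 1 */
	unsigned int nbblocks;	/* number of blocks in the bitmap */
	uint32_t *rsum;		/* levels * nbblocks counters, row-major */
	uint8_t *min_log;	/* per bbno: no extents at levels below this */
};

/* Returns 0 on success. A failed cache allocation is tolerated. */
static int rsum_model_init(struct rsum_model *m, unsigned int levels,
			   unsigned int nbblocks)
{
	assert(levels <= UINT8_MAX);	/* sketch-only assumption */
	m->levels = levels;
	m->nbblocks = nbblocks;
	m->rsum = calloc((size_t)levels * nbblocks, sizeof(*m->rsum));
	if (!m->rsum)
		return -1;
	/* Zero is always a valid (loosest possible) lower bound. */
	m->min_log = calloc(nbblocks, sizeof(*m->min_log));
	return 0;
}

/* Row-major: entries for one bbno are nbblocks counters apart. */
static size_t rsum_index(const struct rsum_model *m, unsigned int log,
			 unsigned int bbno)
{
	return (size_t)log * m->nbblocks + bbno;
}

/* Adjust rsum[log][bbno] by delta and keep the lower bound valid. */
static void rsum_modify(struct rsum_model *m, unsigned int log,
			unsigned int bbno, int32_t delta)
{
	uint32_t *sp = &m->rsum[rsum_index(m, log, bbno)];

	*sp += (uint32_t)delta;
	if (!m->min_log)
		return;
	if (*sp == 0 && log == m->min_log[bbno])
		m->min_log[bbno]++;		/* this level just emptied */
	else if (*sp != 0 && log < m->min_log[bbno])
		m->min_log[bbno] = (uint8_t)log;	/* new smaller level */
}

/* Is rsum[log][bbno] != 0 for any log in [low, high]? */
static int rsum_any(struct rsum_model *m, unsigned int low,
		    unsigned int high, unsigned int bbno)
{
	unsigned int log = low;
	int from_bound = 0;

	/* Skip levels the cache already knows to be empty. */
	if (m->min_log && low <= m->min_log[bbno]) {
		log = m->min_log[bbno];
		from_bound = 1;
	}
	for (; log <= high; log++)
		if (m->rsum[rsum_index(m, log, bbno)])
			break;
	/* Every level in [old bound, log) was empty, so tighten the bound. */
	if (from_bound && log > m->min_log[bbno])
		m->min_log[bbno] = (uint8_t)log;
	return log <= high;
}
```

The bound is only tightened when the scan started at the cached value; if the caller's low is above it, the levels in between were never looked at, so the sketch leaves the cache alone. With the cache missing (min_log == NULL) every call simply falls back to the full scan.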