On Wed, Aug 07, 2024 at 06:27:21AM +0200, Mateusz Guzik wrote: > I'm looking at false-sharing problems concerning multicore open + read + > close cycle on one inode and during my survey I found xfs is heavily > serializing on a spinlock in xfs_release, making it perform the worst > out of the btrfs/ext4/xfs trio. > > A trivial test case plopped into will-it-scale is at the end. > > bpftrace -e 'kprobe:__pv_queued_spin_lock_slowpath { @[kstack()] = count(); }' tells me: > [snip] > @[ > __pv_queued_spin_lock_slowpath+5 > _raw_spin_lock_irqsave+49 > rwsem_wake.isra.0+57 > up_write+69 > xfs_iunlock+244 > xfs_release+175 > __fput+238 > __x64_sys_close+60 > do_syscall_64+82 > entry_SYSCALL_64_after_hwframe+118 > ]: 41132 > @[ > __pv_queued_spin_lock_slowpath+5 > _raw_spin_lock_irq+42 > rwsem_down_read_slowpath+164 > down_read+72 > xfs_ilock+125 > xfs_file_buffered_read+71 > xfs_file_read_iter+115 > vfs_read+604 > ksys_read+103 > do_syscall_64+82 > entry_SYSCALL_64_after_hwframe+118 > ]: 137639 > @[ > __pv_queued_spin_lock_slowpath+5 > _raw_spin_lock+41 > xfs_release+196 > __fput+238 > __x64_sys_close+60 > do_syscall_64+82 > entry_SYSCALL_64_after_hwframe+118 > ]: 1432766 > > The xfs_release -> _raw_spin_lock thing is the XFS_ITRUNCATED flag test. Yeah, these all ring old bells in the back of my skull. > > test case (plop into will-it-scale, say tests/openreadclose3.c and run > ./openreadclose3_processes -t 24): > > #include <stdlib.h> > #include <unistd.h> > #include <sys/types.h> > #include <sys/stat.h> > #include <fcntl.h> > #include <assert.h> > > #define BUFSIZE 1024 > > static char tmpfile[] = "/tmp/willitscale.XXXXXX"; > > char *testcase_description = "Same file open/read/close"; > > void testcase_prepare(unsigned long nr_tasks) > { > char buf[BUFSIZE]; > int fd = mkstemp(tmpfile); > > assert(fd >= 0); > memset(buf, 'A', sizeof(buf)); > assert(write(fd, buf, sizeof(buf)) == sizeof(buf)); > close(fd); > } > > void testcase(unsigned long long *iterations, unsigned long nr) > { > char buf[BUFSIZE]; > > while (1) { > int fd = open(tmpfile, O_RDONLY); > assert(fd >= 0); > assert(read(fd, buf, sizeof(buf)) == sizeof(buf)); > close(fd); Oh, yeah, I defintely sent patches once upon a time to address this. <scrummage around old patch stacks> Yep, there it is: https://lore.kernel.org/linux-xfs/20190207050813.24271-4-david@xxxxxxxxxxxxx/ This would completely remove the rwsem traffic from O_RDONLY file closes. None of it would address the XFS_ITRUNCATED contention issue, but that's just another of those "test, test-and-clear" cases that avoid the atomic ops by testing if the bit is set without the lock first.... Hmmm, I thought I saw these patches go past on the list again recently. Yeah, that was a coupl eof months ago: https://lore.kernel.org/linux-xfs/20240623053532.857496-1-hch@xxxxxx/ Christoph, any progress on merging that patchset? -Dave. -- Dave Chinner david@xxxxxxxxxxxxx