Pierre Labat <plabat@xxxxxxxxxx> writes:

> Hi Jeff and Jens,
>
> About "FAN_MODIFY fsnotify watch set on /dev".
>
> I was using the Fedora 34 distro (with a 6.3.9 kernel) and fio, without
> any particular/specific settings.
> I tried to see what could be watching /dev, but failed at that.
> I used the inotify-info tool, but that displays watchers using the
> inotify interface, and nothing was watching /dev via inotify.
> I need to figure out how to do the same for the fanotify interface.
> I'll look at it again and let you know.

You wouldn't happen to be running pipewire, would you?

https://gitlab.freedesktop.org/pipewire/pipewire/-/commit/88f0dbd6fcd0a412fc4bece22afdc3ba0151e4cf

-Jeff

>
> Regards,
>
> Pierre
>
>> -----Original Message-----
>> From: Jens Axboe <axboe@xxxxxxxxx>
>> Sent: Tuesday, August 8, 2023 2:41 PM
>> To: Jeff Moyer <jmoyer@xxxxxxxxxx>; Pierre Labat <plabat@xxxxxxxxxx>
>> Cc: 'io-uring@xxxxxxxxxxxxxxx' <io-uring@xxxxxxxxxxxxxxx>
>> Subject: [EXT] Re: FYI, fsnotify contention with aio and io_uring.
>>
>> On 8/7/23 2:11 PM, Jeff Moyer wrote:
>> > Hi, Pierre,
>> >
>> > Pierre Labat <plabat@xxxxxxxxxx> writes:
>> >
>> >> Hi,
>> >>
>> >> This is FYI; maybe you already know about this, but in case you
>> >> don't....
>> >>
>> >> I was pushing the limit on the number of NVMe read IOPS that fio
>> >> plus the Linux OS can handle. For that, I have something special
>> >> under the Linux nvme driver, so I am not limited by whatever the
>> >> NVMe SSD's max IOPS or IO latency would be.
>> >>
>> >> As I cranked up the number of system cores and fio jobs doing
>> >> direct 4k random reads on /dev/nvme0n1, I hit a wall. The IOPS
>> >> scaling becomes less than linear, and at around 15 fio jobs on 15
>> >> core threads the overall IOPS actually goes down as I add more fio
>> >> jobs. For example, on a system with 24 cores/48 threads, once I go
>> >> beyond 15 fio jobs the overall IOPS starts to drop.
>> >>
>> >> The same happens with both io_uring and aio. I was using kernel
>> >> version 6.3.9 and a single namespace (/dev/nvme0n1).
>> >
>> > [snip]
>> >
>> >> As you can see, 76% of the CPU on the box is sucked up by
>> >> lockref_get_not_zero() and lockref_put_return(). Looking at the
>> >> code, there is contention when io_uring calls fsnotify_access().
>> >
>> > Is there a FAN_MODIFY fsnotify watch set on /dev? If so, it might be
>> > a good idea to find out what set it and why.
>>
>> This would be my guess too; some distros do seem to do that. The
>> notification bits scale horribly; nobody should use them for anything
>> high performance...
>>
>> --
>> Jens Axboe
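
FWIW, a rough fanotify counterpart to inotify-info is to scan
/proc/<pid>/fdinfo: the kernel exposes each fanotify fd's marks there as
"fanotify ..." lines. Below is a minimal, untested Python sketch along those
lines (not from the thread; it assumes the standard fdinfo format and
typically needs root to inspect other users' processes):

#!/usr/bin/env python3
# Rough fanotify counterpart to inotify-info: report processes holding
# fanotify file descriptors by scanning /proc/<pid>/fdinfo.  Assumes the
# usual Linux behaviour that fanotify fds expose their marks there as
# lines starting with "fanotify" (mask, ignored_mask, mnt_id for mount
# marks).  Typically needs root to see other users' processes.
import os

for pid in filter(str.isdigit, os.listdir("/proc")):
    fdinfo_dir = f"/proc/{pid}/fdinfo"
    try:
        fds = os.listdir(fdinfo_dir)
    except OSError:
        continue  # process exited or permission denied
    for fd in fds:
        try:
            with open(f"{fdinfo_dir}/{fd}") as f:
                info = f.read()
        except OSError:
            continue
        marks = [l for l in info.splitlines() if l.startswith("fanotify")]
        if not marks:
            continue
        try:
            with open(f"/proc/{pid}/comm") as f:
                comm = f.read().strip()
        except OSError:
            comm = "?"
        print(f"pid {pid} ({comm}) fd {fd}:")
        for mark in marks:
            print("   ", mark)

A FAN_MODIFY/FAN_ACCESS mount mark covering /dev would show up as a mark
line with an mnt_id matching the /dev mount, which would point at whatever
process (pipewire or otherwise) set the watch.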