RE: [EXT] Re: FYI, fsnotify contention with aio and io_uring.

Hi,

Had some time to re-do some testing.

1) PipeWire (its WirePlumber daemon) sets a watch on the children of the directory /dev via inotify.
I removed that watch (by disabling PipeWire), but still saw the fsnotify overhead when using aio/io_uring at high IOPS across several threads on several cores.
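
As a rough sketch of how such a watch can be located: the kernel lists inotify watches in /proc/<pid>/fdinfo/<fd>, one "inotify wd:... ino:... sdev:... mask:..." line per watch, with the values in hex. The helper names below (parse_inotify_fdinfo, watchers_of) are made up for illustration, and matching on the inode number alone is a simplifying assumption, since inode numbers are only unique within one filesystem.

```python
import os
import re

# Event bits from <sys/inotify.h>; only the two relevant to read/write I/O.
IN_ACCESS = 0x001
IN_MODIFY = 0x002

# Example fdinfo line (all numbers are hex):
#   inotify wd:1 ino:2a sdev:5 mask:fce ignored_mask:0 fhandle-bytes:8 ...
_WATCH_RE = re.compile(
    r"^inotify wd:(?P<wd>[0-9a-f]+) ino:(?P<ino>[0-9a-f]+) "
    r"sdev:(?P<sdev>[0-9a-f]+) mask:(?P<mask>[0-9a-f]+)"
)


def parse_inotify_fdinfo(text):
    """Return one dict (ino, mask) per inotify watch found in fdinfo text."""
    watches = []
    for line in text.splitlines():
        m = _WATCH_RE.match(line)
        if m:
            watches.append({
                "ino": int(m.group("ino"), 16),
                "mask": int(m.group("mask"), 16),
            })
    return watches


def watchers_of(path):
    """Return (pid, mask) for every inotify watch whose inode matches path.

    Assumption: only the inode number is compared, so a file with the same
    inode number on another filesystem would also be reported.
    """
    target_ino = os.stat(path).st_ino
    hits = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fdinfo_dir = f"/proc/{pid}/fdinfo"
        try:
            fds = os.listdir(fdinfo_dir)
        except OSError:
            continue  # process exited, or no permission to inspect it
        for fd in fds:
            try:
                with open(f"{fdinfo_dir}/{fd}") as f:
                    text = f.read()
            except OSError:
                continue
            for w in parse_inotify_fdinfo(text):
                if w["ino"] == target_ino:
                    hits.append((int(pid), w["mask"]))
    return hits
```

Calling watchers_of("/dev/nvme0n1") as root would then list the processes watching that node, and testing the returned mask against IN_ACCESS or IN_MODIFY shows whether read/write events were requested.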

2) I then noticed that udev sets a watch (via inotify) on the files in /dev.
This is due to a rule in /usr/lib/udev/rules.d/60-block.rules:
# watch metadata changes, caused by tools closing the device node which was opened for writing
ACTION!="remove", SUBSYSTEM=="block", \
  KERNEL=="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|ubi*|scm*|pmem*|nbd*|zd*", \
  OPTIONS+="watch"
I removed "nvme*" from this rule (I am testing on /dev/nvme0n1), and the fsnotify overhead finally disappeared.
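
The rule edit described above can be sketched as a small text transformation. The function name drop_kernel_pattern is hypothetical. Writing the result to /etc/udev/rules.d/60-block.rules (a same-named file there takes precedence over the one in /usr/lib/udev/rules.d) rather than editing the packaged file is an assumption about deployment, not something stated in this message.

```python
# The rule as quoted above, kept verbatim (raw string preserves the
# trailing backslashes that continue the rule across lines).
RULE = r'''ACTION!="remove", SUBSYSTEM=="block", \
  KERNEL=="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|ubi*|scm*|pmem*|nbd*|zd*", \
  OPTIONS+="watch"'''


def drop_kernel_pattern(rule_text, pattern):
    """Return rule_text with `pattern` removed from each KERNEL=="a|b|c" match."""
    prefix = 'KERNEL=="'
    out = []
    for line in rule_text.splitlines():
        start = line.find(prefix)
        if start != -1:
            start += len(prefix)
            end = line.index('"', start)  # closing quote of the match value
            kept = [p for p in line[start:end].split("|") if p != pattern]
            line = line[:start] + "|".join(kept) + line[end:]
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Print the rule without the nvme* pattern; nothing is written to disk.
    print(drop_kernel_pattern(RULE, "nvme*"))
```

After installing a modified rule, udev has to re-read its rules (for example with "udevadm control --reload") and the device has to be re-processed before the watch goes away.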

3) I think there is nothing wrong with PipeWire and udev; they simply want to watch what is going on in /dev.
I don't think they are interested in counting millions of read/write accesses per second to a file they watch, and that is not the goal/charter of fsnotify either. There are other tools that are optimized for that task.

To avoid the overhead, I think the fsnotify subsystem should be refined to coalesce ("factor") high-frequency read/write file accesses.
Or the code that issues high-frequency fsnotify calls (like aio/io_uring) should do the coalescing itself.
Or the user should be given a way to turn off fsnotify calls for read/write on a specific file.
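
To make the coalescing idea concrete, here is a minimal user-space illustration, not kernel code and not an existing fsnotify interface. The class name CoalescingNotifier and its parameters are invented for this sketch: at most one notification per key is delivered per interval, and the number of suppressed events is reported with the next delivery.

```python
import time


class CoalescingNotifier:
    """Deliver at most one event per key every `min_interval_s` seconds."""

    def __init__(self, deliver, min_interval_s, clock=time.monotonic):
        self._deliver = deliver            # callback(key, suppressed_count)
        self._min_interval = min_interval_s
        self._clock = clock                # injectable for testing
        self._last_sent = {}               # key -> time of last delivery
        self._suppressed = {}              # key -> events dropped since then

    def notify(self, key):
        """Record an access; return True if a notification was delivered."""
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._min_interval:
            # Too soon after the previous delivery: count it and stay quiet.
            self._suppressed[key] = self._suppressed.get(key, 0) + 1
            return False
        self._last_sent[key] = now
        self._deliver(key, self._suppressed.pop(key, 0))
        return True


if __name__ == "__main__":
    n = CoalescingNotifier(lambda k, c: print(k, "suppressed:", c), 0.1)
    for _ in range(1000):
        n.notify("/dev/nvme0n1")
```

A shared dictionary like this would itself become a contention point across many cores, so a real implementation would need per-CPU or per-thread state; the sketch only shows the rate-limiting logic.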


Right now, the only way to work around the CPU overhead without hacking is to disable the services watching /dev.
That means people can't use these services anymore, which doesn't seem right.

Regards,

Pierre


> -----Original Message-----
> From: Pierre Labat
> Sent: Monday, August 14, 2023 9:31 AM
> To: Jeff Moyer <jmoyer@xxxxxxxxxx>
> Cc: Jens Axboe <axboe@xxxxxxxxx>; 'io-uring@xxxxxxxxxxxxxxx' <io-uring@xxxxxxxxxxxxxxx>
> Subject: RE: [EXT] Re: FYI, fsnotify contention with aio and io_uring.
> 
> Hi Jeff,
> 
> Indeed, by default, in my configuration, pipewire is running.
> When I can re-test, I'll disabled it and see if that remove the problem.
> Thanks for the hint!
> 
> Pierre
> 
> > -----Original Message-----
> > From: Jeff Moyer <jmoyer@xxxxxxxxxx>
> > Sent: Wednesday, August 9, 2023 10:15 AM
> > To: Pierre Labat <plabat@xxxxxxxxxx>
> > Cc: Jens Axboe <axboe@xxxxxxxxx>; 'io-uring@xxxxxxxxxxxxxxx' <io-uring@xxxxxxxxxxxxxxx>
> > Subject: Re: [EXT] Re: FYI, fsnotify contention with aio and io_uring.
> >
> > Pierre Labat <plabat@xxxxxxxxxx> writes:
> >
> > > Micron Confidential
> > >
> > > Hi Jeff and Jens,
> > >
> > > About "FAN_MODIFY fsnotify watch set on /dev".
> > >
> > > Was using Fedora34 distro (with 6.3.9 kernel), and fio. Without any
> > > particular/specific setting.
> > > I tried to see what could watch /dev but failed at that.
> > > I used the inotify-info tool, but that display watchers using the
> > > inotify interface. And nothing was watching /dev via inotify.
> > > Need to figure out how to do the same but for the fanotify interface.
> > > I'll look at it again and let you know.
> >
> > You wouldn't happen to be running pipewire, would you?
> >
> > https://gitlab.freedesktop.org/pipewire/pipewire/-/commit/88f0dbd6fcd0a412fc4bece22afdc3ba0151e4cf
> >
> > -Jeff
> >
> > >
> > > Regards,
> > >
> > > Pierre
> > >
> > >
> > >
> > > Micron Confidential
> > >> -----Original Message-----
> > >> From: Jens Axboe <axboe@xxxxxxxxx>
> > >> Sent: Tuesday, August 8, 2023 2:41 PM
> > >> To: Jeff Moyer <jmoyer@xxxxxxxxxx>; Pierre Labat
> > >> <plabat@xxxxxxxxxx>
> > >> Cc: 'io-uring@xxxxxxxxxxxxxxx' <io-uring@xxxxxxxxxxxxxxx>
> > >> Subject: [EXT] Re: FYI, fsnotify contention with aio and io_uring.
> > >>
> > >> On 8/7/23 2:11 PM, Jeff Moyer wrote:
> > >> > Hi, Pierre,
> > >> >
> > >> > Pierre Labat <plabat@xxxxxxxxxx> writes:
> > >> >
> > >> >> Hi,
> > >> >>
> > >> >> This is FYI, may be you already knows about that, but in case
> > >> >> you don't....
> > >> >>
> > >> >> I was pushing the limit of the number of nvme read IOPS, the FIO
> > >> >> + the Linux OS can handle. For that, I have something special
> > >> >> under the Linux nvme driver. As a consequence I am not limited
> > >> >> by whatever the NVME SSD max IOPS or IO latency would be.
> > >> >>
> > >> >> As I cranked the number of system cores and FIO jobs doing
> > >> >> direct 4k random read on /dev/nvme0n1, I hit a wall. The IOPS
> > >> >> scaling slows (less than linear) and around 15 FIO jobs on 15
> > >> >> core threads, the overall IOPS, in fact, goes down as I add more
> > >> >> FIO jobs. For example on a system with 24 cores/48 threads, when
> > >> >> I goes beyond 15 FIO jobs, the overall IOPS starts to go down.
> > >> >>
> > >> >> This happens the same for io_uring and aio. Was using kernel
> > >> >> version 6.3.9. Using one namespace (/dev/nvme0n1).
> > >> >
> > >> > [snip]
> > >> >
> > >> >> As you can see 76% of the cpu on the box is sucked up by
> > >> >> lockref_get_not_zero() and lockref_put_return().  Looking at the
> > >> >> code, there is contention when IO_uring call fsnotify_access().
> > >> >
> > >> > Is there a FAN_MODIFY fsnotify watch set on /dev?  If so, it
> > >> > might be a good idea to find out what set it and why.
> > >>
> > >> This would be my guess too, some distros do seem to do that. The
> > >> notification bits scale horribly, nobody should use it for anything
> > >> high performance...
> > >>
> > >> --
> > >> Jens Axboe




