On 5/28/24 10:21 PM, Peter-Jan Gootzen wrote:
> On Tue, 2024-05-28 at 21:35 +0800, Jingbo Xu wrote:
>> On 5/28/24 7:59 PM, Miklos Szeredi wrote:
>>> On Tue, 28 May 2024 at 10:52, Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx>
>>> wrote:
>>>>
>>>> Hi Peter-Jan,
>>>>
>>>> Thanks for the amazing work.
>>>>
>>>> I'd just like to know if you have any plan of making fiq and
>>>> fiq->lock more scalable, e.g. making fiq a per-CPU software queue?
>>>
>>> Doing a per-CPU queue is not necessarily a good idea: async requests
>>> could overload one queue while others are idle.
>>
>> That is true in FUSE scenarios (as opposed to virtiofs).  There is no
>> 1:1 mapping between CPUs and daemon threads.  All requests submitted
>> from all CPUs are enqueued on one global pending list, and all daemon
>> threads fetch requests to be processed from that global pending list.
> Virtiofsd also works in this thread-pool manner, with a global pending
> list and many daemon threads fetching from said list.
>
>>
>>>
>>> One idea is to allow a request to go through a per-CPU fast path if
>>> the respective listener is idle.  Otherwise the request would enter
>>> the default slow queue, where idle listeners would pick up requests
>>> (like they do now).
>>
>> I guess "listener" refers to one thread of the FUSE daemon.
>>
>> When it comes to virtiofs scenarios, there is a 1:1 mapping between
>> CPUs and hardware queues.  Requests are only routed to the hardware
>> queue to which the submitting CPU maps, so there is no point in still
>> managing pending requests on one global pending list.  In this case a
>> per-CPU pending list can, at least in theory, reduce the lock
>> contention on the global pending list.
>>
>> I think we have an internal RFC on a per-CPU pending list for
>> virtiofs.  Let me see whether it really improves performance in tests.
> I am a bit confused about the software layers we are talking about now.
> We have:
> - FUSE driver (so the kernel FUSE fs driver)
> - virtio-fs driver
> - virtiofsd as a software virtio-fs device
> - Hardware virtio-fs device implementations (e.g. BlueField)
> - libfuse (connecting /dev/fuse and a FS implementation)
> - File systems that build upon: libfuse, virtiofsd or some hardware
>   virtio-fs implementation
>
> So you have a patch at hand that fixes this fiq and fiq->lock
> bottleneck in the FUSE driver, correct?  In the virtio-fs driver there
> is no pending list; it just pops from the pending list of the FUSE
> driver.

I mean the fiq->pending list (struct fuse_iqueue) when I refer to the
software pending list:

__fuse_request_send
    spin_lock(&fiq->lock)   <-- lock contention if multiple CPUs submit IOs
    queue_request_and_unlock
        # enqueue request to fiq->pending list
        fiq->ops->wake_pending_and_unlock(), i.e.
        virtio_fs_wake_pending_and_unlock() for virtiofs
            # fetch one request from fiq->pending list
            spin_unlock(&fiq->lock)

Both the regular /dev/fuse interface and virtiofs use "struct
fuse_iqueue" as the pending list.

-- 
Thanks,
Jingbo
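
For illustration, here is a minimal sketch of the per-CPU pending-list
idea discussed above; it is not the internal RFC and not actual kernel
code.  The names fuse_iqueue_pcpu, fuse_iqueue_sketch and
fuse_pcpu_enqueue are hypothetical.  The only point it shows is that
each submitting CPU enqueues onto its own list/lock pair instead of
contending on the single fiq->lock:

/*
 * Hypothetical sketch of a per-CPU pending list.  Instead of one global
 * fiq->pending protected by fiq->lock, each CPU gets its own list and
 * spinlock, so concurrent submitters do not contend on one global lock.
 */
#include <linux/list.h>
#include <linux/percpu.h>
#include <linux/spinlock.h>

struct fuse_iqueue_pcpu {
	spinlock_t		lock;		/* protects this CPU's list only */
	struct list_head	pending;	/* per-CPU pending requests */
};

struct fuse_iqueue_sketch {
	struct fuse_iqueue_pcpu __percpu *queues;
};

/* Enqueue a request entry on the submitting CPU's local pending list. */
static void fuse_pcpu_enqueue(struct fuse_iqueue_sketch *fiq,
			      struct list_head *req_entry)
{
	/* get_cpu_ptr() disables preemption and returns this CPU's queue. */
	struct fuse_iqueue_pcpu *q = get_cpu_ptr(fiq->queues);

	spin_lock(&q->lock);
	list_add_tail(req_entry, &q->pending);
	spin_unlock(&q->lock);

	/*
	 * For virtiofs, this is also where the virtqueue mapped to the
	 * submitting CPU would be kicked (cf. wake_pending_and_unlock).
	 */
	put_cpu_ptr(fiq->queues);
}

A real design would still have to address Miklos's concern above: with
purely per-CPU queues, async requests could pile up on one queue while
listeners on other queues sit idle, so some fallback (a shared slow
queue or work stealing) would likely be needed for the /dev/fuse case.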