Re: fuse uring / wake_up on the same core

Bernd Schubert <bschubert@xxxxxxx> · Wed, 26 Apr 2023 22:40:32 +0000

On 3/27/23 12:28, Peter Zijlstra wrote:
> On Fri, Mar 24, 2023 at 07:50:12PM +0000, Bernd Schubert wrote:
> 
>> With the fuse-uring patches that part is basically solved - the waitq
>> that that thread is about is not used anymore. But as per above,
>> remaining is the waitq of the incoming workq (not mentioned in the
>> thread above). As I wrote, I have tried
>> __wake_up_sync((x), TASK_NORMAL), but it does not make a difference for
>> me - similar to Miklos' testing before. I have also tried struct
>> completion / swait - does not make a difference either.
>> I can see task_struct has wake_cpu, but there doesn't seem to be a good
>> interface to set it.
>>
>> Any ideas?
> 
> Does the stuff from:
> 
>    https://lkml.kernel.org/r/20230308073201.3102738-1-avagin@xxxxxxxxxx

Thanks Peter, I have already replied in that thread - using
__wake_up_on_current_cpu() helps to avoid cpu migrations. One update
since my last mail in that thread (a few hours ago): more logging
reveals that I still see a few cpu switches, but far fewer than
before.
My issue now is that these patches are not enough. Contrary to
previous testing, forcefully disabling cpu migration by calling
migrate_disable() before wait_event_* in fuse's request_wait_answer()
and migrate_enable() after it does not help either - my file-creating
process (bonnie++) still migrates to another cpu somewhere later on.
The only workaround I currently have is to set the userspace ring
thread that processes vfs/fuse events to SCHED_IDLE. In combination
with WF_CURRENT_CPU, performance then goes from ~2200 to ~9000 file
creates/s for a single thread in the latest branch (should be scalable),
which is very close to binding the bonnie++ process to a single core
(~9400 creates/s).
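
For reference, a minimal userspace sketch of the two setups compared
above: putting the calling (ring) thread into SCHED_IDLE, and binding
the calling thread to a single core. The function names
ring_thread_set_sched_idle() and pin_to_cpu() are made up for this
mail and are not taken from the fuse-uring branch or libfuse; only
sched_setscheduler() and sched_setaffinity() are real interfaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: switch the calling thread to SCHED_IDLE.
 * Returns 0 on success, -1 on error (errno is set).
 */
int ring_thread_set_sched_idle(void)
{
	struct sched_param param;

	/* sched_priority has to be 0 for SCHED_IDLE */
	memset(&param, 0, sizeof(param));

	/* pid 0 means "the calling thread" on Linux */
	return sched_setscheduler(0, SCHED_IDLE, &param);
}

/*
 * Hypothetical helper: bind the calling thread to one core, which is
 * what pinning bonnie++ (e.g. with taskset) amounts to.
 * Returns 0 on success, -1 on error (errno is set).
 */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	return sched_setaffinity(0, sizeof(set), &set);
}
```

The ring thread would call ring_thread_set_sched_idle() once at
startup, before it starts waiting for requests.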

Is there something available to mark ring threads as IO processing
threads, so that there is no need to migrate the submitting thread
away from them?

* application sends request -> forwards it to the ring and wakes the
  ring -> waits
* ring wakes up (core bound) -> processes request -> sends completion ->
  wakes up application -> waits for next request
* application wakes up with request result

==> I don't understand why the application is moved to another cpu
at all, once the wake issue is eliminated.
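
The request/completion flow above can be modelled roughly in
userspace. This is only a sketch under loud assumptions: it uses two
pipes and a forked child in place of the fuse request queue and the
ring thread, so it exercises the pipe wakeup path and not the fuse
waitq. count_app_cpu_switches() is a made-up name. It counts how often
the "application" side finds itself on a different cpu after waking up
with the result; it does not show where or why the migration happened.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run 'rounds' request/completion round trips between an "application"
 * (parent) and a "ring" (child). Returns how many times the application
 * observed a different cpu after a wakeup, or -1 on error.
 */
int count_app_cpu_switches(int rounds)
{
	int req[2], done[2];
	int switches = 0;
	int last_cpu, cpu, i;
	char byte = 'x';
	pid_t ring;

	assert(rounds >= 0);

	if (pipe(req) < 0)
		return -1;
	if (pipe(done) < 0) {
		close(req[0]);
		close(req[1]);
		return -1;
	}

	ring = fork();
	if (ring < 0) {
		close(req[0]);
		close(req[1]);
		close(done[0]);
		close(done[1]);
		return -1;
	}

	if (ring == 0) {
		/* ring: wait for a request, "process" it, send completion */
		close(req[1]);
		close(done[0]);
		while (read(req[0], &byte, 1) == 1) {
			if (write(done[1], &byte, 1) != 1)
				break;
		}
		_exit(0);
	}

	/* application: send request, wait for the result */
	close(req[0]);
	close(done[1]);

	last_cpu = sched_getcpu();
	for (i = 0; i < rounds; i++) {
		if (write(req[1], &byte, 1) != 1 ||
		    read(done[0], &byte, 1) != 1) {
			switches = -1;
			break;
		}
		cpu = sched_getcpu();
		if (cpu != last_cpu) {
			switches++;
			last_cpu = cpu;
		}
	}

	/* closing the request pipe gives the ring EOF, so it exits */
	close(req[1]);
	close(done[0]);
	waitpid(ring, NULL, 0);

	return switches;
}
```

Running this with and without the "ring" side pinned or set to
SCHED_IDLE would give a rough idea of how often the submitting side
gets moved, without needing fuse at all.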

I also see SCHED_IDLE only as a workaround, as it would likely have
side effects if there is anything else running on the system that
consumes cpu while another process is doing IO.
Is there a way to trace where and why a process is migrated away?

Thanks,
Bernd