Re: [PATCH RFC v2 00/19] fuse: fuse-over-io-uring

On 6/12/24 16:07, Miklos Szeredi wrote:
> On Wed, 12 Jun 2024 at 15:33, Bernd Schubert <bschubert@xxxxxxx> wrote:
> 
>> I didn't do that yet, as we are going to use the ring buffer for requests,
>> i.e. the ring buffer immediately gets all the data from network, there is
>> no copy. Even if the ring buffer would get data from local disk - there
>> is no need to use a separate application buffer anymore. And with that
>> there is just no extra copy
> 
> Let's just tackle this shared request buffer, as it seems to be a
> central part of your design.
> 
> You say the shared buffer is used to immediately get the data from the
> network (or various other sources), which is completely viable.
> 
> And then the kernel will do the copy from the shared buffer.  Single copy, fine.
> 
> But if the buffer wasn't shared?  What would be the difference?
> Single copy also.
> 
> Why is the shared buffer better?  I mean it may even be worse due to
> cache aliasing issues on certain architectures.  copy_to_user() /
> copy_from_user() are pretty darn efficient.

Right now we have:

- Application thread writes into the buffer, then calls io_uring_cmd_done

I can try to do without mmap and set a pointer to the user buffer in the
80B section of the SQE. I'm not sure whether the application is allowed to
write into that buffer; possibly/probably we will be forced to use
io_uring_cmd_complete_in_task() in all cases (without 19/19 we have that
anyway). My greatest fear here is that the extra task has performance
implications for sync requests.
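
For illustration only, a minimal user-space sketch of what "a pointer to the
user buffer in the 80B section" could look like. It assumes 128-byte SQEs
(IORING_SETUP_SQE128), which is where the 80 bytes of command space come from.
The struct layout and the names fuse_uring_cmd_req_sketch, pack_cmd and
unpack_cmd are hypothetical and not taken from the patch series; the sketch
only shows that pointer and length fit in the command area, not how the kernel
side would consume them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Command area of a 128-byte SQE: 16 bytes in the base SQE + 64 extra. */
#define SQE128_CMD_SIZE 80

/* Hypothetical layout, for illustration only (not from the patches). */
struct fuse_uring_cmd_req_sketch {
	uint64_t buf_ptr;	/* user-space address of the request buffer */
	uint32_t buf_len;	/* size of that buffer in bytes */
	uint32_t flags;		/* unused here, placeholder */
};

_Static_assert(sizeof(struct fuse_uring_cmd_req_sketch) <= SQE128_CMD_SIZE,
	       "request descriptor must fit into the SQE command area");

/* Store pointer and length of a user buffer in the 80-byte command area. */
static void pack_cmd(uint8_t cmd[SQE128_CMD_SIZE], void *buf, uint32_t len)
{
	struct fuse_uring_cmd_req_sketch req = {
		.buf_ptr = (uint64_t)(uintptr_t)buf,
		.buf_len = len,
	};

	memset(cmd, 0, SQE128_CMD_SIZE);
	memcpy(cmd, &req, sizeof(req));
}

/* Read the descriptor back, as a consumer of the command area would. */
static struct fuse_uring_cmd_req_sketch
unpack_cmd(const uint8_t cmd[SQE128_CMD_SIZE])
{
	struct fuse_uring_cmd_req_sketch req;

	memcpy(&req, cmd, sizeof(req));
	return req;
}
```

With such a layout no mmap would be needed for the request buffer itself; the
server would allocate an ordinary buffer and pass its address with each
command.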


> 
> Why is it better to have that buffer managed by kernel?  Being locked
> in memory (being unswappable) is probably a disadvantage as well.  And
> if locking is required, it can be done on the user buffer.

Well, let me try to give the buffer in the 80B section.

> 
> And there are all the setup and teardown complexities...

If the buffer in the 80B section works, setup becomes easier: mmap and
ioctls go away. Teardown, well, we still need the workaround as we need
to handle io_uring_cmd_done, but if you could live with that for the
time being, I would ask Jens or Pavel or Ming for help to see if we could
solve that in io-uring itself.
Is the ring workaround in fuse_dev_release() acceptable for you? Or do
you have any other idea about it?

> 
> Note: the ring buffer used by io_uring is different.  It literally
> allows communication without invoking any system calls in certain
> cases.  That shared buffer doesn't add anything like that.  At least I
> don't see what it actually adds.
> 
> Hmm?

The application can write into the buffer. We won't need shared queue
buffers if we can solve the same with a user pointer.


Thanks,
Bernd



