Re: [PATCH] fuse: Allow to align reads/writes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 7/4/24 17:10, Josef Bacik wrote:
On Wed, Jul 03, 2024 at 10:44:28PM +0200, Bernd Schubert wrote:


On 7/3/24 22:28, Joanne Koong wrote:
On Wed, Jul 3, 2024 at 11:08 AM Bernd Schubert <bschubert@xxxxxxx> wrote:

On 7/3/24 19:49, Joanne Koong wrote:
On Wed, Jul 3, 2024 at 10:30 AM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:

On Wed, Jul 03, 2024 at 05:58:20PM +0200, Bernd Schubert wrote:


On 7/3/24 17:15, Josef Bacik wrote:
On Tue, Jul 02, 2024 at 06:31:08PM +0200, Bernd Schubert wrote:
Read/writes IOs should be page aligned as fuse server
might need to copy data to another buffer otherwise in
order to fulfill network or device storage requirements.

Simple reproducer is with libfuse, example/passthrough*
and opening a file with O_DIRECT - without this change
writing to that file failed with -EINVAL if the underlying
file system was using ext4 (for passthrough_hp the
'passthrough' feature has to be disabled).

Given this needs server side changes as new feature flag is
introduced.

Disadvantage of aligned writes is that server side needs
needs another splice syscall (when splice is used) to seek
over the unaligned area - i.e. syscall and memory copy overhead.

Signed-off-by: Bernd Schubert <bschubert@xxxxxxx>

---
 From implementation point of view 'struct fuse_in_arg' /
'struct fuse_arg' gets another parameter 'align_size', which has to
be set by fuse_write_args_fill. For all other fuse operations this
parameter has to be 0, which is guranteed by the existing
initialization via FUSE_ARGS and C99 style
initialization { .size = 0, .value = NULL }, i.e. other members are
zero.
Another choice would have been to extend fuse_write_in to
PAGE_SIZE - sizeof(fuse_in_header), but then would be an
arch/PAGE_SIZE depending struct size and would also require
lots of stack usage.

Can I see the libfuse side of this?  I'm confused why we need the align_size at
all?  Is it enough to just say that this connection is aligned, negotiate what
the alignment is up front, and then avoid sending it along on every write?

Sure, I had forgotten to post it
https://github.com/bsbernd/libfuse/commit/89049d066efade047a72bcd1af8ad68061b11e7c

We could also just act on fc->align_writes / FUSE_ALIGN_WRITES and always use
sizeof(struct fuse_in_header) + sizeof(struct fuse_write_in) in libfuse and would
avoid to send it inside of fuse_write_in. We still need to add it to struct fuse_in_arg,
unless you want to check the request type within fuse_copy_args().

I think I like this approach better, at the very least it allows us to use the
padding for other silly things in the future.


This approach seems cleaner to me as well.
I also like the idea of having callers pass in whether alignment
should be done or not to fuse_copy_args() instead of adding
"align_writes" to struct fuse_in_arg.

There is no caller for FUSE_WRITE for fuse_copy_args(), but it is called
from fuse_dev_do_read for all request types. I'm going to add in request
parsing within fuse_copy_args, I can't decide myself which of both
versions I like less.

Sorry I should have clarified better :) By callers, I meant callers to
fuse_copy_args(). I'm still getting up to speed with the fuse code but
it looks like it gets called by both fuse_dev_do_read and
fuse_dev_do_write (through copy_out_args() -> fuse_copy_args()). The
cleanest solution to me seems like to pass in from those callers
whether the request should be page-aligned after the headers or not,
instead of doing the request parsing within fuse_copy_args() itself. I
think if we do the request parsing within fuse_copy_args() then we
would also need to have some way to differentiate between FUSE_WRITE
requests from the dev_do_read vs dev_do_write side (since, as I
understand it, writes only needs to be aligned for dev_do_read write
requests).

fuse_dev_do_write() is used to submit results from fuse server
(userspace), i.e. not interesting here. If we don't parse in
fuse_copy_args(), we would have to do that in fuse_dev_do_read() - it
doesn't have knowledge about the request it handles either - it just
takes from lists what is there. So if we don't want to have it encoded
in fuse_in_arg, there has to request type checking. Given the existing
number of conditions in fuse_dev_do_read, I would like to avoid adding
in even more there.


Your original alternative I think is better, leave it in fuse_in_arg and take it
out of the write arg.  Thanks,

Thank you! I'm going to send out a new version in one or two days. Currently I believe we don't need the actual alignment size at all and can just use:

fuse_copy_align()
cs->offset += cs->len;
cs->len = 0;

Because offset and len are for for the current page.

I need to ponder about it a bit, checking if there is any exception...


Thanks,
Bernd





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux