On Mon, Jul 1, 2024 at 12:15 PM Sweet Tea Dorminy <sweettea-kernel@xxxxxxxxxx> wrote: > > On 7/1/24 12:48 PM, Joanne Koong wrote: > > On Fri, Jun 28, 2024 at 11:03 AM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote: > >> > >> On Thu, Jun 27, 2024 at 05:13:55PM -0700, Joanne Koong wrote: > >>> Introduce the capability to dynamically configure the FUSE_MAX_MAX_PAGES > >>> limit through a sysctl. This enhancement allows system administrators to > >>> adjust the value based on system-specific requirements. > >>> > >>> This removes the previous static limit of 256 max pages, which limits > >>> the max write size of a request to 1 MiB (on 4096 pagesize systems). > >>> Having the ability to up the max write size beyond 1 MiB allows for the > >>> perf improvements detailed in this thread [1]. > >>> > >>> $ sysctl -a | grep fuse_max_max_pages > >>> fs.fuse.fuse_max_max_pages = 256 > >>> > >>> $ sysctl -n fs.fuse.fuse_max_max_pages > >>> 256 > >>> > >>> $ echo 1024 | sudo tee /proc/sys/fs/fuse/fuse_max_max_pages > >>> 1024 > >>> > >>> $ sysctl -n fs.fuse.fuse_max_max_pages > >>> 1024 > >>> > >>> [1] https://lore.kernel.org/linux-fsdevel/20240124070512.52207-1-jefflexu@xxxxxxxxxxxxxxxxx/T/#u > >>> > >>> Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx> > >> > >> Overall the change is great, and I see why you named it the way you did, but I > >> think may be 'fuse_max_pages_limit' may be a better name? The original constant > >> name wasn't great, but it was fine in its context. I think having it as an > >> interface we should name it something less silly. > > > > 'fuse_max_pages_limit' sounds great to me. I'll submit v2 with this rename. > > > > Whatever-it's-called might make sense to be in bytes, since max_read, > max_write are the values visible to a userspace fuse server and are in > bytes. I like this in theory but in practice, I think this adds more logic/complexity in the kernel than it's worth, since we will need to keep track of and enforce non-page-size-aligned values that may be set as the limit. For example, if the sysctl value gets set to 1048577 bytes (eg 1 MiB + 1 byte) and the user sets "max_write" to 1048578 bytes in the init callback, we need to make sure we cap writes to exactly 1048577 bytes, which means we'll need to keep track of this in the kernel at the byte granularity. Additionally, there are some places where fc->max_pages is used (eg for fuse_readahead) where it doesn't quite make sense to have this limit be a non-page-size-aligned number. My personal vote is to keep it simple with page size granularity, but I'm happy to talk more about this if you or anyone else feels strongly. > > Additionally, the sysctl should probably disallow values at or above > 1<<16 pages -- fuse_init_out expects that max_pages fits in a u16, and > there are a couple of places where fc->max_pages << PAGE_SHIFT is > expected to fit in a u32. fc->max_pages is constrained by max_max_pages > at present so this is true but would not be true if max_max_pages is > only constrained by being a uint. Good point! I missed that fc->max_pages will be bounded anyways by the uint16_t "fuse_init_out->max_pages", so setting larger values than u16 won't do anything. I'll make this change in v2. (For the fc->max_pages << PAGE_SHIFT being expected to fit in a u32, I only see that for fuse_verify_ioctl_iov() but that function looks like it should be defining "max" to size_t instead of a u32 and then casting to size_t? Not relevant to this conversation since I'll be changing it to u16 but I was just curious) Thanks for taking a look!