On Fri, 2025-02-21 at 10:46 -0500, Chuck Lever wrote:
> On 2/21/25 10:36 AM, Mike Snitzer wrote:
> > On Fri, Feb 21, 2025 at 10:25:03AM -0500, Jeff Layton wrote:
> > > On Fri, 2025-02-21 at 10:02 -0500, Mike Snitzer wrote:
> > > > My intent was to make 6.14's DONTCACHE feature able to be tested
> > > > in the context of nfsd in a no-frills way. I realize adding the
> > > > nfsd_dontcache knob skews toward too raw, lacks polish. But I'm
> > > > inclined to expose such course-grained opt-in knobs to encourage
> > > > others' discovery (and answers to some of the questions you pose
> > > > below). I also hope to enlist all NFSD reviewers' help in
> > > > categorizing/documenting where DONTCACHE helps/hurts. ;)
> > > >
> > > > And I agree that ultimately per-export control is needed. I'll
> > > > take the time to implement that, hopeful to have something more
> > > > suitable in time for LSF.
> > >
> > > Would it make more sense to hook DONTCACHE up to the IO_ADVISE
> > > operation in RFC7862? IO_ADVISE4_NOREUSE sounds like it has similar
> > > meaning? That would give the clients a way to do this on a per-open
> > > basis.
> >
> > Just thinking aloud here but: Using a DONTCACHE scalpel on a per open
> > basis quite likely wouldn't provide the required page reclaim relief
> > if the server is being hammered with normal buffered IO. Sure that
> > particular DONTCACHE IO wouldn't contribute to the problem but it
> > would still be impacted by those not opting to use DONTCACHE on entry
> > to the server due to needing pages for its DONTCACHE buffered IO.
>
> For this initial work, which is to provide a mechanism for
> experimentation, IMO exposing the setting to clients won't be all
> that helpful.
>
> But there are some applications/workloads on clients where exposure
> could be beneficial -- for instance, a backup job, where NFSD would
> benefit by knowing it doesn't have to maintain the job's written data
> in its page cache. I regard that as a later evolutionary improvement,
> though.
>
> Jorge proposed adding the NFSv4.2 IO_ADVISE operation to NFSD, but I
> think we first need to a) work out and document appropriate semantics
> for each hint, because the spec does not provide specifics, and b)
> perform some extensive benchmarking to understand their value and
> impact.
>

That puts the onus on the application running on the client to decide
the caching semantics of the server, which:

   A. Is a terrible idea™. The application may know how it wants to use
      the cached data, and be able to somewhat confidently manage its
      own pagecache. However, in almost all cases, it will have no basis
      for understanding how the server should manage its cache. The
      latter really is a job for the sysadmin to figure out.
   B. Is impractical, because even if you can figure out a policy, it
      requires rewriting the application to manage the server cache.
   C. Will require additional APIs on the NFSv4.2 client to expose the
      IO_ADVISE operation. You cannot just map it to posix_fadvise()
      and/or posix_madvise(), because IO_ADVISE is designed to manage a
      completely different caching layer. At best, we might be able to
      rally one or two more distributed filesystems to implement
      similar functionality and share an API; however, there is no
      chance this API will be useful for ordinary filesystems.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx