Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:

> While basic NFSv4 does allow you to pretend there is a fundamental
> underlying block size, pNFS has changed all that, and we have had to
> engineer support for determining the I/O block size on the fly, and
> building the RPC requests accordingly. Client side mirroring just adds
> to the fun.

I've been working with Jeff to make netfslib handle ceph with its distributed
object model as well as 9p and afs with their more traditionally-appearing
flat files.

> However let's start with the "why?" question first. Why do I need an
> extra layer of abstraction between NFS and the VM, when one of my
> primary concerns right now is that the stack depth keeps growing?

It's not exactly an extra layer - it's more a case of taking the same layer
out of five[*] network filesystems, combining them and sharing it.

[*] up to 7 if I can roll it out into orangefs and/or fuse as well.

As to why, well I kind of covered that, but we want to add some services to
network filesystems (such as content encryption) and rather than adding
separately to all five, there exists the possibility of just doing it the once
and sharing it (granted there may be parts that can't be shared).

But also, I need to fix cachefiles - and I can't do that whilst nfs is
operating on a page-by-page basis. Cachefiles has to have an early say on the
size and shape of a transaction.

And speaking of content encryption, if you're using a local cache and content
encryption, you really don't want the unencrypted data to be stored in your
local cache on your laptop, say - so that requires storage of the encrypted
data into the cache.

Further, the VM folks would like the PG_private_2 bit back, along with
PG_checked and PG_error. So we need a different way of managing writes to the
cache and preventing overlapping DIO writes.

> What problems would any of this solve for NFS? I'm worried about the
> cost of all this proposed code churn as well; as you said 'it is
> complicated stuff', mainly for the good reason that we've been
> optimising a lot of code over the last 25-30 years.

First off, NFS would get to partake of services being implemented in netfslib.
Granted, this isn't exactly solving problems in NFS, more providing additional
features.

Secondly, shared code means less code - and the code would, in theory, be
better-tested as it would have more users.

Thirdly, it would hopefully reduce the maintenance burden, particularly for
the VM people.

David