On Wed, Oct 1, 2014 at 10:07 AM, Jan Kara <jack@xxxxxxx> wrote: > On Wed 01-10-14 07:50:21, Trond Myklebust wrote: >> On Wed, Oct 1, 2014 at 5:05 AM, Jan Kara <jack@xxxxxxx> wrote: >> > On Tue 30-09-14 17:13:19, Trond Myklebust wrote: >> >> On Tue, Sep 30, 2014 at 4:53 PM, Zach Brown <zab@xxxxxxxxx> wrote: >> >> > On Tue, Sep 30, 2014 at 12:48:45PM +0200, Jan Kara wrote: >> >> >> On Tue 30-09-14 10:11:32, Thanos Makatos wrote: >> >> >> > >> >> >> > Regarding extending the ioctl to invalidate the page cache, do you have >> >> >> > any suggestions where I could start looking? >> >> >> You just need to call invalidate_inode_pages2(). That is going to do all >> >> >> you need. >> >> >> >> >> >> > Would such a new ioctl have any chance to be accepted upstream? >> >> >> I believe a possibility for a file to be fully flushed from page cache is >> >> >> useful at times and if you present well your usecase there are reasonable >> >> >> chances it will get accepted upstream. >> >> > >> >> > Agreed, this seems reasonable. How many times have we all dropped our >> >> > entire cache just 'cause we didn't have a more precise tool? >> >> > >> >> > $ grep -ri drop_caches xfstests/ >> >> > xfstests/src/fsync-tester.c: if ((fd = open("/proc/sys/vm/drop_caches", O_WRONLY)) < 0) { >> >> > xfstests/src/stale_handle.c: system("echo 3 > /proc/sys/vm/drop_caches"); >> >> > xfstests/common/quota: echo 3 > /proc/sys/vm/drop_caches >> >> > >> >> > The last one even says: >> >> > >> >> > # XXX: really need an ioctl instead of this big hammer >> >> > echo 3 > /proc/sys/vm/drop_caches >> >> > >> >> > :) >> >> > >> >> >> >> It would definitely be useful for NFS, however we'd want the option of >> >> clearing the cached metadata too (acls, mode bits, owner/group owner, >> >> etc.) >> > Hum, I can imagine you might be somehow successful in flushing ACLs but >> > how would you like to flush mode or owner? It's not like you can just >> > reload inode from disk and overwrite what you have in memory. What would >> > look sane is to push inode with all metadata it caches out of memory but >> > once somehow holds the inode (and ioctl() itself will have a file handle of >> > the inode), you just cannot. So I'm not sure if you meant something >> > different or if this was just a wish without deeper thought. >> >> You can and you _must_ reload the inode if it changes on the server. >> That's what makes distributed filesystems "interesting" as far as >> caching goes. What we do in the cases where we think the file metadata >> may have changed, is to mark the existing cached metadata as being >> stale so that we know that we need to revalidate before trying to use >> it. > Ah OK, I somehow thought about revalidation automatically happening for > any filesystem and found that impossible. I can imagine doing this > specifically for NFS which is designed to handle such things is possible :) > >> So being able to tell in the ioctl() whether or not the application >> thinks the data or metadata (or both) may have changed on the remote >> disk is actually a very useful feature if you are trying to do things >> like distributed compiles. >> >> I've had an experimental NFS-only ioctl() based caching interface >> kicking around in my git repository for a couple of years now >> (http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=shortlog;h=refs/heads/ioctl). >> If we are planning on doing something at the VFS level, then it would >> be nice to at least duplicate the functionality from the first 2 >> patches. > Thanks for the pointer. The first two patches look useful and I think we > can design the ioctl interface to accommodate the NFS usecase (although most > filesystems will just return EOPNOTSUPP in case someone asks for metadata > invalidation). > > The interface in your patches looks mostly OK, I'd just rather use 'flags' > than 'cmd'. So something like: > struct invalidate_arg { > u64 flags; > u64 start; > u64 len; > }; > > Where 'flags' can be: > INVAL_METADATA - invalidate inode itself, acls, etc. if fs supports it > (this can be more finegrained if that's useful. Is it?) > INVAL_DATA - invalidate data in range <start, start+len) If these are flags rather than a command, then is the intention that they can be ORed together so that you can force a metadata+data revalidate in a single function call? > We can have a fs callback that gets called for this ioctl to do the work, > most filesystems would just leave it at NULL which would fall back to the > default function which just handles INVAL_DATA and returns EOPNOTSUPP if > INVAL_METADATA is set. NFS could then implement its own thing for > INVAL_METADATA and call the generic function for handling INVAL_DATA. > Thoughts? Sounds fine to me. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@xxxxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html