On Wed 01-10-14 10:47:05, Trond Myklebust wrote: > On Wed, Oct 1, 2014 at 10:07 AM, Jan Kara <jack@xxxxxxx> wrote: > > On Wed 01-10-14 07:50:21, Trond Myklebust wrote: > >> On Wed, Oct 1, 2014 at 5:05 AM, Jan Kara <jack@xxxxxxx> wrote: > >> > On Tue 30-09-14 17:13:19, Trond Myklebust wrote: > >> >> On Tue, Sep 30, 2014 at 4:53 PM, Zach Brown <zab@xxxxxxxxx> wrote: > >> >> > On Tue, Sep 30, 2014 at 12:48:45PM +0200, Jan Kara wrote: > >> >> >> On Tue 30-09-14 10:11:32, Thanos Makatos wrote: > >> >> >> > > >> >> >> > Regarding extending the ioctl to invalidate the page cache, do you have > >> >> >> > any suggestions where I could start looking? > >> >> >> You just need to call invalidate_inode_pages2(). That is going to do all > >> >> >> you need. > >> >> >> > >> >> >> > Would such a new ioctl have any chance to be accepted upstream? > >> >> >> I believe a possibility for a file to be fully flushed from page cache is > >> >> >> useful at times and if you present well your usecase there are reasonable > >> >> >> chances it will get accepted upstream. > >> >> > > >> >> > Agreed, this seems reasonable. How many times have we all dropped our > >> >> > entire cache just 'cause we didn't have a more precise tool? > >> >> > > >> >> > $ grep -ri drop_caches xfstests/ > >> >> > xfstests/src/fsync-tester.c: if ((fd = open("/proc/sys/vm/drop_caches", O_WRONLY)) < 0) { > >> >> > xfstests/src/stale_handle.c: system("echo 3 > /proc/sys/vm/drop_caches"); > >> >> > xfstests/common/quota: echo 3 > /proc/sys/vm/drop_caches > >> >> > > >> >> > The last one even says: > >> >> > > >> >> > # XXX: really need an ioctl instead of this big hammer > >> >> > echo 3 > /proc/sys/vm/drop_caches > >> >> > > >> >> > :) > >> >> > > >> >> > >> >> It would definitely be useful for NFS, however we'd want the option of > >> >> clearing the cached metadata too (acls, mode bits, owner/group owner, > >> >> etc.) > >> > Hum, I can imagine you might be somehow successful in flushing ACLs but > >> > how would you like to flush mode or owner? It's not like you can just > >> > reload inode from disk and overwrite what you have in memory. What would > >> > look sane is to push inode with all metadata it caches out of memory but > >> > once somehow holds the inode (and ioctl() itself will have a file handle of > >> > the inode), you just cannot. So I'm not sure if you meant something > >> > different or if this was just a wish without deeper thought. > >> > >> You can and you _must_ reload the inode if it changes on the server. > >> That's what makes distributed filesystems "interesting" as far as > >> caching goes. What we do in the cases where we think the file metadata > >> may have changed, is to mark the existing cached metadata as being > >> stale so that we know that we need to revalidate before trying to use > >> it. > > Ah OK, I somehow thought about revalidation automatically happening for > > any filesystem and found that impossible. I can imagine doing this > > specifically for NFS which is designed to handle such things is possible :) > > > >> So being able to tell in the ioctl() whether or not the application > >> thinks the data or metadata (or both) may have changed on the remote > >> disk is actually a very useful feature if you are trying to do things > >> like distributed compiles. > >> > >> I've had an experimental NFS-only ioctl() based caching interface > >> kicking around in my git repository for a couple of years now > >> (http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=shortlog;h=refs/heads/ioctl). > >> If we are planning on doing something at the VFS level, then it would > >> be nice to at least duplicate the functionality from the first 2 > >> patches. > > Thanks for the pointer. The first two patches look useful and I think we > > can design the ioctl interface to accommodate the NFS usecase (although most > > filesystems will just return EOPNOTSUPP in case someone asks for metadata > > invalidation). > > > > The interface in your patches looks mostly OK, I'd just rather use 'flags' > > than 'cmd'. So something like: > > struct invalidate_arg { > > u64 flags; > > u64 start; > > u64 len; > > }; > > > > Where 'flags' can be: > > INVAL_METADATA - invalidate inode itself, acls, etc. if fs supports it > > (this can be more finegrained if that's useful. Is it?) > > INVAL_DATA - invalidate data in range <start, start+len) > > If these are flags rather than a command, then is the intention that > they can be ORed together so that you can force a metadata+data > revalidate in a single function call? Yes, that is my intention. Honza -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html