On 05/30/2012 03:32 PM, Anand Avati wrote:
> Brian,
> You are right, today we hardly leverage the page cache in the kernel.
> When Gluster started and the performance translators were implemented,
> fuse invalidation support did not exist, and since that support was
> brought into upstream fuse we haven't leveraged it effectively. We
> could actually do a lot of smarter things using the invalidation
> changes.
>
> For the consistency concern where an open fd continues to refer to the
> local page cache - if that is a problem, today you need to mount with
> --enable-direct-io-mode to bypass the page cache altogether (this is
> very different from O_DIRECT open() support). On the other hand, to
> utilize the fuse invalidation APIs and promote use of the page cache
> while staying consistent, we need to gear up the glusterfs framework:
> first, implement server-originated messaging support; then build some
> kind of opportunistic locking or leases to notify glusterfs clients
> about modifications from a second client; and third, implement hooks
> in the client-side listener to do things like send fuse invalidations,
> purge pages in io-cache, or flush pending writes in write-behind.
> This needs to happen, but we're short on resources to prioritize it
> sooner :-)

Thanks for the context, Avati. The fuse patch I sent led to a similar
thought process with regard to finer-grained invalidation. So far it
seems well received, and as I understand it, we can also use that
mechanism to do full invalidations from gluster on older fuse modules
that don't have the fix. I'll look into incorporating that into what I
have so far and making it available for review.

Brian

> Avati
>
> On Wed, May 30, 2012 at 8:16 AM, Brian Foster <bfoster@xxxxxxxxxx>
> wrote:
>
> Hi all,
>
> I've been playing with a little hack recently to add a gluster mount
> option to support FOPEN_KEEP_CACHE, and I wanted to solicit some
> thoughts on whether there's value in finding an intelligent way to
> support this functionality. To provide some context:
>
> Our current behavior with regard to fuse is that the page cache is
> utilized by fuse, from what I can tell, in just about the same manner
> as for a typical local fs. The primary difference is that by default,
> the address space mapping for an inode is completely invalidated on
> open. So, for example, if process A opens and reads a file in a loop,
> subsequent reads are served from cache (bypassing fuse and gluster).
> If process B steps in and opens the same file, the cache is flushed
> and the next reads from either process are passed down through fuse.
> The FOPEN_KEEP_CACHE option simply disables this cache-flush-on-open
> behavior.
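>
> As a rough illustration (a sketch, not the actual patch, and
> demo_open() is a hypothetical handler), opting in from a fuse
> filesystem boils down to setting keep_cache in the lowlevel open
> handler:
>
>     /* Sketch: request FOPEN_KEEP_CACHE for an opened file so the
>      * kernel does not flush the inode's page cache on open.
>      * Assumes libfuse's lowlevel API. */
>     #include <fuse_lowlevel.h>
>
>     static void demo_open(fuse_req_t req, fuse_ino_t ino,
>                           struct fuse_file_info *fi)
>     {
>         fi->keep_cache = 1;  /* FOPEN_KEEP_CACHE in the open reply */
>         fuse_reply_open(req, fi);
>     }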
>
> The following are some notes on my experimentation thus far:
>
> - With FOPEN_KEEP_CACHE, fuse currently only invalidates on file size
> changes. This is a problem in that I can rewrite some or all of a file
> from another client and the cached client wouldn't notice. I've sent a
> patch to fuse-devel to also invalidate on mtime changes (similar to
> nfsv3 or cifs), so we'll see how well that is received. fuse also
> supports a range-based invalidation notification that we could take
> advantage of if necessary (see the sketch at the end of this message).
>
> - I can reproduce a measurable performance benefit in the local/cached
> read case. For example, running a kernel compile against a source tree
> in a gluster volume (no other xlators, build output to local storage)
> improves to 6 minutes from just under 8 minutes with the default graph
> (9.5 minutes with only the client xlator, and 1:09 locally).
>
> - Some specific differences from the current io-cache caching:
>   - io-cache supports time-based invalidation and tunables such as
>     cache size and priority. The page cache has no such controls.
>   - io-cache invalidates more frequently on various fops. It also
>     looks like we invalidate on writes and don't take advantage of
>     the write data most recently sent, whereas page cache writes are
>     cached (errors notwithstanding).
>   - The page cache obviously has tighter integration with the system
>     (i.e., drop_caches controls, more specific reporting, the ability
>     to drop cache when memory is needed).
>
> All in all, I'm curious what people think about enabling this cache
> behavior in gluster. We could support anything from the basic mount
> option I'm currently using (i.e., similar to attribute/dentry caching)
> to something integrated with io-cache (doing invalidations when
> necessary), or maybe even, eventually, something along the lines of
> the nfs weak cache consistency model, where the cache is validated
> after every fop based on file attributes.
>
> In general, are there other big issues/questions that would need to
> be explored before this is useful (e.g., the size invalidation issue)?
> Are there other performance tests that should be explored? Thoughts
> appreciated. Thanks.
>
> Brian
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx
> https://lists.nongnu.org/mailman/listinfo/gluster-devel
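P.S. For reference, the range-based invalidation notification mentioned
above is exposed by libfuse as fuse_lowlevel_notify_inval_inode(). A
minimal sketch follows (purge_range() is a hypothetical wrapper; per
the libfuse documentation, a len of 0 invalidates from off to the end
of the file, and a negative off invalidates attributes only):

    #include <fuse_lowlevel.h>

    /* Ask the kernel to drop an inode's cached pages in [off, off+len).
     * Sketch only; returns 0 or a negative errno from libfuse. */
    static int purge_range(struct fuse_chan *ch, fuse_ino_t ino,
                           off_t off, off_t len)
    {
        return fuse_lowlevel_notify_inval_inode(ch, ino, off, len);
    }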