Hi,

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-761-4689
fax.  734-769-8938
cel.  734-216-5309

----- Original Message -----
> From: "John Spray" <jspray@xxxxxxxxxx>
> To: "Matt Benjamin" <mbenjamin@xxxxxxxxxx>
> Cc: "Ceph Development" <ceph-devel@xxxxxxxxxxxxxxx>
> Sent: Monday, September 28, 2015 9:01:28 AM
> Subject: Re: libcephfs invalidate upcalls
>
> On Sat, Sep 26, 2015 at 8:03 PM, Matt Benjamin <mbenjamin@xxxxxxxxxx> wrote:
> > Hi John,
> >
> > I prototyped an invalidate upcall for libcephfs and the Ganesha Ceph FSAL,
> > building on the Client invalidation callback registrations.
> >
> > As you suggested, NFS (or AFS, or DCE) minimally expects a more generic
> > "cached vnode may have changed" trigger than the current inode and dentry
> > invalidates, so I extended the model slightly to hook cap revocation;
> > feedback appreciated.
>
> In cap_release, we probably need to be a bit more discriminating about
> when to drop, e.g. if we've only lost our exclusive write caps, the
> rest of our metadata might all still be fine to cache. Is ganesha in
> general doing any data caching? I think I had implicitly assumed that
> we were only worrying about metadata here, but now I realise I never
> checked that.

Ganesha isn't currently, though it did once, and is likely to again at
some point. The exclusive write cap does in fact map directly to NFSv4
delegations, so we do want to be able to trigger a recall in this case.

> The awkward part is Client::trim_caps. In the Client::trim_caps case,
> the lru_is_expirable part won't be true until something has already
> been invalidated, so there needs to be an explicit hook there --
> rather than invalidating in response to cap release, we need to
> invalidate in order to get ganesha to drop its handle, which will
> render something expirable, and finally when we expire it, the cap
> gets released.

Ok, sure.
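The cap-discrimination point above can be sketched roughly as follows. The cap bit names below are illustrative stand-ins for Ceph's CEPH_CAP_* file caps, not the real libcephfs definitions, and the predicates are assumptions about how a more selective cap_release hook might decide between a delegation recall and a full invalidate:

```cpp
#include <cstdint>

// Illustrative stand-ins for Ceph's CEPH_CAP_* file cap bits
// (names and values assumed; the real definitions are in the Ceph tree).
constexpr uint32_t CAP_FILE_SHARED = 1u << 0; // "Fs": may cache metadata
constexpr uint32_t CAP_FILE_EXCL   = 1u << 1; // "Fx": exclusive; maps to an NFSv4 delegation
constexpr uint32_t CAP_FILE_CACHE  = 1u << 2; // "Fc": may cache file data

// A revocation that takes away only the exclusive cap could trigger a
// delegation recall rather than a full cache-entry invalidation, since
// the rest of the cached metadata may still be valid.
bool needs_full_invalidate(uint32_t revoked_caps) {
  return (revoked_caps & ~CAP_FILE_EXCL) != 0;
}

bool needs_delegation_recall(uint32_t revoked_caps) {
  return (revoked_caps & CAP_FILE_EXCL) != 0;
}
```

Under this sketch, losing only Fx recalls the delegation but leaves the cached entry alone, while losing Fs or Fc forces the broader invalidate upcall.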
>
> In that case maybe we need a hook in ganesha to say "invalidate
> everything you can" so that we don't have to make a very large number
> of function calls to invalidate things. In the fuse/kernel case we
> can only sometimes invalidate a piece of metadata (e.g. we can't if
> it's flocked or whatever), so we ask it to invalidate everything. But
> perhaps in the NFS case we can always expect our invalidate calls to
> be respected, so we could just invalidate a smaller number of things
> (the difference between actual cache size and desired)?

As you noted above, what we're invalidating is a cache entry. With Dan's
mdcache work, we might no longer be caching at the Ganesha level, but I
didn't assume that here.

Matt

> John
>
> >
> > git@xxxxxxxxxx:linuxbox2/ceph.git , branch invalidate
> > git@xxxxxxxxxx:linuxbox2/nfs-ganesha.git , branch ceph-invalidates
> >
> > thanks,
> >
> > Matt
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
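The "invalidate a smaller number of things (the difference between actual cache size and desired)" idea quoted above amounts to simple accounting; a minimal sketch, where the function and parameter names are assumptions rather than any real Ganesha or libcephfs interface:

```cpp
#include <cstddef>

// Hypothetical trim accounting for an "invalidate everything you can"
// style upcall: instead of one call per inode, the Client would ask
// Ganesha to shed the difference between its actual cache size and the
// desired size, releasing caps as entries expire.
size_t entries_to_shed(size_t actual_cache_size, size_t desired_cache_size) {
  return actual_cache_size > desired_cache_size
             ? actual_cache_size - desired_cache_size
             : 0;
}
```

A single upcall carrying this count would replace the large number of per-entry invalidate calls the quoted paragraph is worried about.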