Re: [PATCH 0/2] NFS: limit use of ACCESS cache for negative responses

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 04 Sep 2022, Trond Myklebust wrote:
> On Sat, 2022-09-03 at 19:57 +1000, NeilBrown wrote:
> > The whole point of the NFS ACCESS request is that the client cannot
> > know
> > how to interpret any access controls that the server might have in
> > place.  Who knows, they might even be time based access controls
> > (games can only be played on the weekend).  Ok, that is a bit
> > extreme,
> > but the principle remains.
> > 
> > *You* might "know" that there are two pieces (in this specific
> > example)
> > and that one doesn't directly relate to the inodes, but the NFS
> > client
> > doesn't know that.  The NFS client only knows that it can request
> > ACCESS
> > information for a filehandle in the context of a credential.  The
> > Linux
> > NFS client then caches that against the inode (so it is all somehow
> > related to the inode).
> > 
> > The spec doesn't give the client permission to cached these details
> > AT
> > ALL.  Caching positive results makes perfect sense from a performance
> > perspective and is easily defensible.  Caching negative results ... 
> > not
> > so much.
> 
> Of course the spec allows the client to cache permissions in both NFSv3
> and NFSv4. Otherwise there would be no way we could cache objects at
> all. Enabling the client to gate access to cached data such as
> directory contents, and cached data without troubling the server is the
> whole point of the ACCESS operation. Without that ability, we would
> have to ask the server every time a user tried to 'cd' into a
> directory, called open(), called getdents(), called read(), called
> write(), ... Without that ability, NFSv4 delegations would be
> pointless, because we'd have to go and ask the server for all these
> operations.
> I know the spec is littered with mealy mouthed "The server can revoke
> access permission at any time" statements in various spots, but that
> looks more like something to placate a review by the IBM legal team
> (yes, it happened) rather than a basic assumption that was engineered
> into the protocol. The protocol simply cannot work as described in
> RFC5661 and RFC7530 without the assumption that you can cache.

It isn't explicit though is it?  Not like the caching of data which is
explicit.
So I agree the reading between the lines tells us that some caching is
needed and so permitted, but that doesn't mean that any and all caching
of ACCESS results is good. We should cache what we need, and no more.

> 
> The other point is that access cache changes due to AD/LDAP/NIS group
> membership updates happen once in a blue moon, and when they do, they
> affect _ALL_ cached values for that user, and not just one at a time.
> When those things do happen, the user normally expects to have to log
> out and then log back in so that the new group memberships are
> reflected in that user's cred (as per POSIX prescribed behaviour). That
> will cause the NFS client to send new ACCESS calls to the server,
> because the new cred will not be found in the existing access cache.
> The only case where this assumption fails, is if the server is
> implementing the '--manage-gids' behaviour, and caches its
> representation of the user's creds past the time when the user logged
> out and logged in again.
> So this is literally a 1 in a billion chance problem that you're asking
> us to fix. Optimising for that case by adding in a timeout that affects
> all cached access value is just not in the cards.

I'll accept 1 in several million (only two enterprises identified out of
how many that use Linux NFS clients?).  Obviously that doesn't justify
anything potentially costly.

I don't think there is any interesting cost in not caching negative
access results beyond the normal attribute-cache timeout.  I don't
recall you addressing this particular issue.  What is the cost?

> 
> I've already sent out a patch that forces realignment of the NFS client
> behaviour with that of POSIX by requiring the client to recheck cached
> access values for an existing credential after a new login, but does
> not add any further impediments to the access cache. Why is that
> approach not sufficient?

I think you've observed this fact yourself, but propagation times for
group membership changes (e.g.  though LDAP) are not uniform.  It is
entirely possible for the client to see a new membership before the
server.  So with --manage-gids (or krb5) it is possible to log in to the
client, check that group membership has been updated, access a file
and have that fail.  But then a minute later (if you could flush the
access cache) have that access succeed.

When I was first presented with this problem I thought logout/login was
the answer and even provided a patch which would mean different logins
got different cache keys.  This wasn't very well received because it
used to work without log out/in (because we didn't cache access
indefinitely), but the killer was that it didn't even work reliably -
because of propagation variability.

Thanks,
NeilBrown





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux