On Mon, Dec 12, 2016 at 10:54:33PM +0000, Adamson, Andy wrote: > The bug is setting new values on an rsc cache entry without calling > rsc_update. When you do this, the “local copy” of the rsc cache entry > (e.g. the one returned by gss_svc_searchbyctx ) gets the new values > (expiry_time and CACHE_NEGATIVE bit set) but the new values are not > propagated back to the cache. gss_svc_searchbyctx: found = rsc_lookup(cd, &rsci); ... return found; rsc_lookup: ch = sunrpc_cache_lookup(cd, &item->h, hash); if (ch) return container_of(ch, struct rsc, h); sunrpc_cache_lookup: head = &detail->hash_table[hash]; ... hlist_for_each_entry(tmp, head, cache_list) { ... return tmp; Definitely looks to me like it's returning a cache entry, not a copy. Maybe there's some other reason we need to use rsc_update, but that's not it. --b. > So the next time the cache entry is > looked up, e.g. when cache_clean() is called to clean up, the > expiry_time and CACHE_NEGATIVE are not set to the new values on the > cache entry to be destroyed, and cache_clean does not reap the cache > entry. > > The fix is to do what this patch does, and call rsc_update on the rsc entry. With this patch, cache_clean is called and reaps the cache entries. > > BTW: just look at all the other uses of the cache and you will note that they all call update after setting new values. > > It’s just how Neil’s cache code works. > > e.g. dns_resolve.c > key.h.expiry_time = ttl + seconds_since_boot(); > … > if (key.addrlen == 0) > set_bit(CACHE_NEGATIVE, &key.h.flags); > > item = nfs_dns_update(cd, &key, item); > > > > I also just found a bug, I need a local rsc cache pointer for the rsc_update return. That plus Anna’s comments will be addressed in version 3. I’ll explain more in the patch comments. > > →Andy > > On 12/12/16, 4:58 PM, "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote: > > On Mon, Dec 12, 2016 at 12:08:49PM -0500, andros@xxxxxxxxxx wrote: > > From: Andy Adamson <andros@xxxxxxxxxx> > > > > The current code sets the expiry_time on the local copy of the rsc > > cache entry - but not on the actual cache entry. > > I'm not following. It looks to me like "rsci" in the below was returned > from gss_svc_searchbyctx(), which was returned in turn from > sunrpc_cache_lookup(), which is returning an item from the rsc cache--I > don't see any copying. > > > Note that currently, the rsc cache entries are not cleaned up even > > when nfsd is stopped. > > So, that sounds like a bug, but I don't understand this explanation yet. > > > Update the cache with the new expiry_time of now so that cache_clean will > > clean up (free) the context to be destroyed. > > > > Signed-off-by: Andy Adamson <andros@xxxxxxxxxx> > > --- > > net/sunrpc/auth_gss/svcauth_gss.c | 32 ++++++++++++++++++++++++++++++-- > > 1 file changed, 30 insertions(+), 2 deletions(-) > > > > diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c > > index 45662d7..6033389 100644 > > --- a/net/sunrpc/auth_gss/svcauth_gss.c > > +++ b/net/sunrpc/auth_gss/svcauth_gss.c > > @@ -1393,6 +1393,26 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {} > > > > #endif /* CONFIG_PROC_FS */ > > > > +static int rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp) > > +{ > > + struct rsc new; > > + int ret = -ENOMEM; > > + > > + memset(&new, 0, sizeof(struct rsc)); > > + if (dup_netobj(&new.handle, &rscp->handle)) > > + goto out; > > + new.h.expiry_time = get_seconds(); > > + set_bit(CACHE_NEGATIVE, &new.h.flags); > > + > > + rscp = rsc_update(sn->rsc_cache, &new, rscp); > > + if (!rscp) > > + goto out; > > + ret = 0; > > +out: > > + rsc_free(&new); > > + return ret; > > +} > > + > > /* > > * Accept an rpcsec packet. > > * If context establishment, punt to user space > > @@ -1489,10 +1509,18 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {} > > case RPC_GSS_PROC_DESTROY: > > if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq)) > > goto auth_err; > > - rsci->h.expiry_time = get_seconds(); > > - set_bit(CACHE_NEGATIVE, &rsci->h.flags); > > + if (rsc_destroy(sn, rsci)) > > + goto drop; > > + /** > > + * Balance the cache_put at the end of svcauth_gss_accept.This > > + * will leave the refcount = 1 so that the cache_clean cache_put > > + * will call rsc_put. > > + */ > > I'm confused by that comment. If it's right, then it means the refcount > is currently zero, in which case the following line is unsafe. > > --b. > > > + cache_get(&rsci->h); > > + > > if (resv->iov_len + 4 > PAGE_SIZE) > > goto drop; > > + > > svc_putnl(resv, RPC_SUCCESS); > > goto complete; > > case RPC_GSS_PROC_DATA: > > -- > > 1.8.3.1 > > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html