Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 05, 2017 at 08:14:48AM +1100, NeilBrown wrote:
> On Thu, Jan 05 2017, J. Bruce Fields wrote:
> 
> > I'm not against the patch, but I'm still not convinced by the
> > explanation:
> >
> > On Thu, Dec 22, 2016 at 12:38:06PM -0500, andros@xxxxxxxxxx wrote:
> >> From: Neil Brown <neilb@xxxxxxxx>
> >> 
> >> The rsc cache code operates in a read_lock/write_lock environment.
> >> Changes to a cache entry should use the provided rsc_update
> >> routine which takes the write_lock.
> >
> > It looks pretty suspicious to be setting CACHE_NEGATIVE without the
> > cache_lock for write, but I'm not actually convinced there's a bug there
> > either.  In any case not one that you'd be hitting reliably.
> >
> >> The current code sets the expiry_time and the CACHE_NEGATIVE flag
> >> without taking the write_lock as it does not call rsc_update.
> >> Without this patch, while cache_clean sees the entries to be
> >> removed, it does not remove the rsc_entries. This is because
> >> rsc_update updates other fields such as flush_time and last_refresh
> >> in the entry to trigger cache_clean to reap the entry.
> >
> > I think the root cause of the particular behavior you were seeing was
> > actually an oversight from Neil's c5b29f885afe "sunrpc: use seconds
> > since boot in expiry cache", which missed this one occurrence of
> > get_seconds().  So it's setting the item's entry to something decades in
> > the future.
> >
> > And that's probably not been a huge deal since these entries aren't so
> > big, and they will eventually get cleaned up by cache_purge when the
> > cache is destroyed.  Still, I can imagine it slowing down cache lookups
> > on a long-lived server.
> >
> > The one-liner:
> >
> >  -		rsci->h.expiry_time = get_seconds();
> >  +		rsci->h.expiry_time = seconds_since_boot();
> >
> > would probably also do the job.  Am I missing something?
> 
> I was missing that get_seconds() bug - thanks.
> The other real bug is that setting h.expiry_time backwards should
> really set cd->nextcheck backwards too.  I thought I had found code
> which did that, but I think I was confusing ->nextcheck with ->flush_time.
> 
> >
> > But, OK, I think Neil's patch will ensure entries get cleaned up more
> > quickly than that would, and might also fix a rare race.
> 
> Yes.  The patch doesn't just fix the bug, whatever it is.  It provides a
> proper interface for functionality that wasn't previously supported, and
> so had been hacked into place.

Got it.  So, with another changelog rewrite, I'm applying it like the
below.

Another question is whether it's worth a stable cc....  I think so, as
all it would take is a case where a server gets a lot of PROC_DESTROYs
over its lifetime.  That's not hard to imagine.  And the symptoms are
probably non-obvious and cured by a reboot, which might explain it not
having seen bug reports.

--b.

commit 27297f41527d
Author: Neil Brown <neilb@xxxxxxxx>
Date:   Thu Dec 22 12:38:06 2016 -0500

    svcauth_gss: free context promptly on PROC_DESTROY
    
    Since c5b29f885afe "sunrpc: use seconds since boot in expiry cache", the
    expiry_time field has been in units of seconds since boot, but that
    patch overlooked the server code that handles RPC_GSS_PROC_DESTROY
    requests.  The result is that when we get a request from a client to
    destroy a context, we set that context's expiry time to a time decades
    in the future.
    
    The context will still be cleaned up by cache_purge when the module is
    unloaded or the container shut down, but a lot of contexts could pile up
    before then.
    
    The simple fix would be just to set expiry_time to seconds_since_boot(),
    but modifying the entry outside the cache_lock is also looks dubious
    (though I'm not completely sure what the race would be).  So let's
    provide a helper that does this properly, taking the lock and removing
    the entry immediately.
    
    Signed-off-by: Neil Brown <neilb@xxxxxxxx>
    Reported-by: Andy Adamson <andros@xxxxxxxxxx>
    Signed-off-by: Andy Adamson <andros@xxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx>

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 62a60eeacb0a..9dcf2c8fe1c5 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -227,6 +227,7 @@ extern void sunrpc_destroy_cache_detail(struct cache_detail *cd);
 extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
 					umode_t, struct cache_detail *);
 extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
+extern void sunrpc_cache_unhash(struct cache_detail *, struct cache_head *);
 
 /* Must store cache_detail in seq_file->private if using next three functions */
 extern void *cache_seq_start(struct seq_file *file, loff_t *pos);
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 886e9d381771..a54a7a3d28f5 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1489,8 +1489,8 @@ svcauth_gss_accept(struct svc_rqst *rqstp, __be32 *authp)
 	case RPC_GSS_PROC_DESTROY:
 		if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
 			goto auth_err;
-		rsci->h.expiry_time = get_seconds();
-		set_bit(CACHE_NEGATIVE, &rsci->h.flags);
+		/* Delete the entry from the cache_list and call cache_put */
+		sunrpc_cache_unhash(sn->rsc_cache, &rsci->h);
 		if (resv->iov_len + 4 > PAGE_SIZE)
 			goto drop;
 		svc_putnl(resv, RPC_SUCCESS);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 8147e8d56eb2..502b9fe9817b 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1855,3 +1855,15 @@ void sunrpc_cache_unregister_pipefs(struct cache_detail *cd)
 }
 EXPORT_SYMBOL_GPL(sunrpc_cache_unregister_pipefs);
 
+void sunrpc_cache_unhash(struct cache_detail *cd, struct cache_head *h)
+{
+	write_lock(&cd->hash_lock);
+	if (!hlist_unhashed(&h->cache_list)){
+		hlist_del_init(&h->cache_list);
+		cd->entries--;
+		write_unlock(&cd->hash_lock);
+		cache_put(h, cd);
+	} else
+		write_unlock(&cd->hash_lock);
+}
+EXPORT_SYMBOL_GPL(sunrpc_cache_unhash);
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux