Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 05 2017, J. Bruce Fields wrote:

> I'm not against the patch, but I'm still not convinced by the
> explanation:
>
> On Thu, Dec 22, 2016 at 12:38:06PM -0500, andros@xxxxxxxxxx wrote:
>> From: Neil Brown <neilb@xxxxxxxx>
>> 
>> The rsc cache code operates in a read_lock/write_lock environment.
>> Changes to a cache entry should use the provided rsc_update
>> routine which takes the write_lock.
>
> It looks pretty suspicious to be setting CACHE_NEGATIVE without the
> cache_lock for write, but I'm not actually convinced there's a bug there
> either.  In any case not one that you'd be hitting reliably.
>
>> The current code sets the expiry_time and the CACHE_NEGATIVE flag
>> without taking the write_lock as it does not call rsc_update.
>> Without this patch, while cache_clean sees the entries to be
>> removed, it does not remove the rsc_entries. This is because
>> rsc_update updates other fields such as flush_time and last_refresh
>> in the entry to trigger cache_clean to reap the entry.
>
> I think the root cause of the particular behavior you were seeing was
> actually an oversight from Neil's c5b29f885afe "sunrpc: use seconds
> since boot in expiry cache", which missed this one occurrence of
> get_seconds().  So it's setting the item's entry to something decades in
> the future.
>
> And that's probably not been a huge deal since these entries aren't so
> big, and they will eventually get cleaned up by cache_purge when the
> cache is destroyed.  Still, I can imagine it slowing down cache lookups
> on a long-lived server.
>
> The one-liner:
>
>  -		rsci->h.expiry_time = get_seconds();
>  +		rsci->h.expiry_time = seconds_since_boot();
>
> would probably also do the job.  Am I missing something?

I was missing that get_seconds() bug - thanks.
The other real bug is that setting h.expiry_time backwards should
really set cd->nextcheck backwards too.  I thought I had found code
which did that, but I think I was confusing ->nextcheck with ->flush_time.

>
> But, OK, I think Neil's patch will ensure entries get cleaned up more
> quickly than that would, and might also fix a rare race.

Yes.  The patch doesn't just fix the bug, whatever it is.  It provides a
proper interface for functionality that wasn't previously supported, and
so had been hacked into place.

Thanks,
NeilBrown

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux