Re: [PATCH] sunrpc/cache: skip checking the pending cache expired

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Feb 05 2017, Kinglong Mee wrote:

> On 2/5/2017 18:37, Kinglong Mee wrote:
>> On 2/4/2017 13:41, Kinglong Mee wrote:
>>> When the first time pynfs runs after rpc/nfsd startup, always get the warning, 
>>>
>>> Got error: Connection closed
>>>
>>> Commit 778620364ef5 "sunrpc/cache: make cache flushing more reliable." lets
>>> cache_is_expired() checking expired as, 
>>>
>>> return (h->expiry_time < seconds_since_boot()) ||
>>>         (detail->flush_time >= h->last_refresh);
>>>
>>> The detail->flush_time is equal to h->last_refresh time, when the cache is
>>> new created and then do the upcall immediately. So that, the cache will be
>>> treated as expired and be cleaned when write_flush().
>
> I found the problem is caused by, 
> 1. A new startup of nfsd, rpc.mountd, etc,
> 2. A rpc request from client (pynfs test, or normal mounting),
> 3. An ip_map cache is created but invalid, so upcall to rpc.mountd, 
> 4. rpc.mountd process the ip_map upcall, before write the valid data to nfsd,
>    do auth_reload(), and check_useipaddr(),
> 5. For the first time, old_use_ipaddr = -1, that cause a cache_flush,
> 6. The ip_map cache will be treat as expired and clean,
> 7. When rpc.mountd write the valid data to nfsd, a new ip_map is created
>    and updated, the cache_check of old ip_map(doing the upcall) will
>    return -ETIMEDOUT.
> 8. RPC layer return SVC_CLOSE and close the xprt after commit 4d712ef1db05
>    "svcauth_gss: Close connection when dropping an incoming message"
>
> Although it is exist only one time, but I don't like the noise.
> Is the cache_flush for old_use_ipaddr = -1 necessary in step 5?
> I have no idea about how to fix it, any suggests?
>

Hi,
 I think that your idea that CACHE_PENDING cache entries shouldn't be
 treated as expired make some sense.
 I would actually focus on CACHE_VALID.  If CACHE_VALID is not set, then
 there is no data in the cache item, so there is nothing to expire.
 So it would be nice if cache items that don't have CACHE_VALID are
 never treated as expired.
 However, as you discovered, that causes a problem for cache_purge().
 It purges a cache by setting the ->flush_time, and calling
 cache_clean(), which only removes expired entries.  For that to work,
 Even non-CACHE_VALID entries need to be seen as expired.

 I think the best thing to do is to write a stand-alone cache_purge().
 Rather than using cache_flush(), explicitly free every item on every
 hash chain.

 Something like this?

static int cache_purge(struct cache_detail *cd)
{
	int rv = 0;
	struct list_head *next;
	int i;

	write_lock(&cd->hash_lock);

	for (i = 0; i < cd->hash_size; i++) {
		struct cache_head *ch = NULL;
		struct cache_detail *d;
		struct hlist_head *head;
		struct hlist_node *tmp;

		head = &cd->hash_table[i];
		hlist_for_each_entry_safe(ch, tmp, head, cache_list) {
			hlist_del_init(&ch->cache_list);
			cd->entries--;

			set_bit(CACHE_CLEANED, &ch->flags);
			write_unlock(&cd->hash_lock);
			cache_fresh_unlocked(ch, d);
			cache_put(ch, d);
			write_lock(&cd->hash_lock);
		}
	}
	write_unlock(&cd->hash_lock);

	return rv;
}

Thanks,
NeilBrown


> thanks,
> Kinglong Mee
>
>>>
>>> This patch skips checking the pending cache expired when doing upcall.
>>>
>>> Signed-off-by: Kinglong Mee <kinglongmee@xxxxxxxxx>
>>> ---
>>>  include/linux/sunrpc/cache.h | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
>>> index 62a60ee..9961c1f 100644
>>> --- a/include/linux/sunrpc/cache.h
>>> +++ b/include/linux/sunrpc/cache.h
>>> @@ -204,8 +204,11 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
>>>  	kref_put(&h->ref, cd->cache_put);
>>>  }
>>>  
>>> -static inline int cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>>> +static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>>>  {
>>> +	if (test_bit(CACHE_PENDING, &h->flags))
>>> +		return false;
>>> +
>> 
>> Sorry, with this patch there is a bug exist,
>> When a pending cache exist, sunrpc_destroy_cache_detail() cann't
>> cleanup the cache_detail, but user always free it later, 
>> that will cause a NULL reference.
>> 
>> Please ignore this patch, maybe another patch for the above problem
>>  will be sent.
>> 
>> thanks,
>> Kinglong Mee
>> 
>>>  	return  (h->expiry_time < seconds_since_boot()) ||
>>>  		(detail->flush_time >= h->last_refresh);
>>>  }
>>>
>> 

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux