Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache

> On Oct 28, 2022, at 4:13 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> 
> On Fri, 2022-10-28 at 19:49 +0000, Chuck Lever III wrote:
>> 
>>> On Oct 28, 2022, at 2:57 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>> 
>>> The filecache refcounting is a bit non-standard for something searchable
>>> by RCU, in that we maintain a sentinel reference while it's hashed. This
>>> in turn requires that we have to do things differently in the "put"
>>> depending on whether it's hashed, which we believe to have led to races.
>>> 
>>> There are other problems in here too. nfsd_file_close_inode_sync can end
>>> up freeing an nfsd_file while there are still outstanding references to
>>> it, and there are a number of subtle ToC/ToU races.
>>> 
>>> Rework the code so that the refcount is what drives the lifecycle. When
>>> the refcount goes to zero, then unhash and rcu free the object.
>>> 
>>> With this change, the LRU carries a reference. Take special care to
>>> deal with it when removing an entry from the list.
>> 
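
To make the lifecycle described above concrete, here is a minimal
userspace sketch of the "refcount drives the lifecycle" idea. It is
not the kernel code: struct model_file and every model_* helper are
hypothetical stand-ins, C11 atomics replace refcount_t (saturation
handling is ignored), and the hash table, LRU and RCU free are reduced
to boolean flags. It only illustrates how the last reference on a GC
(v2/v3) entry is handed to the LRU rather than dropped, and how the
garbage collector later puts that reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct nfsd_file. */
struct model_file {
	atomic_int ref;	/* models nf_ref */
	bool gc;	/* models NFSD_FILE_GC (v2/v3 entry) */
	bool hashed;	/* models NFSD_FILE_HASHED */
	bool on_lru;	/* models membership on nfsd_file_lru */
	bool freed;	/* set where the kernel would call_rcu() */
};

/* Models refcount_dec_not_one(): decrement unless the count is 1. */
static bool model_dec_not_one(atomic_int *r)
{
	int old = atomic_load(r);

	while (old != 1) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(r, &old, old - 1))
			return true;
	}
	return false;
}

/* Models refcount_dec_and_test(): true when the count drops to zero. */
static bool model_dec_and_test(atomic_int *r)
{
	return atomic_fetch_sub(r, 1) == 1;
}

/* Models nfsd_file_lru_add(): true only if the entry was newly added. */
static bool model_lru_add(struct model_file *f)
{
	if (f->on_lru)
		return false;
	f->on_lru = true;
	return true;
}

/* Models nfsd_file_free(): only reached once the count is zero. */
static void model_free(struct model_file *f)
{
	assert(atomic_load(&f->ref) == 0);
	f->hashed = false;
	f->freed = true;
}

/* Models the reworked nfsd_file_put(). */
static void model_put(struct model_file *f)
{
	if (f->gc && f->hashed) {
		/* Last reference: transfer it to the LRU instead of dropping it. */
		if (model_dec_not_one(&f->ref) || model_lru_add(f))
			return;
	}
	if (model_dec_and_test(&f->ref))
		model_free(f);
}

/* Models the tail of nfsd_file_lru_cb(): put the LRU's reference. */
static void model_gc_evict(struct model_file *f)
{
	f->on_lru = false;
	if (model_dec_and_test(&f->ref))
		model_free(f);
}
```

In this model a non-GC entry is freed on its final model_put(), while
a GC entry with a single reference ends up on the LRU with its count
still at 1 and is only freed when model_gc_evict() drops that count
to zero.
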
>> I can see a way of making this patch a lot cleaner. It looks like there's
>> a fair bit of renaming and moving of functions -- that can go in clean
>> up patches before doing the heavy lifting.
>> 
> 
> Is this something that's really needed? I'm already basically rewriting
> this code. Reshuffling the old code around first will take a lot of time
> and we'll still end up with the same result.

I did exactly this for the nfs4_file rhash changes. It took just a couple
of hours. The outcome is that you can see exactly, in the final patch in
that series, how the file_hashtbl -> rhltable substitution is done.

Making sure each of the changes is more or less mechanical and obvious
is a good way to ensure no-one is doing something incorrect. That's why
folks like to use coccinelle.

Trust me, it will be much easier to figure out in a year when we have
new bugs in this code if we split up this commit just a little.


>> I'm still not sold on the idea of a synchronous flush in nfsd_file_free().
> 
> I think that we need to call this there to ensure that writeback errors
> are handled. I worry that if we try to do this piecemeal, we could end up
> missing errors when they fall off the LRU.
> 
>> That feels like a deadlock waiting to happen and quite difficult to
>> reproduce because I/O there is rarely needed. It could help to put a
>> might_sleep() in nfsd_file_fsync(), at least, but I would prefer not to
>> drive I/O in that path at all.
> 
> I don't quite grok the potential for a deadlock here. nfsd_file_free
> already has to deal with blocking activities due to it effectively doing a
> close(). This is just another one. That's why nfsd_file_put has a
> might_sleep in it (to warn its callers).

Currently nfsd_file_put() calls nfsd_file_flush(), which calls
vfs_fsync(). Since vfs_fsync() can sleep, it can't be called while
holding a spinlock.


> What's the deadlock scenario you envision?

OK, filp_close() does call f_op->flush(). So we have this call
here and there aren't problems today. I still say this is a
problem waiting to occur, but I guess I can live with it.

If filp_close() already calls f_op->flush(), why do we need an
explicit vfs_fsync() there?


>>> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
>>> ---
>>> fs/nfsd/filecache.c | 357 ++++++++++++++++++++++----------------------
>>> fs/nfsd/trace.h     |   5 +-
>>> 2 files changed, 178 insertions(+), 184 deletions(-)
>>> 
>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>> index f8ebbf7daa18..d928c5e38eeb 100644
>>> --- a/fs/nfsd/filecache.c
>>> +++ b/fs/nfsd/filecache.c
>>> @@ -1,6 +1,12 @@
>>> // SPDX-License-Identifier: GPL-2.0
>>> /*
>>> * The NFSD open file cache.
>>> + *
>>> + * Each nfsd_file is created in response to client activity -- either regular
>>> + * file I/O for v2/v3, or opening a file for v4. Files opened via v4 are
>>> + * cleaned up as soon as their refcount goes to 0.  Entries for v2/v3 are
>>> + * flagged with NFSD_FILE_GC. On their last put, they are added to the LRU for
>>> + * eventual disposal if they aren't used again within a short time period.
>>> */
>>> 
>>> #include <linux/hash.h>
>>> @@ -301,55 +307,22 @@ nfsd_file_alloc(struct nfsd_file_lookup_key *key, unsigned int may)
>>> 		if (key->gc)
>>> 			__set_bit(NFSD_FILE_GC, &nf->nf_flags);
>>> 		nf->nf_inode = key->inode;
>>> -		/* nf_ref is pre-incremented for hash table */
>>> -		refcount_set(&nf->nf_ref, 2);
>>> +		refcount_set(&nf->nf_ref, 1);
>>> 		nf->nf_may = key->need;
>>> 		nf->nf_mark = NULL;
>>> 	}
>>> 	return nf;
>>> }
>>> 
>>> -static bool
>>> -nfsd_file_free(struct nfsd_file *nf)
>>> -{
>>> -	s64 age = ktime_to_ms(ktime_sub(ktime_get(), nf->nf_birthtime));
>>> -	bool flush = false;
>>> -
>>> -	this_cpu_inc(nfsd_file_releases);
>>> -	this_cpu_add(nfsd_file_total_age, age);
>>> -
>>> -	trace_nfsd_file_put_final(nf);
>>> -	if (nf->nf_mark)
>>> -		nfsd_file_mark_put(nf->nf_mark);
>>> -	if (nf->nf_file) {
>>> -		get_file(nf->nf_file);
>>> -		filp_close(nf->nf_file, NULL);
>>> -		fput(nf->nf_file);
>>> -		flush = true;
>>> -	}
>>> -
>>> -	/*
>>> -	 * If this item is still linked via nf_lru, that's a bug.
>>> -	 * WARN and leak it to preserve system stability.
>>> -	 */
>>> -	if (WARN_ON_ONCE(!list_empty(&nf->nf_lru)))
>>> -		return flush;
>>> -
>>> -	call_rcu(&nf->nf_rcu, nfsd_file_slab_free);
>>> -	return flush;
>>> -}
>>> -
>>> -static bool
>>> -nfsd_file_check_writeback(struct nfsd_file *nf)
>>> +static void
>>> +nfsd_file_fsync(struct nfsd_file *nf)
>>> {
>>> 	struct file *file = nf->nf_file;
>>> -	struct address_space *mapping;
>>> 
>>> 	if (!file || !(file->f_mode & FMODE_WRITE))
>>> -		return false;
>>> -	mapping = file->f_mapping;
>>> -	return mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) ||
>>> -		mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK);
>>> +		return;
>>> +	if (vfs_fsync(file, 1) != 0)
>>> +		nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id));
>>> }
>>> 
>>> static int
>>> @@ -362,30 +335,6 @@ nfsd_file_check_write_error(struct nfsd_file *nf)
>>> 	return filemap_check_wb_err(file->f_mapping, READ_ONCE(file->f_wb_err));
>>> }
>>> 
>>> -static void
>>> -nfsd_file_flush(struct nfsd_file *nf)
>>> -{
>>> -	struct file *file = nf->nf_file;
>>> -
>>> -	if (!file || !(file->f_mode & FMODE_WRITE))
>>> -		return;
>>> -	if (vfs_fsync(file, 1) != 0)
>>> -		nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id));
>>> -}
>>> -
>>> -static void nfsd_file_lru_add(struct nfsd_file *nf)
>>> -{
>>> -	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
>>> -	if (list_lru_add(&nfsd_file_lru, &nf->nf_lru))
>>> -		trace_nfsd_file_lru_add(nf);
>>> -}
>>> -
>>> -static void nfsd_file_lru_remove(struct nfsd_file *nf)
>>> -{
>>> -	if (list_lru_del(&nfsd_file_lru, &nf->nf_lru))
>>> -		trace_nfsd_file_lru_del(nf);
>>> -}
>>> -
>>> static void
>>> nfsd_file_hash_remove(struct nfsd_file *nf)
>>> {
>>> @@ -408,53 +357,66 @@ nfsd_file_unhash(struct nfsd_file *nf)
>>> }
>>> 
>>> static void
>>> -nfsd_file_unhash_and_dispose(struct nfsd_file *nf, struct list_head *dispose)
>>> +nfsd_file_free(struct nfsd_file *nf)
>>> {
>>> -	trace_nfsd_file_unhash_and_dispose(nf);
>>> -	if (nfsd_file_unhash(nf)) {
>>> -		/* caller must call nfsd_file_dispose_list() later */
>>> -		nfsd_file_lru_remove(nf);
>>> -		list_add(&nf->nf_lru, dispose);
>>> +	s64 age = ktime_to_ms(ktime_sub(ktime_get(), nf->nf_birthtime));
>>> +
>>> +	trace_nfsd_file_free(nf);
>>> +
>>> +	this_cpu_inc(nfsd_file_releases);
>>> +	this_cpu_add(nfsd_file_total_age, age);
>>> +
>>> +	nfsd_file_unhash(nf);
>>> +	nfsd_file_fsync(nf);
>>> +
>>> +	if (nf->nf_mark)
>>> +		nfsd_file_mark_put(nf->nf_mark);
>>> +	if (nf->nf_file) {
>>> +		get_file(nf->nf_file);
>>> +		filp_close(nf->nf_file, NULL);
>>> +		fput(nf->nf_file);
>>> 	}
>>> +
>>> +	/*
>>> +	 * If this item is still linked via nf_lru, that's a bug.
>>> +	 * WARN and leak it to preserve system stability.
>>> +	 */
>>> +	if (WARN_ON_ONCE(!list_empty(&nf->nf_lru)))
>>> +		return;
>>> +
>>> +	call_rcu(&nf->nf_rcu, nfsd_file_slab_free);
>>> }
>>> 
>>> -static void
>>> -nfsd_file_put_noref(struct nfsd_file *nf)
>>> +static bool
>>> +nfsd_file_check_writeback(struct nfsd_file *nf)
>>> {
>>> -	trace_nfsd_file_put(nf);
>>> +	struct file *file = nf->nf_file;
>>> +	struct address_space *mapping;
>>> 
>>> -	if (refcount_dec_and_test(&nf->nf_ref)) {
>>> -		WARN_ON(test_bit(NFSD_FILE_HASHED, &nf->nf_flags));
>>> -		nfsd_file_lru_remove(nf);
>>> -		nfsd_file_free(nf);
>>> -	}
>>> +	if (!file || !(file->f_mode & FMODE_WRITE))
>>> +		return false;
>>> +	mapping = file->f_mapping;
>>> +	return mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) ||
>>> +		mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK);
>>> }
>>> 
>>> -static void
>>> -nfsd_file_unhash_and_put(struct nfsd_file *nf)
>>> +static bool nfsd_file_lru_add(struct nfsd_file *nf)
>>> {
>>> -	if (nfsd_file_unhash(nf))
>>> -		nfsd_file_put_noref(nf);
>>> +	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
>>> +	if (list_lru_add(&nfsd_file_lru, &nf->nf_lru)) {
>>> +		trace_nfsd_file_lru_add(nf);
>>> +		return true;
>>> +	}
>>> +	return false;
>>> }
>>> 
>>> -void
>>> -nfsd_file_put(struct nfsd_file *nf)
>>> +static bool nfsd_file_lru_remove(struct nfsd_file *nf)
>>> {
>>> -	might_sleep();
>>> -
>>> -	if (test_bit(NFSD_FILE_GC, &nf->nf_flags))
>>> -		nfsd_file_lru_add(nf);
>>> -	else if (refcount_read(&nf->nf_ref) == 2)
>>> -		nfsd_file_unhash_and_put(nf);
>>> -
>>> -	if (!test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
>>> -		nfsd_file_flush(nf);
>>> -		nfsd_file_put_noref(nf);
>>> -	} else if (nf->nf_file && test_bit(NFSD_FILE_GC, &nf->nf_flags)) {
>>> -		nfsd_file_put_noref(nf);
>>> -		nfsd_file_schedule_laundrette();
>>> -	} else
>>> -		nfsd_file_put_noref(nf);
>>> +	if (list_lru_del(&nfsd_file_lru, &nf->nf_lru)) {
>>> +		trace_nfsd_file_lru_del(nf);
>>> +		return true;
>>> +	}
>>> +	return false;
>>> }
>>> 
>>> struct nfsd_file *
>>> @@ -465,36 +427,77 @@ nfsd_file_get(struct nfsd_file *nf)
>>> 	return NULL;
>>> }
>>> 
>>> -static void
>>> -nfsd_file_dispose_list(struct list_head *dispose)
>>> +/**
>>> + * nfsd_file_unhash_and_queue - unhash a file and queue it to the dispose list
>>> + * @nf: nfsd_file to be unhashed and queued
>>> + * @dispose: list to which it should be queued
>>> + *
>>> + * Attempt to unhash a nfsd_file and queue it to the given list. Each file
>>> + * will have a reference held on behalf of the list. That reference may come
>>> + * from the LRU, or we may need to take one. If we can't get a reference,
>>> + * ignore it altogether.
>>> + */
>>> +static bool
>>> +nfsd_file_unhash_and_queue(struct nfsd_file *nf, struct list_head *dispose)
>>> {
>>> -	struct nfsd_file *nf;
>>> +	trace_nfsd_file_unhash_and_queue(nf);
>>> +	if (nfsd_file_unhash(nf)) {
>>> +		/*
>>> +		 * If we remove it from the LRU, then just use that
>>> +		 * reference for the dispose list. Otherwise, we need
>>> +		 * to take a reference. If that fails, just ignore
>>> +		 * the file altogether.
>>> +		 */
>>> +		if (!nfsd_file_lru_remove(nf) && !nfsd_file_get(nf))
>>> +			return false;
>>> +		list_add(&nf->nf_lru, dispose);
>>> +		return true;
>>> +	}
>>> +	return false;
>>> +}
>>> 
>>> -	while(!list_empty(dispose)) {
>>> -		nf = list_first_entry(dispose, struct nfsd_file, nf_lru);
>>> -		list_del_init(&nf->nf_lru);
>>> -		nfsd_file_flush(nf);
>>> -		nfsd_file_put_noref(nf);
>>> +/**
>>> + * nfsd_file_put - put the reference to a nfsd_file
>>> + * @nf: nfsd_file of which to put the reference
>>> + *
>>> + * Put a reference to a nfsd_file. In the v4 case, we just put the
>>> + * reference immediately. In the v2/3 case, if the reference would be
>>> + * the last one, then put it on the LRU instead to be cleaned up later.
>>> + */
>>> +void
>>> +nfsd_file_put(struct nfsd_file *nf)
>>> +{
>>> +	trace_nfsd_file_put(nf);
>>> +
>>> +	/*
>>> +	 * The HASHED check is racy. We may end up with the occasional
>>> +	 * unhashed entry on the LRU, but they should get cleaned up
>>> +	 * like any other.
>>> +	 */
>>> +	if (test_bit(NFSD_FILE_GC, &nf->nf_flags) &&
>>> +	    test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
>>> +		/*
>>> +		 * If this is the last reference (nf_ref == 1), then transfer
>>> +		 * it to the LRU. If the add to the LRU fails, just put it as
>>> +		 * usual.
>>> +		 */
>>> +		if (refcount_dec_not_one(&nf->nf_ref) || nfsd_file_lru_add(nf))
>>> +			return;
>>> 	}
>>> +	if (refcount_dec_and_test(&nf->nf_ref))
>>> +		nfsd_file_free(nf);
>>> }
>>> 
>>> static void
>>> -nfsd_file_dispose_list_sync(struct list_head *dispose)
>>> +nfsd_file_dispose_list(struct list_head *dispose)
>>> {
>>> -	bool flush = false;
>>> 	struct nfsd_file *nf;
>>> 
>>> 	while(!list_empty(dispose)) {
>>> 		nf = list_first_entry(dispose, struct nfsd_file, nf_lru);
>>> 		list_del_init(&nf->nf_lru);
>>> -		nfsd_file_flush(nf);
>>> -		if (!refcount_dec_and_test(&nf->nf_ref))
>>> -			continue;
>>> -		if (nfsd_file_free(nf))
>>> -			flush = true;
>>> +		nfsd_file_free(nf);
>>> 	}
>>> -	if (flush)
>>> -		flush_delayed_fput();
>>> }
>>> 
>>> static void
>>> @@ -564,21 +567,8 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
>>> 	struct list_head *head = arg;
>>> 	struct nfsd_file *nf = list_entry(item, struct nfsd_file, nf_lru);
>>> 
>>> -	/*
>>> -	 * Do a lockless refcount check. The hashtable holds one reference, so
>>> -	 * we look to see if anything else has a reference, or if any have
>>> -	 * been put since the shrinker last ran. Those don't get unhashed and
>>> -	 * released.
>>> -	 *
>>> -	 * Note that in the put path, we set the flag and then decrement the
>>> -	 * counter. Here we check the counter and then test and clear the flag.
>>> -	 * That order is deliberate to ensure that we can do this locklessly.
>>> -	 */
>>> -	if (refcount_read(&nf->nf_ref) > 1) {
>>> -		list_lru_isolate(lru, &nf->nf_lru);
>>> -		trace_nfsd_file_gc_in_use(nf);
>>> -		return LRU_REMOVED;
>>> -	}
>>> +	/* We should only be dealing with v2/3 entries here */
>>> +	WARN_ON_ONCE(!test_bit(NFSD_FILE_GC, &nf->nf_flags));
>>> 
>>> 	/*
>>> 	 * Don't throw out files that are still undergoing I/O or
>>> @@ -589,40 +579,30 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
>>> 		return LRU_SKIP;
>>> 	}
>>> 
>>> +	/* If it was recently added to the list, skip it */
>>> 	if (test_and_clear_bit(NFSD_FILE_REFERENCED, &nf->nf_flags)) {
>>> 		trace_nfsd_file_gc_referenced(nf);
>>> 		return LRU_ROTATE;
>>> 	}
>>> 
>>> -	if (!test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
>>> -		trace_nfsd_file_gc_hashed(nf);
>>> -		return LRU_SKIP;
>>> +	/*
>>> +	 * Put the reference held on behalf of the LRU. If it wasn't the last
>>> +	 * one, then just remove it from the LRU and ignore it.
>>> +	 */
>>> +	if (!refcount_dec_and_test(&nf->nf_ref)) {
>>> +		trace_nfsd_file_gc_in_use(nf);
>>> +		list_lru_isolate(lru, &nf->nf_lru);
>>> +		return LRU_REMOVED;
>>> 	}
>>> 
>>> +	/* Refcount went to zero. Unhash it and queue it to the dispose list */
>>> +	nfsd_file_unhash(nf);
>>> 	list_lru_isolate_move(lru, &nf->nf_lru, head);
>>> 	this_cpu_inc(nfsd_file_evictions);
>>> 	trace_nfsd_file_gc_disposed(nf);
>>> 	return LRU_REMOVED;
>>> }
>>> 
>>> -/*
>>> - * Unhash items on @dispose immediately, then queue them on the
>>> - * disposal workqueue to finish releasing them in the background.
>>> - *
>>> - * cel: Note that between the time list_lru_shrink_walk runs and
>>> - * now, these items are in the hash table but marked unhashed.
>>> - * Why release these outside of lru_cb ? There's no lock ordering
>>> - * problem since lru_cb currently takes no lock.
>>> - */
>>> -static void nfsd_file_gc_dispose_list(struct list_head *dispose)
>>> -{
>>> -	struct nfsd_file *nf;
>>> -
>>> -	list_for_each_entry(nf, dispose, nf_lru)
>>> -		nfsd_file_hash_remove(nf);
>>> -	nfsd_file_dispose_list_delayed(dispose);
>>> -}
>>> -
>>> static void
>>> nfsd_file_gc(void)
>>> {
>>> @@ -632,7 +612,7 @@ nfsd_file_gc(void)
>>> 	ret = list_lru_walk(&nfsd_file_lru, nfsd_file_lru_cb,
>>> 			    &dispose, list_lru_count(&nfsd_file_lru));
>>> 	trace_nfsd_file_gc_removed(ret, list_lru_count(&nfsd_file_lru));
>>> -	nfsd_file_gc_dispose_list(&dispose);
>>> +	nfsd_file_dispose_list_delayed(&dispose);
>>> }
>>> 
>>> static void
>>> @@ -657,7 +637,7 @@ nfsd_file_lru_scan(struct shrinker *s, struct shrink_control *sc)
>>> 	ret = list_lru_shrink_walk(&nfsd_file_lru, sc,
>>> 				   nfsd_file_lru_cb, &dispose);
>>> 	trace_nfsd_file_shrinker_removed(ret, list_lru_count(&nfsd_file_lru));
>>> -	nfsd_file_gc_dispose_list(&dispose);
>>> +	nfsd_file_dispose_list_delayed(&dispose);
>>> 	return ret;
>>> }
>>> 
>>> @@ -668,8 +648,11 @@ static struct shrinker	nfsd_file_shrinker = {
>>> };
>>> 
>>> /*
>>> - * Find all cache items across all net namespaces that match @inode and
>>> - * move them to @dispose. The lookup is atomic wrt nfsd_file_acquire().
>>> + * Find all cache items across all net namespaces that match @inode, unhash
>>> + * them, take references and then put them on @dispose if that was successful.
>>> + *
>>> + * The nfsd_file objects on the list will be unhashed, and each will have a
>>> + * reference taken.
>>> */
>>> static unsigned int
>>> __nfsd_file_close_inode(struct inode *inode, struct list_head *dispose)
>>> @@ -687,52 +670,59 @@ __nfsd_file_close_inode(struct inode *inode, struct list_head *dispose)
>>> 				       nfsd_file_rhash_params);
>>> 		if (!nf)
>>> 			break;
>>> -		nfsd_file_unhash_and_dispose(nf, dispose);
>>> -		count++;
>>> +
>>> +		if (nfsd_file_unhash_and_queue(nf, dispose))
>>> +			count++;
>>> 	} while (1);
>>> 	rcu_read_unlock();
>>> 	return count;
>>> }
>>> 
>>> /**
>>> - * nfsd_file_close_inode_sync - attempt to forcibly close a nfsd_file
>>> + * nfsd_file_close_inode - attempt a delayed close of a nfsd_file
>>> * @inode: inode of the file to attempt to remove
>>> *
>>> - * Unhash and put, then flush and fput all cache items associated with @inode.
>>> + * Unhash and put all cache item associated with @inode.
>>> */
>>> -void
>>> -nfsd_file_close_inode_sync(struct inode *inode)
>>> +static unsigned int
>>> +nfsd_file_close_inode(struct inode *inode)
>>> {
>>> -	LIST_HEAD(dispose);
>>> +	struct nfsd_file *nf;
>>> 	unsigned int count;
>>> +	LIST_HEAD(dispose);
>>> 
>>> 	count = __nfsd_file_close_inode(inode, &dispose);
>>> -	trace_nfsd_file_close_inode_sync(inode, count);
>>> -	nfsd_file_dispose_list_sync(&dispose);
>>> +	trace_nfsd_file_close_inode(inode, count);
>>> +	if (count) {
>>> +		while(!list_empty(&dispose)) {
>>> +			nf = list_first_entry(&dispose, struct nfsd_file, nf_lru);
>>> +			list_del_init(&nf->nf_lru);
>>> +			trace_nfsd_file_closing(nf);
>>> +			if (refcount_dec_and_test(&nf->nf_ref))
>>> +				nfsd_file_free(nf);
>>> +		}
>>> +	}
>>> +	return count;
>>> }
>>> 
>>> /**
>>> - * nfsd_file_close_inode - attempt a delayed close of a nfsd_file
>>> + * nfsd_file_close_inode_sync - attempt to forcibly close a nfsd_file
>>> * @inode: inode of the file to attempt to remove
>>> *
>>> - * Unhash and put all cache item associated with @inode.
>>> + * Unhash and put, then flush and fput all cache items associated with @inode.
>>> */
>>> -static void
>>> -nfsd_file_close_inode(struct inode *inode)
>>> +void
>>> +nfsd_file_close_inode_sync(struct inode *inode)
>>> {
>>> -	LIST_HEAD(dispose);
>>> -	unsigned int count;
>>> -
>>> -	count = __nfsd_file_close_inode(inode, &dispose);
>>> -	trace_nfsd_file_close_inode(inode, count);
>>> -	nfsd_file_dispose_list_delayed(&dispose);
>>> +	if (nfsd_file_close_inode(inode))
>>> +		flush_delayed_fput();
>>> }
>>> 
>>> /**
>>> * nfsd_file_delayed_close - close unused nfsd_files
>>> * @work: dummy
>>> *
>>> - * Walk the LRU list and close any entries that have not been used since
>>> + * Walk the LRU list and destroy any entries that have not been used since
>>> * the last scan.
>>> */
>>> static void
>>> @@ -890,7 +880,7 @@ __nfsd_file_cache_purge(struct net *net)
>>> 		while (!IS_ERR_OR_NULL(nf)) {
>>> 			if (net && nf->nf_net != net)
>>> 				continue;
>>> -			nfsd_file_unhash_and_dispose(nf, &dispose);
>>> +			nfsd_file_unhash_and_queue(nf, &dispose);
>>> 			nf = rhashtable_walk_next(&iter);
>>> 		}
>>> 
>>> @@ -1054,8 +1044,10 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>> 	rcu_read_lock();
>>> 	nf = rhashtable_lookup(&nfsd_file_rhash_tbl, &key,
>>> 			       nfsd_file_rhash_params);
>>> -	if (nf)
>>> -		nf = nfsd_file_get(nf);
>>> +	if (nf) {
>>> +		if (!nfsd_file_lru_remove(nf))
>>> +			nf = nfsd_file_get(nf);
>>> +	}
>>> 	rcu_read_unlock();
>>> 	if (nf)
>>> 		goto wait_for_construction;
>>> @@ -1090,11 +1082,11 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>> 			goto out;
>>> 		}
>>> 		open_retry = false;
>>> -		nfsd_file_put_noref(nf);
>>> +		if (refcount_dec_and_test(&nf->nf_ref))
>>> +			nfsd_file_free(nf);
>>> 		goto retry;
>>> 	}
>>> 
>>> -	nfsd_file_lru_remove(nf);
>>> 	this_cpu_inc(nfsd_file_cache_hits);
>>> 
>>> 	status = nfserrno(nfsd_open_break_lease(file_inode(nf->nf_file), may_flags));
>>> @@ -1104,7 +1096,8 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>> 			this_cpu_inc(nfsd_file_acquisitions);
>>> 		*pnf = nf;
>>> 	} else {
>>> -		nfsd_file_put(nf);
>>> +		if (refcount_dec_and_test(&nf->nf_ref))
>>> +			nfsd_file_free(nf);
>>> 		nf = NULL;
>>> 	}
>>> 
>>> @@ -1131,7 +1124,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>> 	 * then unhash.
>>> 	 */
>>> 	if (status != nfs_ok || key.inode->i_nlink == 0)
>>> -		nfsd_file_unhash_and_put(nf);
>>> +		nfsd_file_unhash(nf);
>>> 	clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags);
>>> 	smp_mb__after_atomic();
>>> 	wake_up_bit(&nf->nf_flags, NFSD_FILE_PENDING);
>>> diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
>>> index b09ab4f92d43..a44ded06af87 100644
>>> --- a/fs/nfsd/trace.h
>>> +++ b/fs/nfsd/trace.h
>>> @@ -903,10 +903,11 @@ DEFINE_EVENT(nfsd_file_class, name, \
>>> 	TP_PROTO(struct nfsd_file *nf), \
>>> 	TP_ARGS(nf))
>>> 
>>> -DEFINE_NFSD_FILE_EVENT(nfsd_file_put_final);
>>> +DEFINE_NFSD_FILE_EVENT(nfsd_file_free);
>>> DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash);
>>> DEFINE_NFSD_FILE_EVENT(nfsd_file_put);
>>> -DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash_and_dispose);
>>> +DEFINE_NFSD_FILE_EVENT(nfsd_file_closing);
>>> +DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash_and_queue);
>>> 
>>> TRACE_EVENT(nfsd_file_alloc,
>>> 	TP_PROTO(
>>> -- 
>>> 2.37.3
>>> 
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> 
> -- 
> Jeff Layton <jlayton@xxxxxxxxxx>

--
Chuck Lever






