Re: [PATCH][RFC] NFS: Improving the access cache

Trond Myklebust wrote:

On Wed, 2006-04-26 at 09:14 -0400, Peter Staubach wrote:
Trond Myklebust wrote:

On Tue, 2006-04-25 at 21:14 -0400, Steve Dickson wrote:


Currently the NFS client caches ACCESS information on a per-uid basis,
which falls apart when different processes with different uids consistently
access the same directory. The end result is a storm of needless
ACCESS calls...

The attached patch uses a hash table to store the nfs_access_entry
entries, which causes the ACCESS request to happen only when the
attributes time out. The table is indexed by the sum of the
nfs_inode pointer and the cr_uid in the cred structure, which should
spread things out nicely for some decent scalability (although the
locking scheme may need to be reworked a bit). The table has 256 entries
of struct list_head, giving it a total size of 2k.
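
A minimal sketch of the indexing scheme described above (the helper and
constant names are illustrative, not taken from the patch; the 2k figure
assumes 8-byte list heads, i.e. 32-bit pointers):

	#define NFS_ACCESS_HASH_SIZE	256

	static struct list_head access_hash[NFS_ACCESS_HASH_SIZE];

	/* Bucket index: add the inode pointer and the uid, then mask
	 * down to the (power-of-two) table size. */
	static inline unsigned int
	nfs_access_hash(struct nfs_inode *nfsi, uid_t uid)
	{
		return ((unsigned long)nfsi + uid) &
			(NFS_ACCESS_HASH_SIZE - 1);
	}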
Instead of having the field 'id', why don't you let the nfs_inode keep a
small (hashed?) list of all the nfs_access_entry objects that refer to
it? That would speed up searches for cached entries.
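
Hypothetically, something like the following (the field name is
illustrative):

	struct nfs_inode {
		/* ... existing fields ... */
		struct list_head access_cache;	/* this inode's nfs_access_entry list */
	};

Each nfs_access_entry would carry a matching list_head, so a lookup
only ever walks the (typically short) list belonging to one inode.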

I agree with Neil's assessment that we need a bound on the size of the
cache. In fact, enforcing a bound is pretty much the raison d'être for a
global table (by which I mean that if we don't need a bound, then we
might as well cache everything in the nfs_inode).
How about instead changing that hash table into an LRU list, then adding
a shrinker callback (using set_shrinker()) to allow the VM to free up
entries when memory pressure dictates that it must?
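
A rough sketch of what that might look like against the 2.6-era
set_shrinker() interface (the global LRU, the counter, and the lru
field on nfs_access_entry are all hypothetical here, and locking is
omitted):

	static LIST_HEAD(nfs_access_lru);	/* oldest entries at the tail */
	static atomic_t nfs_access_nr_entries = ATOMIC_INIT(0);
	static struct shrinker *nfs_access_shrinker;

	static int nfs_access_shrink(int nr_to_scan, gfp_t gfp_mask)
	{
		while (nr_to_scan-- > 0 && !list_empty(&nfs_access_lru)) {
			struct nfs_access_entry *entry =
				list_entry(nfs_access_lru.prev,
					   struct nfs_access_entry, lru);
			list_del(&entry->lru);
			put_rpccred(entry->cred);
			kfree(entry);
			atomic_dec(&nfs_access_nr_entries);
		}
		/* Report how many freeable entries remain. */
		return atomic_read(&nfs_access_nr_entries);
	}

	/* at init time: */
	nfs_access_shrinker = set_shrinker(DEFAULT_SEEKS, nfs_access_shrink);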

Previous implementations have shown that a single per-inode linked list
is not scalable enough in certain situations: the list accumulates too
many entries, and searching it becomes a bottleneck. Adding a set of
hash buckets per inode also proved inefficient, because having enough
buckets per inode to make the hashing effective wastes a great deal of
space. A single, adequately sized, global set of hash buckets ended up
being the best solution.

What situations? AFAIA the number of processes in a typical setup is
almost always far smaller than the number of cached inodes.


The situation that doesn't scale is one where there are many different
users on the system, i.e. where there are more than just
a few users per file.  This can happen on compute servers or systems
used for timesharing.

For instance, on my laptop I'm currently running 146 processes, but
according to /proc/slabinfo I'm caching 330000 XFS inodes + 141500 ext3
inodes.
If I were to assume that a typical nfsroot system will show roughly the
same behaviour, then a typical bucket in Steve's 256-bucket
hash table would contain on the order of 2000 entries that I need to
search through every time I want to do an access call.
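
Spelling that arithmetic out: (330000 + 141500) inodes / 256 buckets
comes to roughly 1840 entries per bucket with a single uid per inode,
and every additional uid caching against the same inodes multiplies
that.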


For such a system, there need to be more than 256 hash buckets.  The number
of access cache hash buckets needs to be on a scale comparable with the
number of buckets used for similarly sized caches and tables.
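
As a purely hypothetical illustration of "adequately sized" (the sizing
rule and the target chain length are mine):

	#include <stdio.h>

	/* Pick a power-of-two bucket count aiming for an average
	 * chain length of at most ~16 entries. */
	static unsigned long pick_buckets(unsigned long expected_entries)
	{
		unsigned long size = 256;	/* the patch's current table size */

		while (size * 16 < expected_entries)
			size <<= 1;
		return size;
	}

	int main(void)
	{
		/* Trond's nfsroot estimate of ~470000 cached inodes: */
		printf("%lu buckets\n", pick_buckets(470000));	/* 32768 */
		return 0;
	}

The kernel already does something similar for the inode and dentry
caches, scaling their hash tables with available memory rather than
fixing them at compile time.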

      ps
