Re: [PATCH][RFC] NFS: Improving the access cache

Trond Myklebust wrote:

On Wed, 2006-04-26 at 10:15 -0400, Peter Staubach wrote:
What situations? AFAIA the number of processes in a typical setup is
almost always far smaller than the number of cached inodes.



The situation that doesn't scale is one where there are many different
users on the system.  It is the situation where there are more than just
a few users per file.  This can happen on compute servers or on systems
used for timesharing.

Yes, but the number of users <= the number of processes, which even on
those systems is almost always much, much less than the number of cached
inodes.


There isn't a 1-to-1 correspondence between processes and files.  A single
process accesses many different files, and many processes access the same
files.  Shared libraries are an easy example: they are accessed by multiple
processes, and each process accesses multiple shared libraries.

For instance, on my laptop I'm currently running 146 processes, but
according to /proc/slabinfo I'm caching 330000 XFS inodes + 141500 ext3
inodes.  If I were to assume that a typical nfsroot system will show
roughly the same behaviour, then a typical bucket in Steve's 256-bucket
hash table would contain on the order of 2000 entries that I need to
search through every time I want to do an access call.
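
For reference, assuming those cached inodes spread evenly across the
buckets, the arithmetic behind that estimate would be roughly:

    (330000 + 141500) / 256 buckets ~= 1842 entries per bucket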

For such a system, there need to be more than 256 hash buckets.  The number
of access cache hash buckets needs to be comparable to the number of hash
buckets used for similarly sized caches and tables.

The inode cache is the only similarly sized cache I can think of.

That is set either by the user, or it takes a default value of (total
memory size) / 2^14 buckets (see alloc_large_system_hash).  On a 1GB
system, that makes the default hash table size ~65536 entries.  I can't
see people wanting to put up with a 256K static hash table for access
caching too.
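
For reference, this is approximately how fs/inode.c of this era sizes the
inode hash table; the scale argument of 14 is what yields the (total
memory size) / 2^14 default (a sketch from memory of 2.6.x, not a
verbatim quote):

	inode_hashtable =
		alloc_large_system_hash("Inode-cache",
					sizeof(struct hlist_head),
					ihash_entries,	/* 0 => pick the default */
					14,		/* scale: memory / 2^14 buckets */
					HASH_EARLY,
					&i_hash_shift,
					&i_hash_mask,
					0);		/* no upper limit */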


I think that if the performance benefits warrant such a cache, then it is
worth it.  It is a very small percentage of the real memory on the system.

Previous informal studies showed that caching access privileges like
this was good at short-circuiting 90%+ of access calls.

However, we could always divide this further when sizing the access
cache.  If we assume that 1/2 or 1/4 or some percentage of the files
accessed will be on NFS-mounted file systems, then the access cache just
needs to be based on the number of NFS inodes, not the total number of
inodes.
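
As a purely illustrative example, using the inode counts from the laptop
above and assuming that 1/4 of the cached inodes are NFS inodes:

    (330000 + 141500) / 4 ~= 118000 NFS inodes

so, for a given target chain length, a cache sized this way needs only
about a quarter as many buckets as one sized to the total inode count.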

Furthermore, note that the inode cache is only searched when
initialising a dentry. It is not searched on _every_ traversal of a path
element.


Very true, which points out the importance of making lookups in the
access cache correct and fast.  The number of entries in the access cache
will be at least the number of NFS inodes in the system and could be much
higher, depending upon whether the system is a single-user, desktop-style
system or a multi-user shared system.  The keys to making this cache cheap
are a cheap hash algorithm and short hash chains.
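
A minimal sketch of the sort of cheap hash being argued for, assuming a
hypothetical ACCESS_HASH_BITS and a hypothetical nfs_access_hashval()
keyed on the inode and the requesting uid (the names are illustrative,
not taken from Steve's patch):

	#include <linux/hash.h>

	#define ACCESS_HASH_BITS	16	/* hypothetical: 2^16 buckets */

	/* Mix the inode pointer with the uid and fold the result down to
	 * ACCESS_HASH_BITS bits.  hash_long() is the kernel's multiplicative
	 * hash from <linux/hash.h>, so a lookup pays only a few instructions
	 * before walking the (ideally short) chain. */
	static inline unsigned int
	nfs_access_hashval(const struct inode *inode, uid_t uid)
	{
		return hash_long((unsigned long)inode ^ uid, ACCESS_HASH_BITS);
	}

Short chains then come from choosing ACCESS_HASH_BITS so that the bucket
count is on the same scale as the expected number of NFS inodes.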

      ps
