[rfc] scalable write-free cached path walking

Hi,

The status of my vfs scalability patchset is that I haven't had
much time to move it along since the last posting. But I have found
that scalability of some operations actually got worse! For example,
parallel open/close on different files in the same directory.

The reason is, as Al suspected, that I'm using d_lock for protecting
d_count rather than atomic d_count, and scalability in the contended
case is worse for the lock than the atomic.
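
To make the difference concrete, here is a rough user-space sketch of
the two ways of bumping a count. It is not kernel code: the toy_*
names and structs are made up for illustration, an atomic_flag stands
in for d_lock, and C11 atomics stand in for the kernel's atomic_t.
The locked version touches the lock word twice per get and makes
other CPUs spin while it is held; the atomic version is a single
read-modify-write.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry_locked {
    atomic_flag lock;   /* plays the role of d_lock */
    int count;          /* plays the role of d_count, protected by lock */
};

struct toy_dentry_atomic {
    atomic_int count;   /* plays the role of an atomic d_count */
};

/* Take a reference under the spinlock: acquire, increment, release. */
static void toy_get_locked(struct toy_dentry_locked *d)
{
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
        ;   /* contended callers spin here */
    d->count++;
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Take a reference with one atomic read-modify-write, no lock. */
static void toy_get_atomic(struct toy_dentry_atomic *d)
{
    atomic_fetch_add_explicit(&d->count, 1, memory_order_relaxed);
}

/* Drop a reference taken with toy_get_atomic(). */
static void toy_put_atomic(struct toy_dentry_atomic *d)
{
    int old = atomic_fetch_sub_explicit(&d->count, 1, memory_order_release);
    assert(old > 0);    /* dropping a reference that was never taken is a bug */
    (void)old;
}
```

Either way the cacheline holding the count is written on every get
and put, so both variants still bounce that line between CPUs.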

I would rather not have to move it back to an atomic, because it would
probably make the single threaded performance worse. And also we
still have cacheline pingpong during path walk.

So I looked at a few ways to go. Firstly, it would be possible to add
a more sophisticated reference counting scheme to the path structure;
then you could clone the cwd refs for cwd lookup, hold the fd open
for fdat lookups, etc. This actually gets pretty complex with all the
path games that the lookup code plays. It would be possible, but OTOH
there will still be pingpong for all common path elements.

So another possibility is to use RCU and avoid taking references on
the path elements we look up. I think this should work in the common
cases. When we run into trouble, we can proceed as the existing code
does today.

Trouble would be cases where a real lookup is required, an fs call
is required, we need to look at the inode or call a security
policy, or possibly where we traverse a mount.
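
The control flow I have in mind looks roughly like the toy below. It
is only a single-threaded user-space sketch of the "try without
references, fall back on trouble" shape: there is no real RCU here,
and toy_node, needs_slow_path and the toy_walk* functions are all
made-up names. In the kernel the fast walk would run inside an RCU
read-side critical section and would have to revalidate what it read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical cached path element; NOT the kernel's struct dentry. */
struct toy_node {
    const char *name;
    bool needs_slow_path;   /* "trouble": fs call, security hook, mount, ... */
    int refcount;
    struct toy_node *children[TOY_MAX_CHILDREN];  /* NULL-terminated if not full */
};

/* Read-only child lookup by name; NULL means "not cached". */
static struct toy_node *toy_child(struct toy_node *dir, const char *name)
{
    for (size_t i = 0; i < TOY_MAX_CHILDREN && dir->children[i]; i++)
        if (strcmp(dir->children[i]->name, name) == 0)
            return dir->children[i];
    return NULL;
}

/* Fast walk: writes nothing on intermediate elements.
 * Returns NULL to ask the caller to fall back to the slow walk. */
static struct toy_node *toy_walk_fast(struct toy_node *root,
                                      const char *const *comps, size_t n)
{
    struct toy_node *cur = root;

    assert(root != NULL);
    for (size_t i = 0; i < n; i++) {
        struct toy_node *next = toy_child(cur, comps[i]);
        if (!next || next->needs_slow_path)
            return NULL;            /* trouble: give up, no state to undo */
        cur = next;
    }
    cur->refcount++;                /* the only write: the final element */
    return cur;
}

/* Slow walk: takes and drops a reference on every element visited. */
static struct toy_node *toy_walk_slow(struct toy_node *root,
                                      const char *const *comps, size_t n)
{
    struct toy_node *cur = root;

    assert(root != NULL);
    cur->refcount++;
    for (size_t i = 0; i < n; i++) {
        struct toy_node *next = toy_child(cur, comps[i]);
        if (!next) {
            cur->refcount--;
            return NULL;            /* a real lookup would happen here */
        }
        next->refcount++;
        cur->refcount--;
        cur = next;
    }
    return cur;                     /* caller holds one reference */
}

/* Try the write-free walk first, then fall back. */
static struct toy_node *toy_walk(struct toy_node *root,
                                 const char *const *comps, size_t n)
{
    struct toy_node *res = toy_walk_fast(root, comps, n);
    return res ? res : toy_walk_slow(root, comps, n);
}
```

The point of giving up early in the fast walk is that it has not
modified anything yet, so restarting with the reference-taking walk
needs no cleanup.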

I think we can copy data necessary for exec_permission_lite from the
inode into the dcache in order to make those checks possible, or even
make the inode RCU freed to check some fields, if we're careful with
permission/ownership changes etc (might require a seqlock).
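
As a rough idea of what the seqlock part might look like, here is a
user-space sketch of a sequence-counted copy of the permission bits.
Everything in it is an assumption for illustration: toy_perm and its
functions are invented names, C11 atomics stand in for the kernel's
seqlock primitives, a single writer is assumed, and the check is a
simplified owner/other execute-bit test rather than the real
exec_permission_lite logic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical copy of inode permission data kept next to the dentry. */
struct toy_perm {
    atomic_uint seq;    /* even: stable, odd: update in progress */
    atomic_uint mode;   /* permission bits, e.g. 0755 */
    atomic_uint uid;    /* owner */
};

/* Writer side (chmod/chown path). Assumes writers are serialised elsewhere. */
static void toy_perm_write(struct toy_perm *p, unsigned mode, unsigned uid)
{
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);

    assert((s & 1) == 0);   /* no other writer may be mid-update */
    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->mode, mode, memory_order_relaxed);
    atomic_store_explicit(&p->uid, uid, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/* Reader side: no writes to shared memory; retry if a writer interfered. */
static bool toy_may_exec(struct toy_perm *p, unsigned uid)
{
    unsigned s, mode, owner;

    do {
        s = atomic_load_explicit(&p->seq, memory_order_acquire);
        mode = atomic_load_explicit(&p->mode, memory_order_relaxed);
        owner = atomic_load_explicit(&p->uid, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
    } while ((s & 1) ||
             s != atomic_load_explicit(&p->seq, memory_order_relaxed));

    /* Simplified: owner execute bit for the owner, "other" bit otherwise. */
    return owner == uid ? (mode & 0100) != 0 : (mode & 0001) != 0;
}
```

The reader never writes the shared line, so a path walk doing this
check on each directory would not cause pingpong unless permissions
are actually changing.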

At that point, we should be able to look up "easy" cached paths
with no cacheline writes except on the final element.

I haven't quite got anything working yet, but I'm tinkering with it.
Anyone had thoughts along the same lines? I'll try to have a patch
out soon so people can concretely destroy my dreams... but any ideas
are welcome at this point.

Thanks,
Nick

