Guys, we've been working on scaling path lookup again, and 3.12 will be pretty kick-ass on many loads.

HOWEVER, there's one really annoying case: we do all the path lookup in RCU mode, where we can avoid touching any dentry state etc. at all, but then to "finalize" the path lookup in order to use it long-term, we have to increment reference counts etc.

That can be somewhat expensive for high-scalability cases, since it will force-dirty (and thus make exclusive in the cache) the core dentry cache entry.

And that's all fine for a lot of the common cases: when you do things like open a file, you really do have to increment the reference count, and there's no question that we do the right thing. And most people who really open files tend to work on them, so getting the dentry exclusively is fine.

There's one very important exception, though: things like "stat()" and "access()" do *not* open a file in order to hold on to it. And they are quite common (stat() in particular), _and_ they are often done on files that are shared and then passed by rather than worked on.

Now, interestingly, both stat() and access() actually _already_ do some kind of retry for special circumstances (LOOKUP_REVAL), and that really looks like it could be extended to just do the whole lookup in RCU mode too, and thus avoid ever finalizing the pathname.

However, the LSM interface doesn't really allow for that.

So how do people feel about passing a "mode" value for security_inode_getattr(), the same way we do for security_inode_permission()? The only flag would be that MAY_NOT_BLOCK flag that gets set for RCU lookup, and the semantics would be the same (return -ECHILD if you need to sleep).

Attached is a patch that adds the interface, and then makes all security layers just do that ECHILD thing (and nobody actually sets MAY_NOT_BLOCK yet). So it's purely preparatory.
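
For illustration only, here is a minimal userspace sketch of the MAY_NOT_BLOCK / -ECHILD semantics described above. It is not the attached patch and not kernel code: struct mock_inode, mock_security_inode_getattr() and mock_stat() are hypothetical names invented for this sketch, and MAY_NOT_BLOCK is redefined locally so it builds outside the kernel. The attached patch is described as returning -ECHILD from every security layer whenever the flag is set; the mock hook below only does so when the check would actually need to sleep, which is an assumption about how a later, finer-grained hook might behave.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mirrors the kernel's MAY_NOT_BLOCK flag; redefined here (assumption)
 * so that the sketch compiles as an ordinary userspace program. */
#define MAY_NOT_BLOCK 0x00000080

/* Hypothetical stand-in for an inode; NOT the kernel's struct inode. */
struct mock_inode {
    bool check_needs_sleep; /* the security check for this inode may block */
    int  verdict;           /* result of the full, blocking check (0 or -errno) */
};

/* Hypothetical hook modelled on the proposed
 * security_inode_getattr(..., mode) interface. */
static int mock_security_inode_getattr(const struct mock_inode *inode, int mode)
{
    /* In RCU mode we must not sleep: ask the caller to retry instead. */
    if ((mode & MAY_NOT_BLOCK) && inode->check_needs_sleep)
        return -ECHILD;
    return inode->verdict;
}

/* Hypothetical stat()-like caller: try the non-blocking path first and
 * fall back to the blocking path only when the hook returns -ECHILD.
 * *fallbacks counts how often the slow path was taken. */
static int mock_stat(const struct mock_inode *inode, int *fallbacks)
{
    int err = mock_security_inode_getattr(inode, MAY_NOT_BLOCK);

    if (err == -ECHILD) {
        (*fallbacks)++;
        err = mock_security_inode_getattr(inode, 0);
        /* Without MAY_NOT_BLOCK the hook is allowed to sleep, so it
         * should never ask for another retry. */
        assert(err != -ECHILD);
    }
    return err;
}
```

In this sketch an inode whose check never sleeps is answered entirely on the non-blocking path, while one that needs to sleep costs a single extra call on the fallback path. That is the same contract security_inode_permission() already has with MAY_NOT_BLOCK.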

The patch is also insufficient, because we'll need the same kind of flag for the low-level filesystem "i_op->getattr()" call, but that's an independent issue.

Al, any comments?

Linus

Attachment: patch.diff