So call me crazy, but due to entirely unrelated changes (the x86 barrier_nospec optimization) I was doing some profiles to check that everything looked fine. And just looking at kernel profiles, I noticed that "generic_permission()" wasn't entirely in the noise. It was right there along with strncpy_from_user() etc. Which is a bit surprising, because it should be really cheap to check basic Unix permissions? It's all really just "acl_permission_check()" and that code is actually fairly optimized, except that the whole vfsuid = i_uid_into_vfsuid(idmap, inode); to check whether we're the owner is *not* cheap. It causes a call to make_vfsuid(), and it's just messy. Which made me wonder: we already have code that says "if the Group and Other permission bits are the same wrt the mask we are checking, don't bother doing the expensive group checks". Why not extend that to "if any of the UGO choices are ok with the permissions, why bother looking up ownership at all?" Now, there is one reason to look up the owner: POSIX ACL's don't matter to owners, but do matter to others. But we could check for the cached case of "no POSIX ACLs" very cheaply, and only do it for that case. IOW, a patch something like the attached. No, I didn't really check whether it makes any difference at all. But my gut feel is that a *lot* of permission checks succeed for any user, with things like system directories are commonly drwxr-xr-x, so if you want to check read or execute permissions, it really doesn't matter whether you are the User, the Group, or Other. So thus the code does that unsigned int all; all = mode & (mode >> 3); // combine g into o all &= mode >> 6; // ... and u into o so now the low three bits of 'all' are the bits that *every* case has set. And then if (!(mask & ~all & 7)) return 0; basically says "if what we are asking for is not zero in any of those bits, we're good". And it's entirely possible that I'm missing something silly and am being just stupid. Somebody hit me with the clue bat if so. Linus
fs/namei.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/fs/namei.c b/fs/namei.c index 4a4a22a08ac2..6aeabde0ec9f 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -326,6 +326,25 @@ static int check_acl(struct mnt_idmap *idmap, return -EAGAIN; } +/* + * Very quick optimistic "we know we have no ACL's" check. + * + * Note that this is purely for ACL_TYPE_ACCESS, and purely + * for the "we have cached that there are no ACLs" case. + * + * If this returns true, we know there are no ACLs. But if + * it returns false, we might still not have ACLs (it could + * be the is_uncached_acl() case). + */ +static inline bool no_acl_inode(struct inode *inode) +{ +#ifdef CONFIG_FS_POSIX_ACL + return likely(!READ_ONCE(inode->i_acl)); +#else + return true; +#endif +} + /** * acl_permission_check - perform basic UNIX permission checking * @idmap: idmap of the mount the inode was found from @@ -348,6 +367,19 @@ static int acl_permission_check(struct mnt_idmap *idmap, unsigned int mode = inode->i_mode; vfsuid_t vfsuid; + /* + * Common cheap case: everybody has the requested + * rights, and there are no ACLs to check. No need + * to do any owner/group checks. + */ + if (no_acl_inode(inode)) { + unsigned int all; + all = mode & (mode >> 3); // combine g into o + all &= mode >> 6; // ... and u into o + if (!(mask & ~all & 7)) + return 0; + } + /* Are we the owner? If so, ACL's don't matter */ vfsuid = i_uid_into_vfsuid(idmap, inode); if (likely(vfsuid_eq_kuid(vfsuid, current_fsuid()))) {