On Wed, Oct 30, 2024 at 06:16:22PM -1000, Linus Torvalds wrote: > So call me crazy, but due to entirely unrelated changes (the x86 > barrier_nospec optimization) I was doing some profiles to check that > everything looked fine. > > And just looking at kernel profiles, I noticed that > "generic_permission()" wasn't entirely in the noise. It was right > there along with strncpy_from_user() etc. Which is a bit surprising, > because it should be really cheap to check basic Unix permissions? > > It's all really just "acl_permission_check()" and that code is > actually fairly optimized, except that the whole > > vfsuid = i_uid_into_vfsuid(idmap, inode); > > to check whether we're the owner is *not* cheap. It causes a call to > make_vfsuid(), and it's just messy. I assume you ran your benchmark on baremetal so no containers or idmappings? I find this somewhat suprising. One thing to optimize here independent of your proposal would be to try and __always_inline make_vfsuid(). > > Which made me wonder: we already have code that says "if the Group and > Other permission bits are the same wrt the mask we are checking, don't > bother doing the expensive group checks". > > Why not extend that to "if any of the UGO choices are ok with the > permissions, why bother looking up ownership at all?" > > Now, there is one reason to look up the owner: POSIX ACL's don't > matter to owners, but do matter to others. > > But we could check for the cached case of "no POSIX ACLs" very > cheaply, and only do it for that case. > > IOW, a patch something like the attached. I think this should work (though I find the permission checking model in general to be an unruly mess - that's ignoring the fun hardening bits of it...). Can you send give me a proper SoB patch? Liberal comments in the code would be appreciated. I'd like to put this through LTP and some of the permission tests I've written over the years. > > No, I didn't really check whether it makes any difference at all. But > my gut feel is that a *lot* of permission checks succeed for any user, > with things like system directories are commonly drwxr-xr-x, so if you > want to check read or execute permissions, it really doesn't matter > whether you are the User, the Group, or Other. > > So thus the code does that > > unsigned int all; > all = mode & (mode >> 3); // combine g into o > all &= mode >> 6; // ... and u into o > > so now the low three bits of 'all' are the bits that *every* case has > set. And then > > if (!(mask & ~all & 7)) > return 0; > > basically says "if what we are asking for is not zero in any of those > bits, we're good". > > And it's entirely possible that I'm missing something silly and am > being just stupid. Somebody hit me with the clue bat if so. > > Linus