Re: generic_permission() optimization

On Thu, Nov 07, 2024 at 09:54:36AM -1000, Linus Torvalds wrote:
> On Thu, 31 Oct 2024 at 12:31, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Added some stats, and on my load (reading email in the web browser,
> > some xterms and running an allmodconfig kernel build), I get about a
> > 45% hit-rate for the fast-case: out of 44M calls to
> > generic_permission(), about 20M hit the fast-case path.
> 
> So the 45% hit rate really bothered me, because on the load I was
> testing I really thought it should be 100%.
> 
> And in fact, sometimes it *was* 100% when I did profiles, and I never
> saw the slow case at all. So I saw that odd bimodal behavior where
> sometimes about half the accesses went through the slow path, and
> sometimes none of them did.
> 
> It took me way too long to realize why that was the case:  the quick
> "do we have ACL's" test works wonderfully well when the ACL
> information is cached, but the cached case isn't always filled in.
> 
> For some unfathomable reason I just mindlessly thought that "if the
> ACL info isn't filled in, and we will go to the slow case, it now
> *will* be filled in, so next time around we'll have it in the cache".
> 
> But that was just silly of me. We may never call "check_acl()" at all,
> because if we do the lookup as the owner, we never even bother to look
> up any ACL information:
> 
>         /* Are we the owner? If so, ACL's don't matter */
> 
> So next time around, the ACL info *still* won't be filled in, and so
> we *still* won't take the fastpath.
> 
> End result: that patch is not nearly as effective as I would have
> liked. Yes, it actually gets reasonable hit-rates, but the
> ACL_NOT_CACHED state ends up being a lot stickier than my original
> mental model incorrectly thought it would be.
> 
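
To make the stickiness described above concrete, here is a toy userspace
model. It is not kernel code: every name (toy_inode, toy_permission, the
two-state enum) is made up for illustration, and the only behavior it
borrows from the quoted text is that the owner never triggers the ACL
lookup, so the cache never gets filled.

```c
#include <stdbool.h>

/* Toy stand-ins; none of these are the kernel's real definitions. */
enum toy_acl_state { TOY_ACL_NOT_CACHED, TOY_ACL_CACHED_NONE };

struct toy_inode {
    unsigned int owner_uid;
    enum toy_acl_state acl;
};

struct toy_stats {
    unsigned long fast;
    unsigned long slow;
};

/* Slow path: the ACL lookup that fills the cache only runs for non-owners. */
static void toy_slow_path(struct toy_inode *inode, unsigned int uid)
{
    if (uid == inode->owner_uid)
        return; /* owner: ACLs don't matter, so the cache stays unfilled */
    inode->acl = TOY_ACL_CACHED_NONE; /* models the ACL lookup populating the cache */
}

/* Fast path is only available once the "no ACL" answer has been cached. */
static void toy_permission(struct toy_inode *inode, unsigned int uid,
                           struct toy_stats *st)
{
    if (inode->acl == TOY_ACL_CACHED_NONE) {
        st->fast++;
        return;
    }
    st->slow++;
    toy_slow_path(inode, uid);
}
```

In this model an owner can call toy_permission() any number of times and
every call lands in the slow counter; a single lookup by a different uid
is what finally unlocks the fast path.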

How about filesystems maintaining a flag: IOP_EVERYONECANTRAREVERSE?
The name is a keyboardful and not the actual proposal.

Rationale:
As I read it, generic_permission() gets called for every path component,
and for almost all of them the only question is whether the caller can
traverse.

It so happens that for the vast majority of real path components the x
bit is set for *everyone*. Even in the case of /home/$user/crap, while
the middle dir has x only for the owner and maybe the group, everything
*below* tends to also be all x.

I just did a kernel build while poking at the state with bpftrace:
bpftrace -e 'kprobe:generic_permission { @[(((struct inode *)arg1)->i_mode & 0x49) == 0x49] = count(); }'

result:
@[0]: 5623736
@[1]: 64867147

IOW, in 92% of calls everyone had x. Note this also counts calls that
are not traversal checks, so the hit ratio for traversal alone is
presumably higher. I don't use ACLs here, so they were of no consequence
anyway.

So if a filesystem cares to be faster, then when instantiating an inode
or getting setattr called on it, it can (re)compute whether anything is
blocking x for anyone. If nothing is in the way, it can set the flag and
allow link_path_walk to skip everything; otherwise it *unsets* the flag
(as needed).

This is completely transparent to filesystems which don't participate.

So that would be my proposal, no interest in coding it.