Re: [PATCH v6 06/17] fsnotify: generate pre-content permission event on open

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 12 Nov 2024 at 07:24, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> But this was an entirely inappropriate way to communicate your point, even with
> people who have been here a while and are used to being yelled at.

Fair enough.

I was probably way too frustrated, because I spent some of the last
few weeks literally trying to remove two cache misses from the
permission checking path.

And those two cache misses were for features that are in POSIX made
worse because of uid translations infrastructure used for containers.

Those are both things that people actually *use* in major ways.
Admittedly the particular optimization was exactly to _not_ bother
with looking up the exact uid details when we could tell early that it
was all unnecessary, but the point was that this was to optimize
something real, something important, and not some special case.

And here's an example of a profile I've been looking at (this is a
real load: the profile is literally my "make allmodconfig" build with
not a lot of actual rebuilding needed):

  1.30%  avc_has_perm_noaudit
  1.24%  selinux_inode_permission
  1.18%  link_path_walk.part.0.constprop.0
  0.99%  btrfs_getattr
  0.82%  clear_page_rep
  0.82%  security_inode_permission
  0.60%  dput
  0.60%  path_lookupat
  0.60%  __check_object_size
  0.54%  strncpy_from_user
  0.51%  inode_permission
  0.50%  generic_permission
  0.47%  kmem_cache_alloc_noprof
  0.47%  step_into
  0.46%  btrfs_permission
  0.45%  cp_new_stat
  0.44%  filename_lookup

and hey, you go "most of those are just around half a percent". Sure.
But add those things up, and you'll see that they add up to about 12%.

And yes, that 12% is 1/8th of the whole load. So it's not some "just
12% of the kernel footprint". It's literally 12% of one of the main
loads I run day-to-day, which is why I care about these paths so
deeply.

And look at the two top offenders. No, they aren't fsnotify. But guess
what they are? They are hooks in the VFS layer that the VFS code
cannot do anything about and cannot optimize as a result. They do
ABSOLUTELY NOTHING on this load, and they add no actual value.

If they had some kind of "I don't have any special security label"
test, the two top offenders (and down that list you can find a third
one) would likely just not exist at all. But they aren't "real" VFS
functions, they are just hooks into a black hole with random semantics
that is set by user-supplied tables, so we don't have that.

So this is why I'm putting my foot down. No more badly coded and badly
designed hooks that the VFS layer can't just optimize away for the
common case.

I can't fix the existing issues. But I can say "no more". If you want
new hooks, they had better have effectively *ZERO* overhead when they
aren't relevant.

And yes, I was too forceful. Sorry. But my foot stays down.

If you want a specialty hook for specialty uses, that hook needs to be
so *fundamentally* low-cost that I simply don't need to worry about
it.

It needs to have a flag that doesn't take a cache miss that says
"Nobody cares", and doesn't call out to random pointless functions
that cause I$ misses either.

I really wish I had made that requirement for the security people all
those years ago. Because 99% of the time, all that stupid selinux
noise is literally worthless and does nothing. But we don't have the
flag that says "nothing to see".

               Linus




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux