Re: [PATCH v2 0/3] mm: Make unregistration of super_block shrinker more faster

Qi Zheng <qi.zheng@xxxxxxxxx> · Wed, 7 Jun 2023 10:51:35 +0800

On 2023/6/7 06:02, Dave Chinner wrote:
On Wed, Jun 07, 2023 at 12:06:03AM +0300, Kirill Tkhai wrote:
On 06.06.2023 01:32, Dave Chinner wrote:
On Mon, Jun 05, 2023 at 10:02:46PM +0300, Kirill Tkhai wrote:
This patch set introduces a new scheme of shrinker unregistration. It allows to split
the unregistration in two parts: fast and slow. This allows to hide slow part from
a user, so user-visible unregistration becomes fast.

This fixes the -88.8% regression of stress-ng.ramfs.ops_per_sec noticed
by kernel test robot:

https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@xxxxxxxxx/

---

Kirill Tkhai (2):
       mm: Split unregister_shrinker() in fast and slow part
       fs: Use delayed shrinker unregistration

Did you test any filesystem other than ramfs?

Filesystems more complex than ramfs have internal shrinkers, and so
they will still be running the slow synchronize_srcu() - potentially
multiple times! - in every unmount. Both XFS and ext4 have 3
internal shrinker instances per mount, so they will still call
synchronize_srcu() at least 3 times per unmount after this change.

What about any other subsystem that runs a shrinker - do they have
context depedent shrinker instances that get frequently created and
destroyed? They'll need the same treatment.

Of course, all of shrinkers should be fixed. This patch set just aims to describe
the idea more wider, because I'm not sure most people read replys to kernel robot reports.

Thank you, Kirill.

This is my suggestion of way to go. Probably, Qi is right person to ask whether
we're going to extend this and to maintain f95bdb700bc6 in tree.

There is not much time. Unfortunately, kernel test robot reported this significantly late.

And that's why it should be reverted rather than trying to rush to
try to fix it.

I'm kind of tired of finding out about mm reclaim regressions only
when I see patches making naive and/or broken changes to subsystem
shrinker implementations without any real clue about what they are
doing.  If people/subsystems who maintain shrinker implementations
were cc'd on the changes to the shrinker implementation, this would
have all been resolved before merging occurred....

Lockless shrinker lists need a heap of supporting changes to be done
first so that they aren't reliant on synchronise_srcu() *at all*. If
the code was properly designed in the first place (i.e. dynamic
shrinker structures freed via call_rcu()), we wouldn't be in rushing
to fix weird regressions right now.

Can we please revert this and start again with a properly throught
out and reveiwed design?

I have no idea on whether to revert this, I follow the final decision of
the community.

From my personal point of view, I think it is worth sacrificing the
speed of unregistration alone compared to the benefits it brings
(lockless shrink, etc).

Of course, it would be better if there is a more perfect solution.
If you have a better idea, it might be better to post the code first for
discussion. Otherwise, I am afraid that if we just revert it, the
problem of shrinker_rwsem will continue for many years.

And hi Dave, I know you're mad that I didn't cc you in the original
patch. Sorry again. How about splitting shrinker-related codes into
the separate files? Then we can add a MAINTAINERS entry to it and add
linux-fsdevel@xxxxxxxxxxxxxxx to this entry? So that future people
will not miss to cc fs folks.

Qi.

-Dave.