Re: [PATCH v4 1/1] mm: vmscan: Reduce throttling due to a failure to make progress

On 30.12.21 00:45, Andrew Morton wrote:
> On Tue, 28 Dec 2021 11:04:18 +0100 Thorsten Leemhuis <regressions@xxxxxxxxxxxxx> wrote:
> 
>> Hi, this is your Linux kernel regression tracker speaking.
>>
>> On 02.12.21 16:06, Mel Gorman wrote:
>>> Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
>>> problems due to reclaim throttling for excessive lengths of time.
>>> In Alexey's case, a memory hog that should go OOM quickly stalls for
>>> several minutes before stalling. In Mike and Darrick's cases, a small
>>> memcg environment stalled excessively even though the system had enough
>>> memory overall.
>>
>> Just wondering: this patch afaics is now in -mm and  Linux next for
>> nearly two weeks. Is that intentional? I had expected it to be mainlined
>> with the batch of patches Andrew mailed to Linus last week, but it
>> wasn't among them.
> 
> I have it queued for 5.17-rc1.
> 
> There is still time to squeeze it into 5.16, just, with a cc:stable. 
> 
> Alternatively we could merge it into 5.17-rc1 with a cc:stable, so it
> will trickle back with less risk to the 5.17 release.
> 
> What do people think?

CCing Linus, to make sure he's aware of this.

Maybe I'm totally missing something, but I'm a bit confused by what you
wrote, as the regression afaik was introduced between v5.15..v5.16-rc1.
So I assume this is what you meant:

```
I have it queued for 5.17-rc1.

There is still time to squeeze it into 5.16.

Alternatively we could merge it into 5.17-rc1 with a cc:stable, so it
will trickle back with less risk to the 5.16 release.

What do people think?
```

I'll leave the individual risk evaluation of the patch to others. If the
fix is risky, waiting for 5.17 is fine for me.

But hmmm, one remark regarding the "could merge it into 5.17-rc1 with a
cc:stable" idea: is that really "less risk", as you stated?

If we get it into rc8 (which is still possible, even if a bit hard due
to the new year festivities), it will get at least one week of testing.

If the fix waits for the next merge window, it all depends on how the
timing works out. But it's easy to picture a worst case: the fix is
only merged on the Friday evening before Linus releases 5.17-rc1, makes
it into a stable-rc right afterwards (say a day or two after 5.17-rc1
is out), and from there goes into a 5.16.y release on Thursday. That
IMHO would mean fewer days of testing in the end (and there is a weekend
in this period as well).

Waiting obviously will also mean that users of 5.16 and 5.16.y will
likely have to face this regression for at least two and a half weeks,
unless you send the fix early and Greg backports it before rc1 (which he
afaics does if there are good reasons). Yes, it's `just` a performance
regression, so it might not stop anyone from running Linux 5.16 -- but
it's one that three people separately reported in the 5.16 devel cycle,
so others will likely encounter it as well if we leave it unfixed in
5.16. This will likely annoy some people, especially if they invest time
in bisecting it, only to find out that the fourth iteration of the fix
for the regression has been available since December the 2nd.

Ciao, Thorsten


