On Tue, Jan 12, 2021 at 11:55 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Tue, Jan 12, 2021 at 10:59:58AM -0800, Shakeel Butt wrote:
> > On Tue, Jan 12, 2021 at 9:12 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > >
> > > When a value is written to a cgroup's memory.high control file, the
> > > write() context first tries to reclaim the cgroup to size before
> > > putting the limit in place for the workload. Concurrent charges from
> > > the workload can keep such a write() looping in reclaim indefinitely.
> > >
> >
> > Is this observed on a real workload?
>
> Yes.
>
> On several production hosts running a particularly aggressive
> workload, we've observed writers to memory.high getting stuck for
> minutes while consuming a significant amount of CPU.
>

It would be good to add this to the commit message, or at least to
mention that it happened in production.

> > Any particular reason to remove !reclaimed?
>
> Its purpose so far was to allow successful reclaim to continue
> indefinitely, while restricting no-progress loops to 'nr_retries'.
>
> Without the first part, it doesn't really matter whether reclaim is
> making progress or not: we do a maximum of 'nr_retries' loops until
> the cgroup size meets the new limit, then exit one way or another.

Does it make sense to add this to the commit message as well? I am
fine either way.

For the patch:

Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
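
For readers following along, here is a rough user-space sketch of the
bounded loop described above: at most 'nr_retries' reclaim attempts,
whether or not reclaim makes progress. It is a toy model, not the
patch itself. The struct, the helper names (toy_cgroup, toy_reclaim,
toy_write_high) and the NR_RETRIES value are all assumptions made up
for illustration and are not kernel interfaces.

```c
/* Hypothetical retry budget; the value is an assumption for this sketch. */
#define NR_RETRIES 16

/* Toy cgroup that only tracks a page count. Not a kernel structure. */
struct toy_cgroup {
	unsigned long usage;		/* pages currently charged */
	unsigned long charge_rate;	/* pages the workload re-charges per loop */
};

/* Toy reclaim: free up to 'want' pages, then let the workload charge again. */
static unsigned long toy_reclaim(struct toy_cgroup *cg, unsigned long want)
{
	unsigned long freed = want < cg->usage ? want : cg->usage;

	cg->usage -= freed;
	cg->usage += cg->charge_rate;	/* models concurrent charges */
	return freed;
}

/*
 * Bounded variant: make at most NR_RETRIES reclaim attempts, regardless
 * of whether reclaim made progress, then exit one way or another.
 * Returns the number of reclaim attempts made.
 */
static unsigned int toy_write_high(struct toy_cgroup *cg, unsigned long high)
{
	unsigned int nr_retries = NR_RETRIES;
	unsigned int attempts = 0;

	while (cg->usage > high) {
		if (!nr_retries--)
			break;
		toy_reclaim(cg, cg->usage - high);
		attempts++;
	}
	return attempts;
}
```

In this model, a workload that keeps charging pages back (a non-zero
charge_rate) stops the loop after NR_RETRIES attempts instead of
spinning, while an idle cgroup reaches the limit on the first attempt.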