On Mon, May 30, 2022 at 10:41:30PM -0400, Waiman Long wrote:
> On 5/30/22 03:49, Muchun Song wrote:
> > This version is rebased on v5.18.
> >
> > Since the following patchsets were applied, all kernel memory is charged
> > with the new obj_cgroup APIs.
> >
> > 	[v17,00/19] The new cgroup slab memory controller [1]
> > 	[v5,0/7] Use obj_cgroup APIs to charge kmem pages [2]
> >
> > But user memory allocations (LRU pages) still pin memcgs for a long time.
> > This exists at a larger scale and is causing recurring problems in the real
> > world: page cache doesn't get reclaimed for a long time, or is used by the
> > second, third, fourth, ... instance of the same job that was restarted into
> > a new cgroup every time. Unreclaimable dying cgroups pile up, waste memory,
> > and make page reclaim very inefficient.
> >
> > We can convert LRU pages and most other raw memcg pins to the objcg direction
> > to fix this problem, and then the LRU pages will not pin the memcgs.
> >
> > This patchset aims to make the LRU pages drop their reference to the memory
> > cgroup by using the obj_cgroup APIs. As a result, the number of dying
> > cgroups does not increase when we run the following test script.
> >
> > ```bash
> > #!/bin/bash
> >
> > dd if=/dev/zero of=temp bs=4096 count=1
> > cat /proc/cgroups | grep memory
> >
> > for i in {0..2000}
> > do
> > 	mkdir /sys/fs/cgroup/memory/test$i
> > 	echo $$ > /sys/fs/cgroup/memory/test$i/cgroup.procs
> > 	cat temp >> log
> > 	echo $$ > /sys/fs/cgroup/memory/cgroup.procs
> > 	rmdir /sys/fs/cgroup/memory/test$i
> > done
> >
> > cat /proc/cgroups | grep memory
> >
> > rm -f temp log
> > ```
> >
> > [1] https://lore.kernel.org/linux-mm/20200623015846.1141975-1-guro@xxxxxx/
> > [2] https://lore.kernel.org/linux-mm/20210319163821.20704-1-songmuchun@xxxxxxxxxxxxx/
> >
> > v4: https://lore.kernel.org/all/20220524060551.80037-1-songmuchun@xxxxxxxxxxxxx/
> > v3: https://lore.kernel.org/all/20220216115132.52602-1-songmuchun@xxxxxxxxxxxxx/
> > v2: https://lore.kernel.org/all/20210916134748.67712-1-songmuchun@xxxxxxxxxxxxx/
> > v1: https://lore.kernel.org/all/20210814052519.86679-1-songmuchun@xxxxxxxxxxxxx/
> > RFC v4: https://lore.kernel.org/all/20210527093336.14895-1-songmuchun@xxxxxxxxxxxxx/
> > RFC v3: https://lore.kernel.org/all/20210421070059.69361-1-songmuchun@xxxxxxxxxxxxx/
> > RFC v2: https://lore.kernel.org/all/20210409122959.82264-1-songmuchun@xxxxxxxxxxxxx/
> > RFC v1: https://lore.kernel.org/all/20210330101531.82752-1-songmuchun@xxxxxxxxxxxxx/
> >
> > v5:
> >  - Lots of improvements from Johannes, Roman and Waiman.
> >  - Fix lockdep warning reported by kernel test robot.
> >  - Add two new patches to do code cleanup.
> >  - Collect Acked-by and Reviewed-by from Johannes and Roman.
> >  - I didn't replace local_irq_disable/enable() with local_lock/unlock_irq() since
> >    local_lock/unlock_irq() takes a parameter; it needs more thinking to transform
> >    it to local_lock. It could be an improvement in the future.
>
> My comment about local_lock/unlock is just a note that
> local_irq_disable/enable() have to be eventually replaced. However, we need
> to think carefully where to put the newly added local_lock. It is perfectly
> fine to keep it as is and leave the conversion as a future follow-up.
>

Totally agree.

> Thank you very much for your work on this patchset.
>

Thanks.
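
For anyone rerunning the quoted test script: it prints the memory line of
/proc/cgroups before and after the loop, and the two outputs are compared by
eye. Below is a minimal sketch that extracts only the count, assuming the usual
/proc/cgroups column layout (subsys_name, hierarchy, num_cgroups, enabled). The
function name memcg_count is hypothetical and not part of the patchset.

```bash
#!/bin/bash

# Hypothetical helper, not part of the patchset.
# Print the num_cgroups column (third field) of the "memory" row.
# Reads /proc/cgroups by default; another file can be passed as $1,
# which is convenient for checking the parsing against saved output.
memcg_count() {
	awk '$1 == "memory" { print $3 }' "${1:-/proc/cgroups}"
}

# Possible usage around the test loop:
#   before=$(memcg_count)
#   ... run the mkdir/rmdir loop ...
#   after=$(memcg_count)
#   echo "memory num_cgroups: $before -> $after"
```

With the series applied, the expectation stated in the cover letter is that the
count after the loop stays close to the count before it.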