On 3/4/2022 12:48 am, Maciej W. Rozycki wrote:
On Thu, 10 Mar 2022, yaliang.wang@xxxxxxxxxxxxx wrote:
pgd page is freed by generic implementation pgd_free() since commit
f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
however, there are scenarios that the system uses more than one page as
the pgd table, in such cases the generic implementation pgd_free() won't
be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
will be set as "1", which will cause allocating two pages as the pgd
table. Well, at the same time, the generic implementation pgd_free()
just free one pgd page, which will result in the memory leak.
The memory leak can be easily detected by executing shell command:
"while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"
Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
Signed-off-by: Yaliang Wang <Yaliang.Wang@xxxxxxxxxxxxx>
As a critical regression shouldn't this have been marked for backporting
to stable branches?
Very yes please - this bug has been driving several of us at OpenWrt
crazy for quite[1] some[2] time now, mostly on Octeon devices. We'd
(wrongly) suspected the octeon-ethernet driver, but this morning finally
bisected it down to f9cb654cb550 and can confirm this patch fixes the
regression.
MIPS64 has essentially been broken/unusable for 8 kernel releases,
including two LTS kernels, since the original commit landed. Should
there not have been CI/tests that caught this? It's pretty major!
- Andrew
[1]
https://forum.openwrt.org/t/oom-killer-dnsmasq-when-physical-free-ram-remains/109351
[2]
https://forum.openwrt.org/t/upstream-kernel-memleak-5-10-octeon-ethernet-ko/111827