Re: [PATCH 3/3] s390/mm: fix mis-accounting of pgtable_bytes

Li Wang <liwang@xxxxxxxxxx> · Wed, 31 Oct 2018 14:43:38 +0800

On Wed, Oct 31, 2018 at 2:31 PM, Martin Schwidefsky <schwidefsky@xxxxxxxxxx> wrote:
On Wed, 31 Oct 2018 14:18:33 +0800
Li Wang <liwang@xxxxxxxxxx> wrote:

> On Tue, Oct 16, 2018 at 12:42 AM, Martin Schwidefsky <schwidefsky@xxxxxxxxxx> wrote:

> 

> > In case a fork or a clone system call fails in copy_process and the error

> > handling does the mmput() at the bad_fork_cleanup_mm label, the

> > following warning messages will appear on the console:

> >

> >   BUG: non-zero pgtables_bytes on freeing mm: 16384

> >

> > The reason for that is the tricks we play with mm_inc_nr_puds() and

> > mm_inc_nr_pmds() in init_new_context().

> >

> > A normal 64-bit process has 3 levels of page table; the p4d level and

> > the pud level are folded. On process termination the free_pud_range()

> > function in mm/memory.c will subtract 16KB from pgtable_bytes with a

> > mm_dec_nr_puds() call, but there is not actually a pud table.

> >

> > One issue with this is the fact that pgtable_bytes is usually off

> > by a few kilobytes, but the more severe problem is that for a failed

> > fork or clone the free_pgtables() function is not called. In this case

> > there is no mm_dec_nr_puds() or mm_dec_nr_pmds() that go together with

> > the mm_inc_nr_puds() and mm_inc_nr_pmds() in init_new_context().

> > The pgtable_bytes will be off by 16384 or 32768 bytes and we get the

> > BUG message. The message itself is purely cosmetic, but annoying.

> >

> > To fix this, override the mm_pmd_folded(), mm_pud_folded() and mm_p4d_folded()

> > functions to check for the true size of the address space.

> >  

> 

> I can confirm that it fixes the problem: the warning message is gone

> after applying this patch on s390x. I also ran the LTP syscalls/cve tests

> for the patch set on x86_64, and there are no new regressions.

> 

> Tested-by: Li Wang <liwang@xxxxxxxxxx>
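
For readers following the accounting argument in the quoted patch description, here is a minimal user-space sketch of it. It is a toy model, not kernel code: struct mm_model, pud_folded(), residual_bytes() and the size_aware flag are hypothetical names made up for illustration, and it assumes the pre-charge in init_new_context() and the later decrement in free_pud_range() both go through the same folded check. It only covers the +16384 case for a 3-level process; it does not attempt to explain the -16384 report.

```c
#include <stdio.h>

/* One 16KB table, the amount named in the patch description. */
#define TABLE_BYTES 16384L

/* Hypothetical stand-in for the parts of mm_struct that matter here. */
struct mm_model {
    long pgtables_bytes;  /* models the counter printed in the BUG message */
    int levels;           /* 3 for a normal 64-bit process, per the thread */
    int size_aware;       /* 1: folded check looks at the real table depth */
};

/* Assumption: with the fix, a 3-level mm reports its pud level as folded. */
static int pud_folded(const struct mm_model *mm)
{
    return mm->size_aware && mm->levels <= 3;
}

static void inc_nr_puds(struct mm_model *mm)
{
    if (!pud_folded(mm))
        mm->pgtables_bytes += TABLE_BYTES;
}

static void dec_nr_puds(struct mm_model *mm)
{
    if (!pud_folded(mm))
        mm->pgtables_bytes -= TABLE_BYTES;
}

/* Returns what is left in the counter when the mm is freed. */
long residual_bytes(int size_aware, int fork_fails)
{
    struct mm_model mm = { 0, 3, size_aware };

    inc_nr_puds(&mm);         /* pre-charge, as in init_new_context() */
    if (!fork_fails)
        dec_nr_puds(&mm);     /* teardown, as in free_pud_range() */
    return mm.pgtables_bytes; /* a failed fork skips the teardown */
}

/* Prints the residual for all four combinations. */
void print_residuals(void)
{
    for (int aware = 0; aware <= 1; aware++)
        for (int fails = 0; fails <= 1; fails++)
            printf("size_aware=%d fork_fails=%d residual=%ld\n",
                   aware, fails, residual_bytes(aware, fails));
}
```

In this model only the combination size_aware=0, fork_fails=1 leaves a non-zero residual, which matches the 16384 in the original warning. Once the folded check reflects the real table depth, both the increment and the decrement become no-ops and the counter stays balanced whether or not the teardown path runs.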

Thanks for testing. Unfortunately Heiko reported another issue yesterday

with the patch applied. This time the other way around:

BUG: non-zero pgtables_bytes on freeing mm: -16384

Okay, is the problem still triggered by LTP/cve-2017-17052.c?
I tried this patch on my platform and it works. My test environment is:

# lscpu
Architecture:          s390x
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Big Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s) per book:    1
Book(s) per drawer:    1
Drawer(s):             2
Vendor ID:             IBM/S390
Machine type:          2827
CPU dynamic MHz:       5504
CPU static MHz:        5504
BogoMIPS:              2913.00
Hypervisor vendor:     vertical
Virtualization type:   full
Dispatching mode:      horizontal
L1d cache:             96K
L1i cache:             64K
L2d cache:             1024K
L2i cache:             1024K
L3 cache:              49152K
L4 cache:              393216K
Flags:                 esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te sie

I am trying to understand how this can happen. For now I would like to

keep the patch on hold in case they need another change.

Sure.

-- 

blue skies,

   Martin.

"Reality continues to ruin my life." - Calvin.

-- 
Regards,
Li Wang