[ adding -stable ]

The patch below is upstream as commit fc5f9d5f151c ("x86/mm: Fix boot
crash caused by incorrect loop count calculation in sync_global_pgds()").
The referenced bug potentially affects all KASLR-enabled kernels with
more than 512GB of memory. Please apply this patch to all current
-stable kernels.

On Fri, May 5, 2017 at 1:11 AM, tip-bot for Baoquan He <tipbot@xxxxxxxxx> wrote:
> Commit-ID:  fc5f9d5f151c9fff21d3d1d2907b888a5aec3ff7
> Gitweb:     http://git.kernel.org/tip/fc5f9d5f151c9fff21d3d1d2907b888a5aec3ff7
> Author:     Baoquan He <bhe@xxxxxxxxxx>
> AuthorDate: Thu, 4 May 2017 10:25:47 +0800
> Committer:  Ingo Molnar <mingo@xxxxxxxxxx>
> CommitDate: Fri, 5 May 2017 08:21:24 +0200
>
> x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds()
>
> Jeff Moyer reported that on his system, with two memory regions (0~64G
> and 1T~1T+192G) and the kernel option "memmap=192G!1024G" added,
> enabling KASLR makes the system hang intermittently during boot, while
> booting with 'nokaslr' does not.
>
> The back trace is:
>
>  Oops: 0000 [#1] SMP
>
>  RIP: memcpy_erms()
>  [ .... ]
>  Call Trace:
>   pmem_rw_page()
>   bdev_read_page()
>   do_mpage_readpage()
>   mpage_readpages()
>   blkdev_readpages()
>   __do_page_cache_readahead()
>   force_page_cache_readahead()
>   page_cache_sync_readahead()
>   generic_file_read_iter()
>   blkdev_read_iter()
>   __vfs_read()
>   vfs_read()
>   SyS_read()
>   entry_SYSCALL_64_fastpath()
>
> This crash happens because the for-loop count calculation in
> sync_global_pgds() is not correct. When a mapping area crosses PGD
> entries, we should calculate the starting address of the region that
> the next PGD covers and use it as the next loop index, rather than
> simply adding PGDIR_SIZE. The old code works correctly only if the
> mapping area is an exact multiple of PGDIR_SIZE; otherwise the end of
> the region can be skipped, so it is never synchronized from the kernel
> PGD init_mm.pgd to all other processes.
>
> In Jeff's system, the emulated pmem area [1024G, 1216G) is smaller than
> PGDIR_SIZE. 'nokaslr' works because PAGE_OFFSET is 1T-aligned, which
> makes this area map inside a single PGD entry. With KASLR enabled, the
> area can cross two PGD entries, and the second PGD entry is then not
> synced to the other processes. That is why we saw an empty PGD.
>
> Fix it.
>
> Reported-by: Jeff Moyer <jmoyer@xxxxxxxxxx>
> Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Brian Gerst <brgerst@xxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Dave Young <dyoung@xxxxxxxxxx>
> Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
> Cc: H. Peter Anvin <hpa@xxxxxxxxx>
> Cc: Jinbum Park <jinb.park7@xxxxxxxxx>
> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Thomas Garnier <thgarnie@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Yasuaki Ishimatsu <yasu.isimatu@xxxxxxxxx>
> Cc: Yinghai Lu <yinghai@xxxxxxxxxx>
> Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@xxxxxxxxxx
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> ---
>  arch/x86/mm/init_64.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 745e5e1..97fe887 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -94,10 +94,10 @@ __setup("noexec32=", nonx32_setup);
>   */
>  void sync_global_pgds(unsigned long start, unsigned long end)
>  {
> -	unsigned long address;
> +	unsigned long addr;
>
> -	for (address = start; address <= end; address += PGDIR_SIZE) {
> -		pgd_t *pgd_ref = pgd_offset_k(address);
> +	for (addr = start; addr <= end; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
> +		pgd_t *pgd_ref = pgd_offset_k(addr);
>  		const p4d_t *p4d_ref;
>  		struct page *page;
>
> @@ -106,7 +106,7 @@ void sync_global_pgds(unsigned long start, unsigned long end)
>  		 * handle synchonization on p4d level.
>  		 */
>  		BUILD_BUG_ON(pgd_none(*pgd_ref));
> -		p4d_ref = p4d_offset(pgd_ref, address);
> +		p4d_ref = p4d_offset(pgd_ref, addr);
>
>  		if (p4d_none(*p4d_ref))
>  			continue;
> @@ -117,8 +117,8 @@ void sync_global_pgds(unsigned long start, unsigned long end)
>  		p4d_t *p4d;
>  		spinlock_t *pgt_lock;
>
> -		pgd = (pgd_t *)page_address(page) + pgd_index(address);
> -		p4d = p4d_offset(pgd, address);
> +		pgd = (pgd_t *)page_address(page) + pgd_index(addr);
> +		p4d = p4d_offset(pgd, addr);
>  		/* the pgt_lock only for Xen */
>  		pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
>  		spin_lock(pgt_lock);
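For anyone reviewing this for -stable, a quick illustration of the
arithmetic being fixed may help. The following is a standalone userspace
sketch of my own, not kernel code: PGDIR_SHIFT, PGDIR_SIZE and ALIGN
mirror their x86-64 4-level-paging kernel definitions, and the start/end
values are made up so that a 192G region straddles a PGD boundary the way
KASLR can place it.

  #include <stdio.h>

  #define PGDIR_SHIFT  39                     /* x86-64, 4-level paging */
  #define PGDIR_SIZE   (1UL << PGDIR_SHIFT)   /* 512G covered per PGD entry */
  #define ALIGN(x, a)  (((x) + (a) - 1) & ~((a) - 1))

  int main(void)
  {
          /* Made-up 192G region crossing one PGD boundary (at 3 * PGDIR_SIZE). */
          unsigned long start = 3 * PGDIR_SIZE - (64UL << 30);
          unsigned long end   = start + (192UL << 30) - 1;
          unsigned long addr;

          /*
           * Old loop: fixed PGDIR_SIZE stride. Here start + PGDIR_SIZE > end,
           * so only pgd index 2 is visited; the tail of the region, which
           * lives under pgd index 3, is never synced.
           */
          for (addr = start; addr <= end; addr += PGDIR_SIZE)
                  printf("old: pgd index %lu\n", addr >> PGDIR_SHIFT);

          /*
           * Fixed loop: jump to the start of the region the next PGD covers,
           * so every PGD entry the range touches is visited exactly once
           * (indices 2 and 3).
           */
          for (addr = start; addr <= end; addr = ALIGN(addr + 1, PGDIR_SIZE))
                  printf("new: pgd index %lu\n", addr >> PGDIR_SHIFT);

          return 0;
  }

Built with a stock gcc, the old loop prints only pgd index 2 while the
fixed loop prints indices 2 and 3; the skipped second entry corresponds
to the empty PGD Jeff hit.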