Re: linux-next: Tree for Jan 20 -- sparc32: fix broken set_pte()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 21, 2015 at 07:14:33PM -0800, Guenter Roeck wrote:
> On 01/21/2015 02:43 AM, Kirill A. Shutemov wrote:
> 
> >>BUG: Bad page state in process init.sh  pfn:00000
> >>page:f05e7460 count:0 mapcount:-1 mapping:  (null) index:0x0
> >>flags: 0x400(reserved)
> >>page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> >>bad because of flags:
> >>flags: 0x400(reserved)
> >>CPU: 0 PID: 1 Comm: init.sh Not tainted 3.19.0-rc5-next-20150120 #1
> >>[f0076010 : bad_page+0xdc/0xfc ] [f00760c0 : free_pages_prepare+0x90/0x1f8 ] [f00775cc : free_hot_cold_page+0x20/0x160 ] [f00919e8 : do_wp_page+0x680/0x6ac ] [f00939f4 : handle_mm_fault+0xc94/0xd08 ] [f0015900 : do_sparc_fault+0xfc/0x3ec ] [f000af90 : srmmu_fault+0x58/0x68 ] [f00e74f4 : load_elf_binary+0x9a8/0xe94 ] [f00b0cac : search_binary_handler+0x68/0x12c ] [f00e67d0 : load_script+0x214/0x224 ] [f00b0cac : search_binary_handler+0x68/0x12c ] [f00b11a4 : do_execveat_common+0x434/0x584 ] [f00b1310 : do_execve+0x1c/0x2c ] [f02b50b0 : kernel_init+0x70/0xf0 ] [f000b200 : ret_from_kernel_thread+0xc/0x38 ] [00000000 :   (null) ]
> >>Disabling lock debugging due to kernel taint
> >>: applet not found
> >>Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
> >
> >It doesn't make much sense to me. It tries to free page with pfn==0 on
> >handling wp-fault. How it got mapped in the first place?
> >
> If I comment out the added call to vm_normal_page(), the code works fine.
> If the call to vm_local_page() is there but everything else from your patch
> is commented out, the crash occurs. Also, any log message added to the new
> code patch (inside the if statements) is not getting printed, meaning the
> new code (besides the call to vm_local_page) is not reached.
> 
> I guess that means that something in the call to vm_normal_page() appears
> to go wrong. No idea what that might be, though.

vm_normal_page() is never called in this case, since prot_numa is always
zero.

I tracked the bug down. It's a sparc bug. The commit only triggers it,
because affect how GCC optimize the code around faulty point.

Please, test.

>From 5b9232753217412116a4cdc2897be0db818371ca Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Date: Thu, 22 Jan 2015 18:42:13 +0200
Subject: [PATCH] sparc32: fix broken set_pte()

32-bit sparc uses swap instruction to implement set_pte(). It called
using GCC inline assembler. But it misses the "memory" clobber to
indicate that pte value will be updated in memory.

As result GCC doesn't know that it cannot postpone pte pointer
dereference which occurs before set_pte() to post-set_pte() time.

It leads to real-world bugs -- [1]. In this situation we have code:

	ptent = ptep_modify_prot_start(mm, addr, pte);
	ptent = pte_modify(ptent, newprot);
	...
	ptep_modify_prot_commit(mm, addr, pte, ptent);

ptep_modify_prot_start() in sparc case is just 'pte' dereference plus
pte_clear(). pte_clear() calls broken set_pte(). GCC thinks it's valid
to dereference 'pte' again on pte_modify() and gets cleared pte.
ptep_modify_prot_commit() puts 'pteent' with pfn==0 back to page table,
which eventually leads to the crash.

[1] http://lkml.kernel.org/r/54C06B19.8060305@xxxxxxxxxxxx

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
---
 arch/sparc/include/asm/pgtable_32.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h
index b2f7dc46a7d1..9912eb0b499a 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -102,7 +102,8 @@ extern unsigned long empty_zero_page;
  */
 static inline unsigned long srmmu_swap(unsigned long *addr, unsigned long value)
 {
-	__asm__ __volatile__("swap [%2], %0" : "=&r" (value) : "0" (value), "r" (addr));
+	__asm__ __volatile__("swap [%2], %0" :
+			"=&r" (value) : "0" (value), "r" (addr) : "memory");
 	return value;
 }
 
-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-next" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Linux USB Development]     [Yosemite News]     [Linux SCSI]

  Powered by Linux