On Wed, Feb 10, 2021 at 01:57:42PM -0500, John David Anglin wrote: > On 2021-02-10 12:23 p.m., Helge Deller wrote: > > On 2/10/21 3:52 PM, Guenter Roeck wrote: > >> On Wed, Jan 27, 2021 at 10:18:51PM +0100, Helge Deller wrote: > >>> On parisc a spinlock is stored in the next page behind the pgd which > >>> protects against parallel accesses to the pgd. That's why one additional > >>> page (PGD_ALLOC_ORDER) is allocated for the pgd. > >>> > >>> Matthew Wilcox suggested that we instead should use a pointer in the > >>> struct page table for this spinlock and noted, that the comments for the > >>> PGD_ORDER and PMD_ORDER defines were wrong. > >>> > >>> Both suggestions are addressed in this patch. The pgd spinlock > >>> (parisc_pgd_lock) is stored in the struct page table. In > >>> switch_mm_irqs_off() the physical address of this lock is loaded into > >>> cr28 (tr4) and the pgd into cr25, so that the fault handlers can > >>> directly access the lock. > >>> > >>> The currently implemened Hybrid L2/L3 page table scheme (where the pmd > >>> is adjacent to the pgd) is dropped now too. > >>> > >>> Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx> > >>> Fixes: b37d1c1898b2 ("parisc: Use per-pagetable spinlock") > >>> Signed-off-by: Helge Deller <deller@xxxxxx> > >>> Signed-off-by: John David Anglin <dave.anglin@xxxxxxxx> > >> > >> This patch results in: > >> > >> BUG: spinlock recursion on CPU#0, swapper/0/1 > >> lock: 0x12226d14, .magic: dead4ead, .owner: swapper/0/1, .owner_cpu: 0 > >> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.11.0-rc7-next-20210209-32bit #1 > >> Hardware name: 9000/778/B160L > >> Backtrace: > >> [<1019f9bc>] show_stack+0x34/0x48 > >> [<10a65278>] dump_stack+0x94/0x114 > >> [<10219f4c>] spin_dump+0x8c/0xb8 > >> [<1021a0b4>] do_raw_spin_lock+0xdc/0x108 > >> [<10a7367c>] _raw_spin_lock_irqsave+0x30/0x48 > >> [<102bf41c>] handle_mm_fault+0x5e8/0xdb0 > >> [<102b813c>] __get_user_pages.part.0+0x1b0/0x3d4 > >> [<102b8900>] __get_user_pages_remote+0x134/0x34c > >> [<102b8b80>] get_user_pages_remote+0x68/0x90 > >> [<102fccb0>] get_arg_page+0x94/0xd8 > >> [<102fdd84>] copy_string_kernel+0xc4/0x234 > >> [<102fe70c>] kernel_execve+0xcc/0x1a4 > >> [<10a58d94>] run_init_process+0xbc/0xe0 > >> [<10a70d50>] kernel_init+0x98/0x13c > >> [<1019a01c>] ret_from_kernel_thread+0x1c/0x24 > >> > >> when trying to boot parisc/hppa images in qemu. Reverting this patch fixes > >> the problem. > > > > True, I can reproduce the problem. > > With CONFIG_DEBUG_SPINLOCK=y you get the backtrace above. > > With CONFIG_DEBUG_SPINLOCK=n it just hangs. > > Happenes with 32-bit kernel with SMP kernel, even if only one virtual CPU is started. > Which is quite puzzling since most spin locks are optimized in this case case. Strikes me we > have a lock that's not initialized. > > It's not obvious how this relates to the patch. get_arg_page() calls get_user_pages_remote() with The fact that reverting it fixes the problem is a good indication that the problem does relate to this patch. As for how - no idea. That is not my area of expertise. Thanks, Guenter