On Fri, Jun 24, 2016 at 1:51 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> On Fri, Jun 24, 2016 at 03:25:30PM -0500, Josh Poimboeuf wrote:
>> On Fri, Jun 24, 2016 at 11:11:47AM -0700, Linus Torvalds wrote:
>> > On Fri, Jun 24, 2016 at 10:51 AM, Linus Torvalds
>> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> > >
>> > > And in particular, the init_task stack initialization initialized it
>> > > to the init_thread pointer. Which was definitely deadly.
>> > >
>> > > Let's see if that was it..
>> >
>> > No, it's still broken. But it's *less* broken, so here's a new version
>> > of the patch that at least gets some of the stack setup right, in my
>> > hope that somebody will bother to look at this, and being less broken
>> > might mean that somebody sees what else I missed..
>>
>> I found at least one bug. The changing of task->stack from a "void *" to an
>> "unsigned long *":
>>
>> > - void *stack;
>> > + unsigned long *stack;
>>
>> That subtly changes the pointer arithmetic in do_boot_cpu():
>>
>>
>>   idle->thread.sp = (unsigned long) (((struct pt_regs *)
>>                     (THREAD_SIZE + task_stack_page(idle))) - 1);
>>
>>
>> That ends up adding 128k to the stack page bottom instead of 16k.
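
For anyone following along, the 16k vs. 128k difference can be reproduced in user space. This is only a rough sketch, not kernel code: it assumes THREAD_SIZE is 16k as quoted above, the function names are made up for illustration, and "char *" stands in for the byte-sized "void *" arithmetic that GNU C allows.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: a 16k stack, as in the quoted mail. */
#define THREAD_SIZE ((size_t)16 * 1024)

/* Backing storage large enough for both pointer computations below. */
static unsigned long stack_area[THREAD_SIZE];

/* Old behaviour: the base is a byte-sized pointer, so adding
 * THREAD_SIZE moves the address by THREAD_SIZE bytes. */
size_t offset_with_byte_pointer(void)
{
	char *base = (char *)stack_area;
	char *top = base + THREAD_SIZE;

	return (size_t)(top - base);
}

/* New behaviour: the base is an "unsigned long *", so adding
 * THREAD_SIZE moves the address by THREAD_SIZE elements, i.e.
 * THREAD_SIZE * sizeof(unsigned long) bytes. */
size_t offset_with_ulong_pointer(void)
{
	unsigned long *base = stack_area;
	unsigned long *top = base + THREAD_SIZE; /* one past the end */

	return (size_t)((char *)top - (char *)base);
}

/* Hypothetical helper that states the expected relationship. */
void check_offsets(void)
{
	assert(offset_with_byte_pointer() == THREAD_SIZE);
	assert(offset_with_ulong_pointer() ==
	       THREAD_SIZE * sizeof(unsigned long));
}
```

With an 8-byte unsigned long, as on x86-64, the second offset is 8 * 16k = 128k, which matches the numbers above.
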
>>
>> But fixing that doesn't seem to fix this:
>>
>> [18446743832.576241] ------------[ cut here ]------------
>> [18446743832.576241] WARNING: CPU: 1 PID: 0 at /home/jpoimboe/git/linux/arch/x86/kernel/cpu/common.c:1434 cpu_init+0x34b/0x440
>> [18446743832.576241] Modules linked in:
>> [18446743832.576241] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc4+ #47
>> [18446743832.576241] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
>> [18446743832.576241]  0000000000000086 574e5e6c6855ace9 ffff88007c553e88 ffffffff8143cb83
>> [18446743832.576241]  0000000000000000 0000000000000000 ffff88007c553ec8 ffffffff810b0e7b
>> [18446743832.576241]  0000059a00000000 0000000000000000 0000000000000000 0000000000000000
>> [18446743832.576241] Call Trace:
>> [18446743832.576241]  [<ffffffff8143cb83>] dump_stack+0x85/0xc2
>> [18446743832.576241]  [<ffffffff810b0e7b>] __warn+0xcb/0xf0
>> [18446743832.576241]  [<ffffffff810b0fad>] warn_slowpath_null+0x1d/0x20
>> [18446743832.576241]  [<ffffffff810491bb>] cpu_init+0x34b/0x440
>> [18446743832.576241]  [<ffffffff8105ab7c>] start_secondary+0x1c/0x1a0
>> [18446743832.576241] ---[ end trace 924d57afbaca0720 ]---
>>
>> So there's at least another bug lurking..
>
> Found another bug:
>
>   #define stack_smp_processor_id() \
>   ({ \
>       struct thread_info *ti; \
>       __asm__("andq %%rsp,%0; ":"=r" (ti) : "0" (CURRENT_MASK)); \
>       ti->cpu; \
>   })
>
> That macro is obviously no longer valid.
>
> That seems to cause the above warning. When trying to boot CPU 1,
> cpu_init() calls the above macro which incorrectly returns 0.
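
For reference, the masking trick in that macro can be sketched in user space. This is not the kernel's code, only an illustration under assumptions: THREAD_SIZE is 16k, CURRENT_MASK is ~(THREAD_SIZE - 1), struct thread_info is cut down to a single field, and the function names are hypothetical. In this sketch the lookup returns 0 simply because the stack base is zero-filled. What actually sits at the base of the kernel stack once thread_info moves is not stated in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumptions for this sketch, not taken from the kernel headers. */
#define THREAD_SIZE  ((uintptr_t)16 * 1024)
#define CURRENT_MASK (~(THREAD_SIZE - 1))

/* Hypothetical, reduced thread_info: only the field the macro reads. */
struct thread_info {
	int cpu;
};

/* Twice the stack size, so a THREAD_SIZE-aligned stack fits inside. */
static unsigned char pool[2 * 16 * 1024];

/* Same idea as the quoted macro: round an address inside the stack
 * down to the stack base and read a thread_info from there. */
static int cpu_from_stack_address(const void *sp)
{
	struct thread_info ti;
	uintptr_t base = (uintptr_t)sp & CURRENT_MASK;

	memcpy(&ti, (const void *)base, sizeof(ti));
	return ti.cpu;
}

/* Report the CPU that the masking trick sees for a task running on
 * `cpu`. With ti_at_base != 0 the thread_info lives at the stack
 * base (old layout); otherwise it lives elsewhere and the base only
 * holds zeroes. */
int lookup_cpu(int cpu, int ti_at_base)
{
	struct thread_info ti = { .cpu = cpu };
	unsigned char *stack = (unsigned char *)
		(((uintptr_t)pool + THREAD_SIZE - 1) & CURRENT_MASK);

	assert(((uintptr_t)stack & (THREAD_SIZE - 1)) == 0);
	memset(stack, 0, THREAD_SIZE);
	if (ti_at_base)
		memcpy(stack, &ti, sizeof(ti));

	/* Pretend the stack pointer is somewhere near the top. */
	return cpu_from_stack_address(stack + THREAD_SIZE - 64);
}
```

Here lookup_cpu(1, 1) gives 1, while lookup_cpu(1, 0) gives 0: the mask still finds the stack base, but the data it expects is no longer there.
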

Fixed in my queue by removing the function:

https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=01b1a4b6fd629820625b64ca6e17c987f2ee8c09

--
Andy Lutomirski
AMA Capital Management, LLC