On Thu, Jan 08, 2015 at 02:36:38PM +0800, Jiang Liu wrote:
> On 2015/1/7 23:44, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jan 07, 2015 at 11:37:52PM +0800, Jiang Liu wrote:
> >> On 2015/1/7 22:50, Konrad Rzeszutek Wilk wrote:
> >>> On Wed, Jan 07, 2015 at 02:13:49PM +0800, Jiang Liu wrote:
> >>>> Commit b81975eade8c ("x86, irq: Clean up irqdomain transition code")
> >>>> breaks xen IRQ allocation because xen_smp_prepare_cpus() doesn't invoke
> >>>> setup_IO_APIC(), so no irqdomains created for IOAPICs and
> >>>> mp_map_pin_to_irq() fails at the very beginning.
> >>>> --- a/arch/x86/kernel/apic/io_apic.c
> >>>> +++ b/arch/x86/kernel/apic/io_apic.c
> >>>> @@ -2369,31 +2369,29 @@ static void ioapic_destroy_irqdomain(int idx)
> >>>>  	ioapics[idx].pin_info = NULL;
> >>>>  }
> >>>>
> >>>> -void __init setup_IO_APIC(void)
> >>>> +void __init setup_IO_APIC(bool xen_smp)
> >>>>  {
> >>>>  	int ioapic;
> >>>>
> >>>> -	/*
> >>>> -	 * calling enable_IO_APIC() is moved to setup_local_APIC for BP
> >>>> -	 */
> >>>> -	io_apic_irqs = nr_legacy_irqs() ? ~PIC_IRQS : ~0UL;
> >>>> +	if (!xen_smp) {
> >>>> +		apic_printk(APIC_VERBOSE, "ENABLING IO-APIC IRQs\n");
> >>>> +		io_apic_irqs = nr_legacy_irqs() ? ~PIC_IRQS : ~0UL;
> >>>> +
> >>>> +		/* Set up IO-APIC IRQ routing. */
> >>>> +		x86_init.mpparse.setup_ioapic_ids();
> >>>> +		sync_Arb_IDs();
> >>>> +	}
> Hi Konrad,
> 	Enabling above code for Xen dom0 will cause following warning
> because it writes a special value to ICR register.
> [    3.394981] ------------[ cut here ]------------
> [    3.394985] WARNING: CPU: 0 PID: 1 at arch/x86/xen/enlighten.c:968 xen_apic_write+0x15/0x20()
> [    3.394988] Modules linked in:
> [    3.394991] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-rc3+ #5
> [    3.394993] Hardware name: Dell Inc. OptiPlex 9020/0DNKMN, BIOS A03 09/17/2013
> [    3.394996]  00000000000003c8 ffff88003056bdd8 ffffffff817611bb 00000000000003c8
> [    3.395000]  0000000000000000 ffff88003056be18 ffffffff8106f4ea 0000000000000008
> [    3.395004]  ffffffff81fc1120 ffff880030561348 000000000000a108 000000000000a101
> [    3.395008] Call Trace:
> [    3.395012]  [<ffffffff817611bb>] dump_stack+0x4f/0x6c
> [    3.395015]  [<ffffffff8106f4ea>] warn_slowpath_common+0xaa/0xd0
> [    3.395018]  [<ffffffff8106f525>] warn_slowpath_null+0x15/0x20
> [    3.395021]  [<ffffffff81003e25>] xen_apic_write+0x15/0x20
> [    3.395026]  [<ffffffff81ef606b>] sync_Arb_IDs+0x84/0x86
> [    3.395029]  [<ffffffff81ef7f7a>] setup_IO_APIC+0x7f/0x8e3
> [    3.395033]  [<ffffffff810b275d>] ? trace_hardirqs_on+0xd/0x10
> [    3.395036]  [<ffffffff8176858a>] ? _raw_spin_unlock_irqrestore+0x8a/0xa0
> [    3.395040]  [<ffffffff81ee841b>] xen_smp_prepare_cpus+0x5d/0x184
> [    3.395044]  [<ffffffff81ee1ba3>] kernel_init_freeable+0x149/0x293
> [    3.395047]  [<ffffffff81758d49>] ? kernel_init+0x9/0xf0
> [    3.395049]  [<ffffffff81758d40>] ? rest_init+0xd0/0xd0
> [    3.395052]  [<ffffffff81758d49>] kernel_init+0x9/0xf0
> [    3.395054]  [<ffffffff8176887c>] ret_from_fork+0x7c/0xb0
> [    3.395057]  [<ffffffff81758d40>] ? rest_init+0xd0/0xd0
> [    3.395066] ---[ end trace 7c4371c8ba33d5d0 ]---
>

<snip>

> >>>>  	ioapic_initialized = 1;
> >>>> +
> >>>> +	if (!xen_smp) {
> >>>> +		init_IO_APIC_traps();
> >>>> +		if (nr_legacy_irqs())
> >>>> +			check_timer();
> >>>> +	}
> >>>>  }
> And enabling above code causes Xen dom0 reboots.

Which is due to 'check_timer' trying to set up its timer and failing,
then moving the legacy_pic under its feet to the NULL one, and then
hitting a panic.

The 'check_timer' has the logic to swap the 'legacy_pic':

	legacy_pic->init(1);

which ends up executing:

	new_val = inb(PIC_MASTER_IMR);
	if (new_val != probe_val) {
		printk(KERN_INFO "Using NULL legacy PIC\n");
		legacy_pic = &null_legacy_pic;
		raw_spin_unlock_irqrestore(&i8259A_lock, flags);
		return;
	}

And the 'legacy_pic' has now been swapped over to the 'null_legacy_pic',
for which:

	if (nr_legacy_irqs())
		check_timer();

	static inline int nr_legacy_irqs(void)
	{
		return legacy_pic->nr_legacy_irqs;
	}

would return zero (and not invoke 'check_timer'). But the swap happens
inside 'check_timer', after that check has already passed, so we
continue on.

Perhaps something like this?

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 3f5f604..e474389 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2184,6 +2184,14 @@ static inline void __init check_timer(void)
 	 */
 	apic_write(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_EXTINT);
 	legacy_pic->init(1);
+	/*
+	 * The init swapped out the legacy_pic to point to the NULL one.
+	 * As such we should not even have entered this init routine
+	 * (which depends on ->nr_legacy_irqs having a non-zero value,
+	 * and null_legacy_pic has zero).
+	 */
+	if (legacy_pic == &null_legacy_pic)
+		goto out;
 
 	pin1 = find_isa_irq_pin(0, mp_INT);
 	apic1 = find_isa_irq_apic(0, mp_INT);
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 4c071ae..9f404df 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -327,6 +327,7 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
 		xen_raw_printk(m);
 		panic(m);
 	}
+	setup_IO_APIC();
 	xen_init_lock_cpu(0);
 
 	smp_store_boot_cpu_info();

The patch of course ignores the WARN, which would need something else.

> Haven't test HVM and PV kernel yet.
> 	So seems we still need special treatment for xen here.
> Regards!
> Gerry
>
> >>>>
> >>>>  /*
> >>>> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> >>>> index 4c071aeb8417..7eb0283901fa 100644
> >>>> --- a/arch/x86/xen/smp.c
> >>>> +++ b/arch/x86/xen/smp.c
> >>>> @@ -326,7 +326,10 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
> >>>>
> >>>>  		xen_raw_printk(m);
> >>>>  		panic(m);
> >>>> +	} else {
> >>>> +		setup_IO_APIC(true);
> >>>>  	}
> >>>> +
> >>>>  	xen_init_lock_cpu(0);
> >>>>
> >>>>  	smp_store_boot_cpu_info();
> >>>> --
> >>>> 1.7.10.4
> >>>>
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >>> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>> Please read the FAQ at http://www.tux.org/lkml/
> >>>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
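
The ordering problem described in the reply above can be modelled in
user space. What follows is only a sketch under stated assumptions, not
kernel code: struct legacy_pic is cut down to two fields, probing_init()
stands in for a PIC probe that fails, and check_timer_model(),
reset_model() and the 'recheck' flag are hypothetical names made up for
illustration. It shows that a guard on nr_legacy_irqs() taken before the
call goes stale once init() swaps legacy_pic, and that comparing the
pointer again after init() lets the routine bail out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for the kernel's struct legacy_pic. */
struct legacy_pic {
	int nr_legacy_irqs;
	void (*init)(int auto_eoi);
};

static void null_init(int auto_eoi);
static void probing_init(int auto_eoi);

static struct legacy_pic null_legacy_pic = { 0, null_init };
static struct legacy_pic default_legacy_pic = { 16, probing_init };
static struct legacy_pic *legacy_pic = &default_legacy_pic;

static void null_init(int auto_eoi)
{
	(void)auto_eoi;
}

/* Models a failed probe: no usable 8259A, so fall back to the NULL PIC. */
static void probing_init(int auto_eoi)
{
	(void)auto_eoi;
	legacy_pic = &null_legacy_pic;
}

int nr_legacy_irqs(void)
{
	return legacy_pic->nr_legacy_irqs;
}

/* Hypothetical helper: put the model back into its boot-time state. */
void reset_model(void)
{
	legacy_pic = &default_legacy_pic;
}

/*
 * Model of check_timer(). Returns true if it would go on to program the
 * timer pin, false if it bails out early. 'recheck' selects whether the
 * pointer is compared against the NULL PIC after init(), as in the
 * proposed hunk.
 */
bool check_timer_model(bool recheck)
{
	assert(legacy_pic != NULL && legacy_pic->init != NULL);
	legacy_pic->init(1);	/* may swap legacy_pic under our feet */
	if (recheck && legacy_pic == &null_legacy_pic)
		return false;
	return true;
}
```

Starting from reset_model(), nr_legacy_irqs() is non-zero, so the caller's
guard passes. check_timer_model(false) then carries on even though
nr_legacy_irqs() has dropped to zero by the time init() returns, while
check_timer_model(true) bails out.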