Don Zickus <dzickus at redhat.com> writes:

> On Tue, Feb 07, 2012 at 03:35:59PM -0800, Eric W. Biederman wrote:
>> Vivek Goyal <vgoyal at redhat.com> writes:
>>
>> > On Tue, Feb 07, 2012 at 04:57:41PM -0500, Don Zickus wrote:
>> >> On Thu, Feb 02, 2012 at 03:24:46PM -0800, Eric W. Biederman wrote:
>> >> > > Eric, brought up a point that because the boot code was restructured we may
>> >> > > not need to disable the io apic any more in the crash path.  The original
>> >> > > concern that led to the development of disable_IO_APIC, was that the TSC
>> >> > > calibration on boot up relied on the PIT timer for reference.  Access
>> >> > > to the PIT required 8259 interrupts to be working.  This wouldn't work
>> >> > > if the ioapic needed to be configured.  So on panic path, the ioapic was
>> >> > > reconfigured to use virtual wire mode to allow the 8259 to passthrough.
>> >> >
>> >> > A small clarification: originally it was the jiffies calibration that
>> >> > would fail if we could not cause the PIT to generate interrupts through
>> >> > the 8259.  The boot would then hang at calibrating jiffies.
>> >>
>> >> Ok. Thanks!
>> >
>> > So now what has changed? Do we setup LAPIC and IOAPIC early enough to
>> > receive PIT interrupts in regular mode (non-virtual wire mode) or
>> > something else?
>>
>> Yes.  Part of the Moorestown work required that this be done because
>> Moorestown did not support legacy mode.  Last I looked the code hadn't
>> been generalized beyond Moorestown but empirically it works now.
>>
>> Don, as to what to test, the only case I can think of that might be spooky
>> is a screaming interrupt during the handover.  You might want to try
>> playing with lkdtm to try some of the more exotic crash scenarios.  But
>> all I expect further testing might reveal are places where we are not
>> as robust in initializing the hardware as we might be.  Things that
>> might have been papered over by the ioapic shutdown.
>
> I ran lkdtm by panic'ing in the interrupt handler, thus leaving the device
> interrupt un-ack'd, and the apic might have been un-ack'd too (jprobes
> hooked in at do_IRQ).  3 out of 3 times the second kernel came up on my
> core2 quad.

That sounds like more than enough basic testing for me.

Document your testing in a patch description and let's get the
unnecessary local apic and ioapic stomping removed from the kexec on
panic path.

"There were bugs.  We deleted the code that had them.  The bugs are gone
and there are no new problems" goes over very well in my book.

Eric