On Wed, May 02, 2012 at 12:39:06PM -0700, Eric W. Biederman wrote:
> Seiji Aguchi <seiji.aguchi at hds.com> writes:
>
> >> Perhaps calling setup_IO_APIC before setup_local_APIC would be a better fix?
> >
> > I checked Intel developer's manual and there is no restriction about the order of enabling IO_APIC/local_APIC.
> > So, it may work.
> >
> > But, I don't understand why we have to change the stable boot-up code.
>
> Because the boot-up code is buggy.  We need to get a better handle on
> how it is buggy but apparently an interrupt coming in at the wrong
> moment while booting with interrupts on the interrupt flag on the cpus
> disabled puts us in a state where we fail to boot.
>
> We should be able to boot with apics enabled, and we almost can;
> empirically there are a few bugs.
>
> The kdump path is particularly good at finding bugs.
>
> > If kdump disables both local_apic and IO_APIC in proper way in 1st kernel, 2nd kernel works without any change.
>
> We can not guarantee disabling the local apics in the first kernel.
>
> Ultimately the less we do in the first kernel the more reliable kdump is
> going to be.  Disabling the apics has been a long standing bug work
> around.
>
> At worst we may have been a smidge premature in assuming the
> kernel can boot with the apics enabled but I would hope we can
> track down and fix the boot up code.
>
> Probably what we want to do is not to disable the I/O apics but
> to program the I/O apics before we enable the local apic so that
> we have control of the incoming interrupts.  But I haven't
> looked at this in nearly enough detail to even guess what needs
> to happen.

Hi Eric,

Thanks for the info.  I don't have a problem with what you say above; I
think that is a noble effort worth pursuing.  From a high level
perspective, I am trying to understand how that is supposed to be achieved.
Getting the code to match the theory is probably easier than throwing
random patches/hacks at various kdump problems as they arise.

So can I understand what your thoughts are?  Are you expecting the
following?

first kernel:

  panic
  disable other cpus
  setup 2nd kernel jumptables
  disable panic cpu interrupts
  idt/gdt settings??
  jump to purgatory (this leaves apics and virt stuff untouched?)

  (I am ignoring nmi/mce/faults and other exceptions for now)

purgatory stuff...

2nd kernel:

  normal early boot stuff
  setup memory
  setup scheduler
  ...
  program ioapic/lapic??
    #currently this is done _after_ boot cpu interrupts are enabled
    #which seems problematic if you have leftover screaming interrupts
    #probably a reason for this like timers or something
  enable boot cpu interrupts
  setup boot cpu
  setup other cpus
  ....

Cheers,
Don