I tried nr_cpus=1, but it didn't help.
I haven't tried the test on bare metal, but I'm going to try it on my laptop later today.
I'll let you know if it works.

Thanks,
Dmitry

-----Original Message-----
From: Vivek Goyal [mailto:vgoyal@xxxxxxxxxx]
Sent: Monday, August 15, 2011 6:41 PM
To: Krivenok, Dmitry
Cc: Kexec Mailing List
Subject: Re: kdump and SMP system kernel

On Mon, Aug 15, 2011 at 10:27:30AM -0400, dmitry.krivenok at emc.com wrote:
> Hello Vivek and Maneesh,
> I've read your document Documentation/kdump/kdump.txt and built system and dump-capture
> kernels with the options mentioned there.
>
> Then I booted the new system kernel and registered a "panic handler" using the following command
> kexec -p /boot/linux-3.0.0-capture --initrd=/boot/initrd-3.0.0-capture --append="root=/dev/mapper/myvg-root 3 irqpoll maxcpus=1 reset_devices"
>
> Finally, I simulated a panic using
> echo c > /proc/sysrq-trigger
>
> Unfortunately, the dump-capture kernel wasn't functional (it was booting very slowly, I saw lots of messages
> like "ata2: lost interrupt", my keyboard didn't work at all and I couldn't access the system via the network).
>
> I investigated this problem and tried lots of combinations of boot parameters for dump-capture kernel, but
> nothing helped. Then I tried to tune boot parameters of system kernel and found that if I specify "maxcpus=1"
> for system kernel, then dump-capture kernel always boots successfully and I have access to correct /proc/vmcore.
>
> The problem is that I'm debugging a problem which only occurs on SMP kernel and I never see it on the kernel
> booted with "maxcpus=1".
>
> So I just want to clarify - is it possible to use kexec/kdump with SMP system kernel?
> Is it intended to work at all?
>
> Thanks in advance!
>
> P.S.
> I'm using Arch Linux with vanilla kernel 3.0.0 and kexec-tools 2.0.2-3 running in VM on VmWare ESX server.

Yes it is supposed to work on SMP machines.
maxcpus=1 in the second kernel makes sure that it brings up only the CPU we crashed on. You can also try using nr_cpus=1 on recent kernels.

It sounds like an issue with disk driver initialization, and it could have something to do with the hypervisor as well. Not sure. Does it work on bare metal?

P.S. Maneesh is no longer with IBM, so the above id is not valid. I am not sure what the new id is.

You can also copy the kexec-tools mailing list on issues like this. I am CCing the list now.

Thanks
Vivek
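
For anyone following the thread, here is a minimal sketch of the load command with the CPU-limiting parameter made switchable, so that maxcpus=1 and nr_cpus=1 can be tried in turn. The kernel, initrd and root device paths are the ones quoted in the original message; the helper name build_kdump_cmd is hypothetical, and the script only prints the command (a dry run) rather than running kexec.

```shell
#!/bin/sh
# Sketch only: assemble the "kexec -p" invocation quoted in the thread.
# build_kdump_cmd is a hypothetical helper, not part of kexec-tools.
# $1 is the CPU-limiting parameter for the capture kernel,
# e.g. "maxcpus=1" or "nr_cpus=1".
build_kdump_cmd() {
    cpu_param="$1"
    printf '%s\n' "kexec -p /boot/linux-3.0.0-capture --initrd=/boot/initrd-3.0.0-capture --append=\"root=/dev/mapper/myvg-root 3 irqpoll ${cpu_param} reset_devices\""
}

# Dry run: print the command instead of executing it.
build_kdump_cmd "nr_cpus=1"

# After really loading the capture kernel as root, this sysfs flag reads 1
# when a crash kernel is registered:
#   cat /sys/kernel/kexec_crash_loaded
```

Printing the command first makes it easy to compare the exact --append string between attempts before triggering a test panic.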