On Mon, Aug 25, 2008 at 11:56:11AM -0400, Mike Snitzer wrote:
> On Thu, Jul 24, 2008 at 9:12 PM, Vivek Goyal <vgoyal at redhat.com> wrote:
> > On Thu, Jul 24, 2008 at 03:03:33PM -0400, Mike Snitzer wrote:
> >> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <vgoyal at redhat.com> wrote:
> >> > On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
> >> >> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <agraf at suse.de> wrote:
> >>
> >> >> > As you're stating that the host kernel breaks with kvm modules loaded,
> >> >> > maybe someone there could give a hint.
> >> >>
> >> >> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
> >> >> see how kexec/kdump of the host fares when kvm modules are loaded.
> >> >>
> >> >> On the guest side of things, as I mentioned in my original post,
> >> >> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
> >> >> running 2.6.25.4 (with kvm-70).
> >> >>
> >> >
> >> > Hi Mike,
> >> >
> >> > I have never tried kexec/kdump inside a kvm guest, so I don't know
> >> > whether they have historically worked or not.
> >>
> >> Avi indicated he seems to remember that at least kexec worked the last
> >> time he tried (he didn't say when or what he tried, though).
> >>
> >> > Having said that, why do we need kdump to work inside the guest? In
> >> > this case qemu already knows about the guest kernel's memory and
> >> > should be able to capture a kernel crash dump. I am not sure if qemu
> >> > already does that; if not, then we should probably think about it.
> >> >
> >> > To me, kdump is a good solution for bare metal but not for a
> >> > virtualized environment, where we already have another piece of
> >> > software running which can do the job for us. We will end up wasting
> >> > memory in every guest instance (memory reserved for the kdump kernel
> >> > in every guest).
> >>
> >> I haven't looked into what mechanics qemu provides for collecting the
> >> entire guest memory image; I'll dig deeper at some point. It seems
> >> the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
> >> file for analysis) doesn't support saving a kvm guest core:
> >> # virsh dump guest10 guest10.dump
> >> libvir: error : this function is not supported by the hypervisor:
> >> virDomainCoreDump
> >> error: Failed to core dump domain guest10 to guest10.dump
> >>
> >> It seems that libvirt functionality isn't available yet with kvm (I'm
> >> using libvirt 0.4.2; I'll give libvirt 0.4.4 a try). cc'ing the
> >> libvirt-list to get their insight.
> >>
> >> That aside, having the crash dump collection be multi-phased really
> >> isn't workable (that is, if it requires a crashed guest to be manually
> >> saved after the fact). The host system _could_ be rebooted, thereby
> >> losing the guest's core image. So automating qemu and/or libvirtd to
> >> trigger a dump would seem worthwhile (maybe it's already done?).
> >>
> >
> > That's a good point. Ideally, one would like the dump to be captured
> > automatically if the kernel crashes, followed by a reboot back to the
> > production kernel. I am not sure what we can do to let qemu know about
> > the crash so that it can automatically save the dump.
> >
> > What happens in the case of xen guests? Is the dump captured
> > automatically, or does one have to force the dump capture externally?
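As a side note, and only as a rough, untested sketch of the host-side
route: if the qemu monitor in the build being used exposes the pmemsave
command, the guest's physical memory can be written out from the host
with no in-guest kdump setup at all. The guest name, size and path below
are made-up example values (0x20000000 assumes a 512MB guest):

  (qemu) stop
  (qemu) pmemsave 0 0x20000000 /var/tmp/guest10-mem.raw
  (qemu) cont

The result is a flat physical-memory image rather than an ELF core, so
the crash utility most likely won't read it as-is; it would need to be
converted or inspected with gdb against the guest's vmlinux. The missing
piece is still getting qemu/libvirtd to trigger something like this
automatically when the guest panics.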
> >
> >> So while I agree with you that it's ideal not to have to waste memory
> >> in each guest for the purposes of kdump, if users want to model a
> >> guest image as closely as possible to what will be deployed on bare
> >> metal it really would be ideal to support a 1:1 functional equivalent
> >> with kvm.
> >
> > Agreed. Making kdump work inside a kvm guest does no harm.
> >
> >> I work with people who refuse to use kvm because of the lack of
> >> kexec/kdump support.
> >>
> >
> > Interesting.
> >
> >> I can do further research but welcome others' insight: do others have
> >> advice on how best to collect a crashed kvm guest's core?
> >>
> >> > It will be interesting to look at your results with 2.6.25.x kernels
> >> > with the kvm module inserted. Currently I can't think of what could
> >> > possibly be wrong.
> >>
> >> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
> >> loaded, kexec/kdump does _not_ work (it simply hangs the system). If I
> >> only have the kvm module loaded, kexec/kdump works as expected
> >> (likewise if no kvm modules are loaded at all). So it would appear
> >> that kvm-intel and kexec are definitely mutually exclusive at the
> >> moment (at least on both 2.6.22.x and 2.6.25.x).
> >
> > Ok. So the first task is to fix host kexec/kdump with the kvm-intel
> > module inserted.
> >
> > Can you do a little debugging to find out where the system hangs? I
> > generally try a few things when debugging kexec-related issues.
> >
> > 1. Specify the earlyprintk= parameter for the second kernel and see if
> > control is reaching the second kernel.
> >
> > 2. Otherwise, specify the --console-serial parameter on the "kexec -l"
> > command line and it should display the message "I am in purgatory" on
> > the serial console. That will at least mean that control has reached
> > purgatory.
> >
> > 3. If that also does not work, then most likely the first kernel itself
> > got stuck somewhere and we need to put some printks in the first kernel
> > to find out what's wrong.
>
> Vivek,
>
> I've been unable to put time into chasing this (and I can't see when
> I'll be able to get back to it yet). I hope that others will be willing
> to take a look before me.
>
> The kvm-intel/kexec incompatibility issue is not exclusive to my local
> environment (reproducing it simply needs a cpu that supports kvm-intel).
>

Thanks Mike. Let me see if I get some free cycles to debug it.

Thanks
Vivek
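For anyone picking this up, steps 1 and 2 above amount to roughly the
following on the host. This is only a sketch: the kernel image, initrd,
root device and crashkernel reservation are placeholders that have to
match the local setup, and the panic-load case (-p) used by kdump is
shown, though the same --append/--console-serial options apply to a
plain "kexec -l" test as well.

  # production kernel booted with memory reserved for the capture
  # kernel, e.g. crashkernel=128M@16M on its command line

  # load the capture kernel; earlyprintk= shows whether control reaches
  # the second kernel, --console-serial makes purgatory report on the
  # serial console
  kexec -p /boot/vmlinuz-kdump --initrd=/boot/initrd-kdump.img \
        --console-serial \
        --append="root=/dev/sda1 irqpoll maxcpus=1 console=ttyS0,115200 earlyprintk=serial,ttyS0,115200"

  # trigger a test crash (with sysrq enabled), once with kvm-intel
  # loaded and once without, to compare
  echo c > /proc/sysrq-trigger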