On 2012-05-21 01:43, Avi Kivity wrote:
> On 05/16/2012 10:50 AM, zhangyanfei wrote:
>> This patch set exports offsets of VMCS fields as note information for
>> kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve
>> runtime state of guest machine image, such as registers, in host
>> machine's crash dump as VMCS format. The problem is that VMCS internal
>> is hidden by Intel in its specification. So, we solve this problem
>> by reverse engineering implemented in this patch set. The VMCSINFO
>> is exported via sysfs to kexec-tools just like VMCOREINFO.
>>
>> Here are two use cases for two features that we want.
>>
>> 1) Create guest machine's crash dumpfile from host machine's crash dumpfile
>>
>> In general, we want to use this feature on failure analysis for the system
>> where the processing depends on the communication between host and guest
>> machines to look into the system from both machines' viewpoints.
>>
>> As a concrete situation, consider where there's heartbeat monitoring
>> feature on the guest machine's side, where we need to determine in
>> which machine side the cause of heartbeat stop lies. In our actual
>> experiments, we encountered such situation and we found the cause of
>> the bug was in host's process scheduler so guest machine's vcpu stopped
>> for a long time and then led to heartbeat stop.
>>
>> The module that judges heartbeat stop is on guest machine, so we need
>> to debug guest machine's data. But if the cause lies in host machine
>> side, we need to look into host machine's crash dump.
>
> Do you mean, that a heartbeat failure in the guest lead to host panic?
>
> My expectation is that a problem in the guest will cause the guest to
> panic and perhaps produce a dump; the host will remain up.
>

The point is that, before our investigation, we did not know which side
caused the buggy situation. A bug in either the host machine or the guest
machine itself could have caused the heartbeat failure.
So we want to get both the host machine's crash dump and the guest
machine's crash dump *at the same time*. Then we can use userspace tools
to extract the guest machine's crash dump from the host machine's and
analyse them separately to find which side causes the problem.

>> Without this feature, we first create guest machine's dump and then
>> create host machine's, but there's only a short time between two
>> processings, during which it's unlikely that buggy situation remains.
>>
>> So, we think the feature is useful to debug both guest machine's and
>> host machine's sides at the same time, and expect we can make failure
>> analysis efficiently.
>>
>> Of course, we believe this feature is commonly useful on the situation
>> where guest machine doesn't work well due to something of host machine's.
>>
>> 2) Get offsets of VMCS information on the CPU running on the host machine
>>
>> If kdump doesn't work well, then it means we cannot use kvm API to get
>> register values of guest machine and they are still left on its vmcs
>> region. In the case, we use crash dump mechanism running outside of
>> linux kernel, such as sadump, a firmware-based crash dump. VMCS
>> information is then necessary.
>
> Shouldn't sadump then expose the VMCS offsets? Perhaps bundling them
> into its dump file?
>

A firmware-based crash dump is not concerned with the OS running on the
machine, so it does not do any OS-specific handling when the machine
crashes.

Thanks
Zhang Yanfei