On Wed, Aug 21, 2013 at 05:32:27PM +0200, Paolo Bonzini wrote:
> On 21/08/2013 17:23, Eric Blake wrote:
> >> Upon learning of a panic, management (if configured to do so) can pick a
> >> variety of behaviors: leave the VM paused, reset it, destroy it. In
> >> addition to all of these behaviors, it is possible dumping the VM core
> >> from the host.
> >
> > s/possible dumping/possible to dump/
> >
> > and yes, libvirt wants to do just that, as one of its <on_crash>
> > mappings, since it could do the same for Xen.
> >
> >>
> >> However, right now, the panicked state is irreversible, and can only be
> >> exited by resetting the machine. This means that any policy decision
> >> is entirely in the hands of the host. In particular there is no way to
> >> use the "reboot on panic" option together with pvpanic.
> >>
> >> This patch makes the panicked state reversible (and removes various
> >> workarounds that were there because of the state being irreversible).
> >> With this change, management has a wider set of possible policies: it
> >> can just log the crash and leave policy to the guest, it can leave the
> >> VM paused. In particular, the "log the crash and continue" is implemented
> >> simply by sending a "cont" as soon as management learns about the panic.
> >> Management could also implement the "irreversible paused state" itself.
> >> And again, all such actions can be coupled with dumping the VM core.
> >
> > Yes, this makes sense.
> >
> >>
> >> Unfortunately we cannot change the behavior of 1.6.0. Thus, even if
> >> it uses "-device pvpanic", management should check for "cont" failures.
> >> If "cont" fails, management can then log that the VM remained paused
> >> and urge the administrator to update QEMU.
> >
> > Is that the best we can do? Is there any sort of QMP introspection that
> > libvirt can do, where we can know UP FRONT what level of panic support
> > is provided by the qemu binary and the machine type being run in that
> > binary?
>
> No, this is not possible unfortunately. The only possibility that comes
> to mind would be to rename the pvpanic device, e.g. to "isa-pvpanic",
> and forget about "-device pvpanic" on 1.6.x. A hack, I know.
>
> To support 1.5, libvirt should simply be ready to react to unanticipated
> GUEST_PANICKED events. reboot-on-panic will simply be broken for 1.5
> and Linux 3.10+ guests. :(

Let's just fix the bugs in 1.6.X. I don't think libvirt needs to work
around all qemu bugs.

For 1.5.X it might be possible to backport -device pvpanic there.
We need to make sure cross-version migration works.

> >> +++ b/vl.c
> >> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
> >>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
> >>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
> >>
> >> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
> >> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
> >
> > Is 'cont' the only viable way to escape PANICKED, or is it also
> > reasonable to support 'stop' as a way to transition from PANICKED to
> > PAUSED? That is, management may want to make the state reversible but
> > still leave the guest paused, so this patch may be incomplete.
>
> No, there is no way to move from PANICKED to PAUSED. Libvirt has its
> own statuses (PAUSED, CRASHED etc.) and substatuses. You don't really
> care about the QEMU state: both the PAUSED_PANICKED and CRASHED_PANICKED
> substatuses map to QEMU's GUEST_PANICKED state. Simply, libvirt will
> not allow a "virsh resume" for <on_crash>preserve</on_crash>, and will
> allow it for a hypothetical new <on_crash>pause</on_crash> element.
>
> BTW, any chance "coredump-destroy" and "coredump-restart" can be
> preserved just for backwards compatibility, and a new coredump='yes/no'
> attribute introduced instead? Because coredump-pause and
> coredump-preserve would make just as much sense.
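
For readers following the vl.c hunk quoted above, here is a rough standalone sketch of how a table of (from, to) pairs can decide whether a run-state change is allowed. It is not QEMU's code: the State enum is a reduced, hypothetical subset, and the names transitions_def, transitions_init and transition_allowed are made up for illustration. The only part taken from the thread is the post-patch rule that the panicked state may return to running and has no edge to paused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of run states (QEMU's real enum is larger). */
typedef enum {
    STATE_RUNNING,
    STATE_PAUSED,
    STATE_GUEST_PANICKED,
    STATE_MAX
} State;

typedef struct {
    State from;
    State to;
} Transition;

/* Assumed table for this sketch; the last entry mirrors the patched hunk:
 * PANICKED -> RUNNING is allowed, PANICKED -> PAUSED is not listed. */
static const Transition transitions_def[] = {
    { STATE_RUNNING,        STATE_PAUSED },
    { STATE_RUNNING,        STATE_GUEST_PANICKED },
    { STATE_PAUSED,         STATE_RUNNING },
    { STATE_GUEST_PANICKED, STATE_RUNNING },
};

/* Lookup matrix filled from the table; zero-initialised means "not allowed". */
static bool valid[STATE_MAX][STATE_MAX];

/* Expand the list of pairs into the lookup matrix. Call once at startup. */
static void transitions_init(void)
{
    size_t n = sizeof(transitions_def) / sizeof(transitions_def[0]);
    for (size_t i = 0; i < n; i++) {
        valid[transitions_def[i].from][transitions_def[i].to] = true;
    }
}

/* Return true if the table lists a from -> to transition. */
static bool transition_allowed(State from, State to)
{
    assert(from < STATE_MAX && to < STATE_MAX);
    return valid[from][to];
}
```

After calling transitions_init(), transition_allowed(STATE_GUEST_PANICKED, STATE_RUNNING) is true, which corresponds to management sending "cont" after a panic, while transition_allowed(STATE_GUEST_PANICKED, STATE_PAUSED) is false, matching the answer above that there is no way to move from PANICKED to PAUSED.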