Re: [PATCH] vl: allow "cont" from panicked state

Luiz Capitulino <lcapitulino@xxxxxxxxxx> · Wed, 21 Aug 2013 10:11:10 -0400

On Wed, 21 Aug 2013 14:01:17 +0200
Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:

> After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
> The reason for this is that events are edge-triggered, and can be lost if
> management dies at the wrong time.  Stopping a panicked VM lets management
> know of a panic even if it has crashed; management can learn about the
> panic when it restarts and queries running QEMU processes.  The downside
> is of course that the VM will be paused while management is not running,
> but that is acceptable if it only happens with explicit "-device pvpanic".
> 
> Upon learning of a panic, management (if configured to do so) can pick a
> variety of behaviors: leave the VM paused, reset it, destroy it.  In
> addition to all of these behaviors, it is possible dumping the VM core
> from the host.
> 
> However, right now, the panicked state is irreversible, and can only be
> exited by resetting the machine.  This means that any policy decision
> is entirely in the hands of the host.  In particular there is no way to
> use the "reboot on panic" option together with pvpanic.
> 
> This patch makes the panicked state reversible (and removes various
> workarounds that were there because of the state being irreversible).
> With this change, management has a wider set of possible policies: it
> can just log the crash and leave policy to the guest, it can leave the
> VM paused.  In particular, the "log the crash and continue" is implemented
> simply by sending a "cont" as soon as management learns about the panic.
> Management could also implement the "irreversible paused state" itself.
> And again, all such actions can be coupled with dumping the VM core.
> 
> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
> it uses "-device pvpanic", management should check for "cont" failures.
> If "cont" fails, management can then log that the VM remained paused
> and urge the administrator to update QEMU.
> 
> I suggest that this patch be included in an 1.6.1 release as soon as
> possible, and perhaps in the 1.5 branch too.

Looks good to me:

Reviewed-by: Luiz Capitulino <lcapitulino@xxxxxxxxxx>

> 
> Cc: qemu-stable@xxxxxxxxxx
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
>  gdbstub.c | 3 ---
>  vl.c      | 6 ++----
>  2 files changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index 35ca7c2..747e67d 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s)
>  #ifdef CONFIG_USER_ONLY
>      s->running_state = 1;
>  #else
> -    if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
> -        runstate_set(RUN_STATE_DEBUG);
> -    }
>      if (!runstate_needs_reset()) {
>          vm_start();
>      }
> diff --git a/vl.c b/vl.c
> index 25b8f2f..818d99e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>  
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
>      { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },
>  
>      { RUN_STATE_MAX, RUN_STATE_MAX },
>  };
> @@ -685,8 +684,7 @@ int runstate_is_running(void)
>  bool runstate_needs_reset(void)
>  {
>      return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
> -        runstate_check(RUN_STATE_SHUTDOWN) ||
> -        runstate_check(RUN_STATE_GUEST_PANICKED);
> +        runstate_check(RUN_STATE_SHUTDOWN);
>  }
>  
>  StatusInfo *qmp_query_status(Error **errp)

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list