Enabling core dumps is a reasonably straightforward task, but is not
documented clearly. This page provides an easy link to point users to
when they need to debug QEMU.

Signed-off-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>
---
 docs/kbase/index.rst          |   4 ++
 docs/kbase/meson.build        |   1 +
 docs/kbase/qemu-core-dump.rst | 132 ++++++++++++++++++++++++++++++++++
 3 files changed, 137 insertions(+)
 create mode 100644 docs/kbase/qemu-core-dump.rst

diff --git a/docs/kbase/index.rst b/docs/kbase/index.rst
index 91083ee49d..372042886d 100644
--- a/docs/kbase/index.rst
+++ b/docs/kbase/index.rst
@@ -67,3 +67,7 @@ Internals / Debugging
 `VM migration internals <migrationinternals.html>`__
    VM migration implementation details, complementing the info in
    `migration <../migration.html>`__
+
+`Capturing core dumps for QEMU <qemu-core-dump.html>`__
+   How to configure libvirt to enable capture of core dumps from
+   QEMU virtual machines
diff --git a/docs/kbase/meson.build b/docs/kbase/meson.build
index 7631b47018..6d17a83d1d 100644
--- a/docs/kbase/meson.build
+++ b/docs/kbase/meson.build
@@ -12,6 +12,7 @@ docs_kbase_files = [
   'locking-sanlock',
   'merging_disk_image_chains',
   'migrationinternals',
+  'qemu-core-dump',
   'qemu-passthrough-security',
   'rpm-deployment',
   's390_protected_virt',
diff --git a/docs/kbase/qemu-core-dump.rst b/docs/kbase/qemu-core-dump.rst
new file mode 100644
index 0000000000..d27f81c4d6
--- /dev/null
+++ b/docs/kbase/qemu-core-dump.rst
@@ -0,0 +1,132 @@
+=============================
+Capturing core dumps for QEMU
+=============================
+
+The default behaviour for a QEMU virtual machine launched by libvirt is to
+have core dumps disabled. There can be times, however, when it is beneficial
+to collect a core dump to enable debugging.
+
+QEMU driver configuration
+=========================
+
+There is a global setting in the QEMU driver configuration file that controls
+whether core dumps are permitted, and their maximum size.
+Enabling core dumps is simply a matter of setting the maximum size to a
+non-zero value by editing the ``/etc/libvirt/qemu.conf`` file:
+
+::
+
+   max_core = "unlimited"
+
+For an ad hoc debugging session, setting the core dump size to "unlimited" is
+viable, on the assumption that the core dumps will be disabled again once the
+requisite information is collected. If the intention is to leave core dumps
+permanently enabled, more careful consideration of limits is required.
+
+Note that by default, a core dump will **NOT** include the guest RAM
+region, so will only include memory regions used by QEMU for emulation and
+backend purposes. This is expected to be sufficient for the vast majority
+of debugging needs.
+
+When there is a need to examine guest RAM though, a further setting is
+available:
+
+::
+
+   dump_guest_core = 1
+
+This will of course result in core dumps that are as large as the biggest
+virtual machine on the host - potentially tens or even hundreds of GB in size.
+To allow more fine-grained control it is possible to toggle this on a
+per-VM basis in the XML configuration.
+
+After changing either of the settings in ``/etc/libvirt/qemu.conf`` the daemon
+hosting the QEMU driver must be restarted. For deployments using the monolithic
+daemons, this means ``libvirtd``, while for those using modular daemons this
+means ``virtqemud``:
+
+::
+
+   systemctl restart libvirtd   (for a monolithic deployment)
+   systemctl restart virtqemud  (for a modular deployment)
+
+While libvirt attempts to make it possible to restart the daemons without
+negatively impacting running guests, there are some management operations
+that may get interrupted. In particular, long-running jobs like live
+migration or block device copy jobs may abort. It is thus wise to check
+that the host is mostly idle before restarting the daemons.
+
+Guest core dump configuration
+=============================
+
+The ``dump_guest_core`` setting mentioned above will allow guest RAM to be
+included in core dumps for all virtual machines on the host. This may not
+be desirable, so it is also possible to control this on a per-virtual
+machine basis in the XML configuration:
+
+::
+
+   <memory dumpCore="on">...</memory>
+
+Note, it is still necessary to at least set ``max_core`` to a non-zero
+value in the global configuration file.
+
+Some management applications may not offer the ability to customize the
+XML configuration for a guest. In such situations, using the global
+``dump_guest_core`` setting is the only option.
+
+Host OS core dump storage
+=========================
+
+The Linux kernel default behaviour is to write core dumps to a file in the
+current working directory of the process. This will not work with QEMU
+processes launched by libvirt, because their working directory is ``/``
+which will not be writable.
+
+Most modern OS distros, however, now include systemd which configures a
+custom core dump handler out of the box. When this is in effect, core dumps
+from QEMU can be seen using the ``coredumpctl`` command:
+
+::
+
+   $ coredumpctl list -r
+   TIME                            PID   UID   GID SIG     COREFILE EXE                         SIZE
+   Tue 2021-07-20 12:12:52 BST 2649303   107   107 SIGABRT present  /usr/bin/qemu-system-x86_64 1.8M
+   ...snip...
+
+   $ coredumpctl info 2649303
+             PID: 2649303 (qemu-system-x86)
+             UID: 107 (qemu)
+             GID: 107 (qemu)
+          Signal: 6 (ABRT)
+       Timestamp: Tue 2021-07-20 12:12:52 BST (48min ago)
+    Command Line: /usr/bin/qemu-system-x86_64 -name guest=f30,debug-threads=on ..snip...
+                  -msg timestamp=on
+      Executable: /usr/bin/qemu-system-x86_64
+   Control Group: /machine.slice/machine-qemu\x2d1\x2df30.scope/libvirt/emulator
+            Unit: machine-qemu\x2d1\x2df30.scope
+           Slice: machine.slice
+         Boot ID: 6b9015d0c05f4e7fbfe4197a2c7824a2
+      Machine ID: c78c8286d6d74b22ac0dd275975f9ced
+        Hostname: localhost.localdomain
+         Storage: /var/lib/systemd/coredump/core.qemu-system-x86.107.6b9015d0c05f4e7fbfe4197a2c7824a2.2649303.1626779572000000.zst (present)
+       Disk Size: 1.8M
+         Message: Process 2649303 (qemu-system-x86) of user 107 dumped core.
+
+                  Stack trace of thread 2649303:
+                  #0  0x00007ff3c32436be n/a (libc.so.6 + 0xf56be)
+                  #1  0x000055a949c0ed05 qemu_poll_ns (qemu-system-x86_64 + 0x7b0d05)
+                  #2  0x000055a949c0e476 main_loop_wait (qemu-system-x86_64 + 0x7b0476)
+                  #3  0x000055a949a36d27 qemu_main_loop (qemu-system-x86_64 + 0x5d8d27)
+                  #4  0x000055a94979e4d2 main (qemu-system-x86_64 + 0x3404d2)
+                  #5  0x00007ff3c3175b75 n/a (libc.so.6 + 0x27b75)
+                  #6  0x000055a9497a1f5e _start (qemu-system-x86_64 + 0x343f5e)
+
+                  Stack trace of thread 2649368:
+                  #0  0x00007ff3c32435bf n/a (libc.so.6 + 0xf55bf)
+                  #1  0x00007ff3c3af547c g_main_context_iterate.constprop.0 (libglib-2.0.so.0 + 0xa947c)
+                  #2  0x00007ff3c3aa0a93 g_main_loop_run (libglib-2.0.so.0 + 0x54a93)
+                  #3  0x00007ff3c17a727a red_worker_main.lto_priv.0 (libspice-server.so.1 + 0x5227a)
+                  #4  0x00007ff3c3326299 start_thread (libpthread.so.0 + 0x9299)
+                  #5  0x00007ff3c324e353 n/a (libc.so.6 + 0x100353)
+
+                  ...snip...
-- 
2.31.1
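
For readers trying out the steps documented in this patch, the following shell sketch may help confirm that the host is set up as the new page describes. It is only an illustration, not part of the patch: the helper names ``qemu_conf_value`` and ``uses_systemd_coredump`` are hypothetical, and the sketch assumes that ``qemu.conf`` uses plain ``key = value`` lines and that a systemd core dump handler shows up by name in ``/proc/sys/kernel/core_pattern``.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by libvirt or systemd).

# Print the value of every uncommented "key = value" assignment for the
# given key in a qemu.conf-style file, with surrounding quotes removed.
# Prints nothing when the setting is absent or still commented out.
qemu_conf_value() {
    key=$1
    conf=${2:-/etc/libvirt/qemu.conf}
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$conf" \
        | tr -d '"' \
        | sed 's/[[:space:]]*$//'
}

# Succeed when the kernel core_pattern mentions systemd-coredump, which
# suggests that core dumps are handed to it and visible via coredumpctl.
uses_systemd_coredump() {
    pattern_file=${1:-/proc/sys/kernel/core_pattern}
    grep -q 'systemd-coredump' "$pattern_file"
}
```

After sourcing these functions, ``qemu_conf_value max_core`` and ``qemu_conf_value dump_guest_core`` show what is currently configured, and ``uses_systemd_coredump && coredumpctl list -r`` lists captured dumps only when the systemd handler appears to be active. The daemon restart described in the page is still needed before changed settings take effect.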