Re: [PATCH 0/9] vhost-scsi: log write descriptors for live migration (and two bugfix)

Thanks to the suggestion from Mike, I am going to re-send this as v2 with the
following changes:

1. Rebase on top of the patchset below.

[PATCH v2 0/8] vhost-scsi: Memory reduction patches
https://yhbt.net/lore/target-devel/20241203191705.19431-1-michael.christie@xxxxxxxxxx/

The patchset applies and builds cleanly on top of commit 87a132e73910
("Merge tag 'mm-hotfixes-stable-2025-02-19-17-49' of
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm").


2. Don't allocate the per-cmd log buffers until VHOST_F_LOG_ALL is set.

Either take advantage of vhost_scsi_set_features(), or follow the idea
of the patch below.

[PATCH v2 5/8] vhost-scsi: Dynamically allocate scatterlists
https://yhbt.net/lore/target-devel/20241203191705.19431-6-michael.christie@xxxxxxxxxx/
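
A rough userspace sketch of the idea in item 2, not the actual v2 code: the
per-command log buffers stay unallocated until the feature word gains
VHOST_F_LOG_ALL, and are released when the bit is cleared. All model_* names
and MODEL_MAX_LOG are hypothetical and exist only for this illustration; only
the VHOST_F_LOG_ALL bit number is taken from the vhost UAPI.

```c
#include <stdint.h>
#include <stdlib.h>

#define VHOST_F_LOG_ALL 26   /* feature bit number from the vhost UAPI */
#define MODEL_MAX_LOG   64   /* hypothetical per-command log capacity */

/* Hypothetical stand-in for one logged (address, length) pair. */
struct model_log {
	uint64_t addr;
	uint64_t len;
};

/* Hypothetical stand-in for a per-command structure. */
struct model_cmd {
	struct model_log *log;   /* NULL until logging is enabled */
	unsigned int log_num;
};

struct model_dev {
	uint64_t features;
	struct model_cmd *cmds;
	unsigned int ncmds;
};

/* Release every per-command log buffer. */
static void model_free_logs(struct model_dev *d)
{
	for (unsigned int i = 0; i < d->ncmds; i++) {
		free(d->cmds[i].log);
		d->cmds[i].log = NULL;
		d->cmds[i].log_num = 0;
	}
}

/*
 * Allocate log buffers only on the 0 -> 1 transition of VHOST_F_LOG_ALL
 * and free them on the 1 -> 0 transition.
 * Returns 0 on success, -1 on allocation failure (features unchanged).
 */
static int model_set_features(struct model_dev *d, uint64_t features)
{
	int want = (int)((features >> VHOST_F_LOG_ALL) & 1);
	int have = (int)((d->features >> VHOST_F_LOG_ALL) & 1);

	if (want && !have) {
		for (unsigned int i = 0; i < d->ncmds; i++) {
			d->cmds[i].log = calloc(MODEL_MAX_LOG,
						sizeof(struct model_log));
			if (!d->cmds[i].log) {
				model_free_logs(d);
				return -1;
			}
		}
	} else if (!want && have) {
		model_free_logs(d);
	}

	d->features = features;
	return 0;
}
```

The point of tying allocation to the feature transition is that guests which
are never migrated pay no memory cost for logging.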

Thank you very much!

Dongli Zhang

On 2/7/25 10:41 AM, Dongli Zhang wrote:
> Live migration with vhost-scsi was enabled by QEMU commit
> b3e89c941a85 ("vhost-scsi: Allow user to enable migration"), which
> thoroughly explains how QEMU collaborates with vhost-scsi during
> live migration.
> 
> Although it logs dirty data for the used ring, it doesn't log any write
> descriptor (VRING_DESC_F_WRITE).
> 
> In comparison, vhost-net logs write descriptors via vhost_log_write(). The
> SPDK (vhost-user-scsi backend) also logs write descriptors via
> vhost_log_req_desc().
> 
> As a result, there is likely to be a data mismatch between memory and the
> vhost-scsi disk during live migration.
> 
> 1. Suppose there is a high workload and high memory usage, and some
> systemd userspace pages are swapped out to the swap disk.
> 
> 2. Upon request from systemd, the kernel reads some pages from the swap
> disk to the memory via vhost-scsi.
> 
> 3. Although those userspace pages' data are updated, they are not marked as
> dirty by vhost-scsi (this is the bug). They are not going to migrate to the
> target host during memory transfer iterations.
> 
> 4. Suppose systemd doesn't write to those pages any longer. Those pages
> never get another chance to be marked dirty or migrated.
> 
> 5. Once the guest VM is resumed on the target host, because those dirty
> pages' data is missing, systemd may end up in an abnormal state, e.g.,
> it may segfault.
> 
> Log all write descriptors to fix the issue.
> 
> In addition, the patchset also fixes two bugs in vhost-scsi.
> 
> Dongli Zhang (log descriptor, suggested by Joao Martins):
>   vhost: modify vhost_log_write() for broader users
>   vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors
>   vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd
>   vhost-scsi: log I/O queue write descriptors
>   vhost-scsi: log control queue write descriptors
>   vhost-scsi: log event queue write descriptors
>   vhost: add WARNING if log_num is more than limit
> 
> Dongli Zhang (vhost-scsi bugfix):
>   vhost-scsi: protect vq->log_used with vq->mutex
>   vhost-scsi: Fix vhost_scsi_send_bad_target()
> 
>  drivers/vhost/net.c   |   2 +-
>  drivers/vhost/scsi.c  | 191 +++++++++++++++++++++++++++++++++++++++------
>  drivers/vhost/vhost.c |  46 ++++++++---
>  drivers/vhost/vhost.h |   2 +-
>  4 files changed, 206 insertions(+), 35 deletions(-)
> 
> 
> base-commit: 5c8c229261f14159b54b9a32f12e5fa89d88b905
> 
> Thank you very much!
> 
> Dongli Zhang
> 
> 
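
To illustrate the fix described in the quoted cover letter, here is a
simplified userspace model, assuming a flat dirty bitmap indexed by guest page
number with 4 KiB pages. It is not the kernel code path (which goes through
vhost_log_write()); the model_* names are hypothetical. Only the
VRING_DESC_F_WRITE value comes from the virtio ring definition.

```c
#include <stddef.h>
#include <stdint.h>

#define VRING_DESC_F_WRITE 2    /* device writes into this buffer */
#define MODEL_PAGE_SHIFT   12   /* assumed 4 KiB pages */

/* Hypothetical, trimmed-down descriptor: guest address, length, flags. */
struct model_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
};

/* Set the dirty bit of every page touched by [addr, addr + len). */
static void model_mark_dirty(uint8_t *bitmap, uint64_t addr, uint64_t len)
{
	uint64_t first, last;

	if (!len)
		return;

	first = addr >> MODEL_PAGE_SHIFT;
	last = (addr + len - 1) >> MODEL_PAGE_SHIFT;
	for (uint64_t pfn = first; pfn <= last; pfn++)
		bitmap[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

/*
 * Log only device-writable descriptors. Read-only descriptors are not
 * modified by the device, so they do not need to be marked dirty.
 * The caller must size the bitmap to cover every address in descs.
 */
static void model_log_write_descs(uint8_t *bitmap,
				  const struct model_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (descs[i].flags & VRING_DESC_F_WRITE)
			model_mark_dirty(bitmap, descs[i].addr, descs[i].len);
	}
}
```

In the swap-in scenario above, the pages filled by the SCSI read are exactly
the ones covered by write descriptors, so marking them dirty is what lets the
migration pick them up in a later memory transfer iteration.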




