Re: Zoned storage support in libvirt


 



On 1/30/23 21:21, Daniel P. Berrangé wrote:
> On Wed, Jan 11, 2023 at 10:24:30AM -0500, Stefan Hajnoczi wrote:
>> On Tue, Jan 10, 2023 at 03:29:47PM +0000, Daniel P. Berrangé wrote:
>>> On Tue, Jan 10, 2023 at 10:19:51AM -0500, Stefan Hajnoczi wrote:
>>>> Hi Peter,
>>>> Zoned storage support
>>>> (https://zonedstorage.io/docs/introduction/zoned-storage) is being added
>>>> to QEMU. Given a zoned host block device, the QEMU syntax will look like
>>>> this:
>>>>
>>>>   --blockdev zoned_host_device,node-name=drive0,filename=/dev/$BDEV,...
>>>>   --device virtio-blk-pci,drive=drive0
>>>>
>>>> Note that regular --blockdev host_device will not work.
>>>>
>>>> For now the virtio-blk device is the only one that supports zoned
>>>> blockdevs.
>>>
>>> Does the guest ABI exposed by the virtio-blk device differ at all
>>> when it is connected to zoned_host_device instead of host_device ?
>>
>> Yes. There is a VIRTIO feature bit, some configuration space fields,
>> etc. virtio-blk-pci detects when the blockdev is zoned and enables the
>> feature bit.
> 
> I get a general sense of unease when ABI-sensitive features of a
> frontend device get secretly toggled based on features exposed by the
> backend.
> 
> When trying to validate ABI compatibility of guest configs, libvirt
> would generally compare frontend properties to look for differences.
> 
> There are a small set of cases where backends affect frontend
> features, but it is not that common to see.
> 
> Consider what happens if we have a guest running on zoned storage,
> and we need to evacuate the host to a machine without zoned
> storage available. Could we replace the storage backend on the
> target host with a raw/qcow2 backend but keep pretending to the guest
> that it is zoned storage? The guest would continue to batch its
> I/O ops for zoned storage, which would be redundant for raw/qcow2,
> but presumably should still work. If this is possible, it would
> suggest the need to have explicit settings for zoned storage
> on the virtio-blk frontend. QEMU would "merely" validate that these
> settings are turned on, if the host storage is zoned too.
> 
>>>> This brings to mind a few questions:
>>>>
>>>> 1. Does libvirt need domain XML syntax for zoned storage? Alternatively,
>>>>    it could probe /sys/block/$BDEV/queue/zoned and generate the correct
>>>>    QEMU command-line arguments for zoned devices when the contents of
>>>>    the file are not "none".
>>>>
>>>> 2. Should QEMU --blockdev host_device detect zoned devices so that
>>>>    --blockdev zoned_host_device is not necessary? That way libvirt would
>>>>    automatically support zoned storage without any domain XML syntax or
>>>>    libvirt code changes.
>>>>
>>>>    The drawbacks I see when QEMU detects zoned storage automatically:
>>>>    - You can't easily tell if a blockdev is zoned from the command-line.
>>>>    - It's possible to mismatch zoned and non-zoned devices across live
>>>>      migration.
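
The sysfs probe described in question 1 could be sketched roughly as below.
This is only an illustration under stated assumptions, not libvirt or QEMU
code: the function names, the sysfs_root parameter (there so the logic can be
exercised against a fake directory tree), and the choice to treat a missing
attribute as "none" are all hypothetical. Only the sysfs path, the "none"
comparison, and the two blockdev driver names come from the discussion above.

```python
from pathlib import Path


def zoned_model(bdev: str, sysfs_root: str = "/sys/block") -> str:
    """Return the contents of <sysfs_root>/<bdev>/queue/zoned, stripped."""
    path = Path(sysfs_root) / bdev / "queue" / "zoned"
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        # Assumption: a missing attribute is treated as a non-zoned device.
        return "none"


def blockdev_driver(bdev: str, sysfs_root: str = "/sys/block") -> str:
    """Pick the blockdev driver: anything other than "none" counts as zoned."""
    if zoned_model(bdev, sysfs_root) == "none":
        return "host_device"
    return "zoned_host_device"


def blockdev_arg(bdev: str, node_name: str, sysfs_root: str = "/sys/block") -> str:
    """Build the value for --blockdev, mirroring the syntax quoted above."""
    driver = blockdev_driver(bdev, sysfs_root)
    return f"{driver},node-name={node_name},filename=/dev/{bdev}"
```

For example, blockdev_arg("sdb", "drive0") would yield a string starting with
"zoned_host_device," only if the kernel reports sdb as zoned. A probe like
this makes the choice implicit, which is exactly the visibility concern raised
in the drawbacks listed above.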
>>>
>>> What happens with existing QEMU impls if you use --blockdev host_device
>>> pointing to a /dev/$BDEV that is a zoned device ?  If it succeeds and
>>> works correctly, then we likely need to continue to support that. This
>>> would push towards needing a new XML element.
>>
>> Pointing host_device at a zoned device doesn't result in useful behavior
>> because the guest is unaware that this is a zoned device. The guest
>> won't be able to access the device correctly (i.e. sequential writes
>> only). Write requests will fail eventually.
>>
>> I would consider zoned devices totally unsupported in QEMU today and we
>> don't need to worry about preserving any kind of backwards compatibility
>> with --blockdev host_device,filename=/dev/my_zoned_device.
> 
> So I guess I'm not so worried about host_device vs zoned_host_device,
> if we have explicit settings for controlling zoned behaviour on the
> virtio-blk frontend.
> 
> I feel like we should have something explicit somewhere though, as this
> is a pretty significant difference in the storage stack, that I think
> mgmt apps should be aware of, as it has implications for how you manage
> the VMs on an ongoing basis.
> 
> We could still have it "do what I mean" by default though. eg the
> virtio-blk setting defaults could imply "match the host", so we get
> effectively a tri-state (zoned=on/off/auto).

What would zoned=on mean? If the backend is not zoned, virtio will expose a
regular block device to the guest, as it should.

For zoned=auto, likewise, I am not sure what that would achieve. If the backend
is zoned, it will be seen as zoned by the guest. If the backend is a regular
disk, it will be exposed as a regular disk. So what would this option achieve?

And for zoned=off, I guess you would want to ignore a backend drive if it is
zoned?

> 
> With regards,
> Daniel

-- 
Damien Le Moal
Western Digital Research




