Daniel P. Berrangé <berrange@xxxxxxxxxx> writes: > On Thu, Sep 10, 2020 at 04:26:40PM +0200, Milan Zamazal wrote: >> Daniel P. Berrangé <berrange@xxxxxxxxxx> writes: >> > >> > On Thu, Jul 02, 2020 at 01:21:15PM +0200, Milan Zamazal wrote: >> >> Hi, >> >> >> > >> >> I've met two situations with NVDIMM support in libvirt where I'm not >> >> sure all the parties (libvirt & I) do the things correctly. >> >> >> >> The first problem is with memory alignment and size changes. In >> >> addition to the size changes applied to NVDIMMs by QEMU, libvirt also >> >> makes some NVDIMM size changes for better alignments, in >> >> qemuDomainMemoryDeviceAlignSize. This can lead to the size being >> >> rounded up, exceeding the size of the backing device and QEMU failing to >> >> start the VM for that reason (I've experienced that actually). I work >> >> with emulated NVDIMM devices, not a bare metal hardware, so one might >> >> argue that in practice the device sizes should already be aligned, but >> >> I'm not sure it must be always the case considering labels or whatever >> >> else the user decides to set up. And I still don't feel very >> >> comfortable that I have to count with two internal size adjustments >> >> (libvirt & QEMU) to the `size' value I specify, with the ultimate goal >> >> of getting the VM started and having the NVDIMM aligned properly to make >> >> (non-NVDIMM) memory hot plug working. Is the size alignment performed >> >> by libvirt, especially rounding up, completely correct for NVDIMMs? >> > >> > The comment on the function says QEMU aligns to "page size", which >> > is something that can vary depending not only on architecture, and >> > also the build config options for the kernel on that architecture. >> > eg aarch64 has different page size in RHEL than other distros because >> > of different choice of page size in kernel config. >> > >> > Libvirt rounds up to 1 MB, essentially so that the size works no matter >> > what architecture or build options were used. I think this is quite >> > compelling as I don't think mgmt apps are likely to care enough about >> > non-x86 architectures to pick the right rounded sizes. >> > >> > If we're enforcing this 1 MB rounding though, we really should be >> > documenting it clearly, so that apps can pick the right backing file >> > size. I think we dropped the ball on docs. >> >> I still can't see it in the documentation, would it be possible to be >> clear about it in the docs, please? For first, it's not very intuitive >> to figure out that (if I've figured out it correctly) on POWER one >> *must* specify the NVDIMM size S as >> >> S == aligned_size + label_size >> >> and that size is used for the QEMU device; while on x86_64 one can >> specify any size S and >> >> align_up(S) >> >> will be used for the QEMU device (and label size doesn't influence the >> value). And additional alignment may be required for having any memory >> hot plug working. >> >> For second, and more importantly, I'm afraid that without documenting >> it, future changes may break the current behavior without warning. For >> example, the recent changes regarding POWER alignment in 6.7.0 are for >> good IMHO and one can use the same size with both 6.7 and 6.6 versions, >> but they could still cause pre-6.7 sizes stop working. > > I don't know what changes you are referring to here, but if they were > in libvirt I'd consider that a bug - we shouldn't break a previously > working configuration by increasing required alignment. I mean disabling the auto alignment in https://gitlab.com/libvirt/libvirt/-/commit/07de813924caf37e535855541c0c1183d9d382e2 and replacing it with validation in https://gitlab.com/libvirt/libvirt/-/commit/0ccceaa57c50e5ee528f7073fa8723afd62b88b7 That change can cause a VM fail to start but after (manually) adjusting the device size, all should work all right. Changes that would actually change sizes would be more dangerous. Regards, Milan