On Thu, Sep 10, 2020 at 04:26:40PM +0200, Milan Zamazal wrote: > Daniel P. Berrangé <berrange@xxxxxxxxxx> writes: > > > On Thu, Jul 02, 2020 at 01:21:15PM +0200, Milan Zamazal wrote: > >> Hi, > >> > > > >> I've met two situations with NVDIMM support in libvirt where I'm not > >> sure all the parties (libvirt & I) do the things correctly. > >> > >> The first problem is with memory alignment and size changes. In > >> addition to the size changes applied to NVDIMMs by QEMU, libvirt also > >> makes some NVDIMM size changes for better alignments, in > >> qemuDomainMemoryDeviceAlignSize. This can lead to the size being > >> rounded up, exceeding the size of the backing device and QEMU failing to > >> start the VM for that reason (I've experienced that actually). I work > >> with emulated NVDIMM devices, not a bare metal hardware, so one might > >> argue that in practice the device sizes should already be aligned, but > >> I'm not sure it must be always the case considering labels or whatever > >> else the user decides to set up. And I still don't feel very > >> comfortable that I have to count with two internal size adjustments > >> (libvirt & QEMU) to the `size' value I specify, with the ultimate goal > >> of getting the VM started and having the NVDIMM aligned properly to make > >> (non-NVDIMM) memory hot plug working. Is the size alignment performed > >> by libvirt, especially rounding up, completely correct for NVDIMMs? > > > > The comment on the function says QEMU aligns to "page size", which > > is something that can vary depending not only on architecture, and > > also the build config options for the kernel on that architecture. > > eg aarch64 has different page size in RHEL than other distros because > > of different choice of page size in kernel config. > > > > Libvirt rounds up to 1 MB, essentially so that the size works no matter > > what architecture or build options were used. I think this is quite > > compelling as I don't think mgmt apps are likely to care enough about > > non-x86 architectures to pick the right rounded sizes. > > > > If we're enforcing this 1 MB rounding though, we really should be > > documenting it clearly, so that apps can pick the right backing file > > size. I think we dropped the ball on docs. > > I still can't see it in the documentation, would it be possible to be > clear about it in the docs, please? For first, it's not very intuitive > to figure out that (if I've figured out it correctly) on POWER one > *must* specify the NVDIMM size S as > > S == aligned_size + label_size > > and that size is used for the QEMU device; while on x86_64 one can > specify any size S and > > align_up(S) > > will be used for the QEMU device (and label size doesn't influence the > value). And additional alignment may be required for having any memory > hot plug working. > > For second, and more importantly, I'm afraid that without documenting > it, future changes may break the current behavior without warning. For > example, the recent changes regarding POWER alignment in 6.7.0 are for > good IMHO and one can use the same size with both 6.7 and 6.6 versions, > but they could still cause pre-6.7 sizes stop working. I don't know what changes you are referring to here, but if they were in libvirt I'd consider that a bug - we shouldn't break a previously working configuration by increasing required alignment. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|