On Thu, 29 Oct 2015 16:16:57 -0200
Eduardo Habkost <ehabkost@xxxxxxxxxx> wrote:

> (CCing Michal and libvir-list, so the libvirt team is aware of this
> restriction)
>
> On Thu, Oct 29, 2015 at 02:36:37PM +0100, Igor Mammedov wrote:
> > On Tue, 27 Oct 2015 14:36:35 -0200
> > Eduardo Habkost <ehabkost@xxxxxxxxxx> wrote:
> >
> > > On Tue, Oct 27, 2015 at 10:14:56AM +0100, Igor Mammedov wrote:
> > > > On Tue, 27 Oct 2015 10:53:08 +0200
> > > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> > > >
> > > > > On Tue, Oct 27, 2015 at 09:48:37AM +0100, Igor Mammedov wrote:
> > > > > > On Tue, 27 Oct 2015 10:31:21 +0200
> > > > > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > > On Mon, Oct 26, 2015 at 02:24:32PM +0100, Igor Mammedov wrote:
> > > > > > > > Yep, it's a workaround, but it works around QEMU's broken virtio
> > > > > > > > implementation in a simple way without needing guest-side changes.
> > > > > > > >
> > > > > > > > Without a foreseeable virtio fix it makes memory hotplug unusable, and
> > > > > > > > even more so: even if there were a virtio fix, it wouldn't help old
> > > > > > > > guests, since you've said that a virtio fix would require changes on
> > > > > > > > both the QEMU and guest sides.
> > > > > > > What makes it not foreseeable?
> > > > > > > Apparently only the fact that we have a work-around in place, so no one
> > > > > > > works on it. I can code it up pretty quickly, but I'm flat out of time
> > > > > > > for testing as I'm going on vacation soon, and hard freeze is pretty
> > > > > > > close.
> > > > > > I can lend a hand for the testing part.
> > > > > >
> > > > > > > GPA space is kind of cheap, but wasting it in chunks of 512M
> > > > > > > seems way too aggressive.
> > > > > > The hotplug region is sized with a 1GB alignment reserve per DIMM, so we
> > > > > > aren't actually wasting anything here.
> > > > > >
> > > > > If I allocate two 1G DIMMs, what will be the gap size? 512M? 1G?
> > > > > It's too much either way.
> > > > The minimum would be 512M, and if the backend uses 1GB hugepages the gap
> > > > will be the backend's natural alignment (i.e. 1GB).
> > > Is backend configuration even allowed to affect the machine ABI? We need
> > > to be able to change backend configuration when migrating the VM to
> > > another host.
> > For now, one has to use the same type of backend on both sides,
> > i.e. if the source uses a 1GB hugepage backend then the target also
> > needs to use one.
>
> The page size of the backend doesn't even depend on QEMU arguments, but on
> the kernel command line or hugetlbfs mount options. So it's possible to
> have exactly the same QEMU command line on source and destination (with
> an explicit versioned machine type), and get a VM that can't be
> migrated? That means we are breaking our guarantees about migration and
> guest ABI.
>
> > We could change this for the next machine type to always force the
> > maximum alignment (1GB); then it would be possible to change
> > between backends with different alignments.
>
> I'm not sure what the best solution is here. If always using 1GB is too
> aggressive, we could require management to ask for an explicit alignment
> as a -machine option if they know they will need a specific backend page
> size.
>
> BTW, are you talking about the behavior introduced by
> aa8580cddf011e8cedcf87f7a0fdea7549fc4704 ("pc: memhp: force gaps between
> DIMM's GPA") only, or was the backend page size already affecting GPA
> allocation before that commit?
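To make the gap arithmetic above concrete, here is a rough standalone sketch; this is not actual QEMU code, and the base address, the 512M minimum-gap constant and the "gap = max(minimum gap, backend alignment)" model are illustrative assumptions that merely reproduce the 512M/1GB numbers discussed in this thread:

    /* gap-sketch.c: standalone illustration of the gap model discussed in
     * this thread; not QEMU code, all constants are made up */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MIB (1024ULL * 1024ULL)
    #define GIB (1024ULL * MIB)

    /* round addr up to the next multiple of align (align is a power of two) */
    static uint64_t align_up(uint64_t addr, uint64_t align)
    {
        return (addr + align - 1) & ~(align - 1);
    }

    /* gap a DIMM leaves after the previous one: at least min_gap, and at
     * least the backend's natural alignment (e.g. 1 GiB for 1G hugepages) */
    static uint64_t dimm_gap(uint64_t min_gap, uint64_t backend_align)
    {
        return backend_align > min_gap ? backend_align : min_gap;
    }

    /* pick the next DIMM's GPA: leave the gap, keep the start backend-aligned */
    static uint64_t next_dimm_gpa(uint64_t next_free, uint64_t min_gap,
                                  uint64_t backend_align)
    {
        return align_up(next_free + dimm_gap(min_gap, backend_align),
                        backend_align);
    }

    static void place_two_dimms(const char *label, uint64_t backend_align)
    {
        uint64_t base = 4 * GIB;        /* assumed start of the hotplug region */
        uint64_t dimm_size = 1 * GIB;
        uint64_t min_gap = 512 * MIB;   /* assumed minimum forced gap */
        uint64_t next_free = base;

        printf("%s:\n", label);
        for (int i = 0; i < 2; i++) {
            uint64_t gpa = next_dimm_gpa(next_free, min_gap, backend_align);
            printf("  DIMM%d at 0x%" PRIx64 ", gap before it: %" PRIu64 " MiB\n",
                   i, gpa, (gpa - next_free) / MIB);
            next_free = gpa + dimm_size;
        }
    }

    int main(void)
    {
        place_two_dimms("4K-page RAM backend", 4096);
        place_two_dimms("1G-hugepage backend", 1 * GIB);
        return 0;
    }

Built and run, it places two 1G DIMMs with 512 MiB gaps when the backend uses 4K pages and 1 GiB gaps when it uses 1G hugepages, i.e. the two cases asked about above.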
Backend alignment has been there since the beginning; we always over-reserve 1GB per slot, since we don't know in advance what alignment a hotplugged backend would require.
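As a minimal sketch of what that over-reservation amounts to when the hotplug window is sized (again, not QEMU's actual code; the function name and layout are illustrative, assuming the window must cover all hotpluggable RAM plus a worst-case 1 GiB alignment reserve per slot):

    /* hotplug-window-sketch.c: illustrative only, not QEMU code */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define GIB (1024ULL * 1024ULL * 1024ULL)

    /* size of the hotplug address-space window: all hotpluggable RAM plus a
     * worst-case 1 GiB alignment reserve for each configured slot, because
     * the backend (and hence alignment) of future DIMMs is unknown up front */
    static uint64_t hotplug_region_size(uint64_t maxram_size, uint64_t ram_size,
                                        unsigned int ram_slots)
    {
        return (maxram_size - ram_size) + (uint64_t)ram_slots * GIB;
    }

    int main(void)
    {
        /* e.g. -m 4G,maxmem=36G,slots=8 */
        printf("window: %" PRIu64 " GiB\n",
               hotplug_region_size(36 * GIB, 4 * GIB, 8) / GIB);
        return 0;
    }

With, say, -m 4G,maxmem=36G,slots=8 that comes out to 32 GiB of hotpluggable RAM plus an 8 GiB reserve, i.e. a 40 GiB window of GPA space; GPA space is only reserved, not backed by host memory, which is why the reserve itself isn't "wasting" anything.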