On Mon, Mar 13, 2017 at 11:58:24AM -0400, Luiz Capitulino wrote:
>
> Libvirt commit c2e60ad0e51 added a new check to the XML validation
> logic where XMLs containing <memoryBacking><locked/> must also
> contain <memtune><hard_limit>. This causes two breakages where
> working guests won't start anymore:
>
>  1. Systems where mlock limit was set in /etc/security/limits.conf

I'd be surprised if that has any effect, unless you were setting it
against the root user. The limits.conf file is loaded by PAM, and
when libvirtd spawns QEMU, PAM is not involved, so limits.conf will
never be activated.

This is why libvirt provides max_processes/max_files/max_core
settings in /etc/libvirt/qemu.conf - you can't set those in
limits.conf and have them work - unless you set them against root,
so libvirtd itself got the higher limits which are then inherited
by QEMU.

>  2. Guests using hugeTLB pages. In this case, guests were relying
>     on the fact that libvirt automagically increases mlock
>     limit to 1GB

Yep, that's bad - we mustn't break previously working scenarios
like this, even if they were not following documented practice.

> While it's true that <memoryBacking><locked/> documentation
> says that <memtune><hard_limit> is required, this is actually
> an extremely bad request because:
>
>  A. <memtune><hard_limit> own documentation strongly recommends
>     NOT using it

Yep, hard limit is impossible to calculate reliably since no one
has been able to provide an accurate way to predict QEMU's peak
memory usage. When libvirt previously set hard_limit by default,
we got many bug reports about guests being killed by the OOM
killer, no matter what algorithm we tried.

>  B. <memtune><hard_limit> does more than setting memory locking
>     limit
>
>  C. <memtune><hard_limit> does not support infinity, so you have
>     to guess a limit
>
>  D. If <memtune><hard_limit> is less than 1GB, it will lower
>     VFIO's automatic limit of "guest memory + 1GB"
>
> Here's two possible solutions to fix this all:
>
>  1. Drop change c2e60ad0e51 and drop automatic increases. Let
>     users configure limits in /etc/security/limits.conf
>
>     pros: this is the most correct way to do it, and how
>           it should be done originally IMHO
>
>     cons: will break working VFIO setups, so probably undoable

limits.conf is useless - see above.

>  2. Drop change c2e60ad0e51 and automatically increase memory
>     locking limit to infinity when seeing <memoryBacking><locked/>
>
>     pros: make all cases work, no more <hard_limit> requirement
>
>     cons: allows guests with <locked/> to lock all memory
>           assigned to them plus QEMU allocations. While this seems
>           undesirable or even a security issue, using <hard_limit>
>           will have the same effect

I think this is the only viable approach, given that no one can
provide a way to reliably calculate QEMU peak memory usage.
Unless we want to take guest RAM + $LARGE NUMBER - eg just blindly
assume that 2 GB is enough for the QEMU working set, so for an 8 GB
guest, just set 10 GB as the limit.

> Lastly, <locked/> doesn't belong to <memoryBacking>, it should
> be in <memtune>. I recommend deprecating it from <memoryBacking>
> and adding it where it belongs.

We never make this kind of change in libvirt XML. It is a
sub-optimal location, but it has no functional problem, so there's
no functional benefit to moving it and clear backcompat downsides.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
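
For illustration only, the "guest RAM + $LARGE NUMBER" fallback
mentioned in the reply to option 2 could be sketched as below. This
is not libvirt code: the function name, the choice of KiB as the
unit, and the 2 GB default allowance are assumptions, taken only
from the "8 GB guest, 10 GB limit" example above.

```python
# Hypothetical sketch of the "guest RAM + fixed allowance" heuristic.
# Nothing here is a libvirt API; all names are made up for illustration.

GIB_IN_KIB = 1024 * 1024  # one GiB expressed in KiB


def mlock_limit_kib(guest_ram_kib, qemu_overhead_kib=2 * GIB_IN_KIB):
    """Return a memory locking limit: guest RAM plus a fixed QEMU allowance.

    The allowance is a blind guess, not a measured peak; the thread notes
    that nobody has a reliable way to predict QEMU's peak memory usage.
    """
    if guest_ram_kib <= 0:
        raise ValueError("guest RAM must be positive")
    if qemu_overhead_kib < 0:
        raise ValueError("overhead must not be negative")
    return guest_ram_kib + qemu_overhead_kib


if __name__ == "__main__":
    # An 8 GB guest with the assumed 2 GB allowance.
    limit = mlock_limit_kib(8 * GIB_IN_KIB)
    print(limit // GIB_IN_KIB, "GiB")
```

The fixed allowance keeps the calculation trivial, at the cost of
being too small for some guests and needlessly large for others,
which is why raising the limit to infinity is argued to be the
only viable approach.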