Hi,

This is very rough and early, but I wanted to get some feedback, possibly advice, and see if there's some interest in at least creating infrastructure for user-contributed libvirt hooks, if not some default ones that are no-ops unless configured.

The impetus for this is that I started trying to tune a dual-socket system for performance with device assignment and quickly became very frustrated that distros don't provide any built-in support for more than the very basics of hugepages. Yes, there are kernel command-line options, but those don't allow node-specific configuration. Yes, there's 'virsh allocpages', but how does that get incorporated automatically into initializing libvirtd or a domain? Creating any sort of persistence for hugepages is an exercise for the user.

So I think the first step in this is that the hook scripts[1] should by default support sub-scripts in the common way, with a ".d" sub-directory holding those scripts, for example daemon.d and qemu.d. In the attached file, I've simply commandeered the default script to call the sub-scripts. For compatibility (i.e. not overwriting user scripts), that should probably happen within libvirt. In any case, a single monolithic hook file is impossible to maintain on a system, let alone multiple systems, so this needs to be brought up to date.

The second step is that even if we drop user-contributed hooks out in /usr/share for admins to pull in as desired, perhaps we can provide some consistency for how to configure those hooks. In the example below I propose /etc/sysconfig/libvirt-hook-config.xml. You can see how it currently supports static and dynamic hugepage hooks, static occurring through the daemon hook and dynamic through the qemu hook. Ideally the dynamic hook would simply list the domain names participating in dynamic hugepages and figure out what needs to be allocated where from the domain XML.
I haven't gotten that far (and frankly, trying to satisfy cpu/numa vs numatune/memory|memnode vs memoryBacking/hugepages and memory size still looks very confusing to me). Do we want a common place to configure this sort of thing? Is XML the right format?

On to the scripts themselves. I got some advice on #virt that I should use 'virsh allocpages' to manage hugepages. Despite the warning in the hook documentation[1] not to call into libvirt, I was assured it'd be OK here. However, 'virsh freepages' somehow did manage to hang, and my RHEL7.1 system doesn't support allocpages yet, so my prototype uses raw sysfs. AFAICT, any sort of hugepage manipulation is inherently broken because of the racy kernel interfaces. We really need a hugepage broker, but that's well beyond the scope of libvirt.

Functionally this seems to work well for me. I don't know how practical it is to support dynamic 1G pages; I'd probably encourage static setup for that, as my system only survived a couple of rounds before getting too fragmented. 2M dynamic seems to work quite nicely though.

TL;DR: I thought I'd post this, even in a rough state, to see if there's interest, get nitpicks at my terrible scripting, and make sure I'm not just scratching my own itch.

Thanks,
Alex

[1] https://www.libvirt.org/hooks.html
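
For readers without the attachments, the ".d" sub-script idea from the first step might look roughly like the following. This is a minimal sketch, not the attached daemon or qemu script: the function name run_hook_dir and the choice to keep going after a failing sub-script are assumptions made for illustration. It buffers stdin because the hook documentation[1] describes the qemu hook as receiving the domain XML on stdin, and every sub-script should see the same copy.

```shell
#!/bin/sh
# Hypothetical dispatcher sketch: run every executable in a "<hook>.d"
# directory in lexical order, passing each the same arguments and stdin.

run_hook_dir() {
    dir=$1
    shift
    [ -d "$dir" ] || return 0   # no sub-script directory: act as a no-op

    # Buffer stdin (the domain XML for the qemu hook) so that every
    # sub-script can read its own copy.
    input=$(mktemp) || return 1
    cat > "$input"

    rc=0
    for script in "$dir"/*; do
        # Skip anything that is not an executable regular file.
        [ -f "$script" ] && [ -x "$script" ] || continue
        # Keep running the remaining sub-scripts after a failure, but
        # remember the most recent non-zero exit status (an assumption).
        "$script" "$@" < "$input" || rc=$?
    done

    rm -f "$input"
    return "$rc"
}

# Possible wiring if this file were installed as /etc/libvirt/hooks/qemu:
# run_hook_dir "$0.d" "$@"
```

With the last line uncommented, sub-scripts would live in /etc/libvirt/hooks/qemu.d and run in name order, so numeric prefixes such as 10- and 20- control sequencing. Whether one failing sub-script should abort the rest is a policy question this sketch does not settle.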
Attachment: libvirt-hook-config.xml (XML document)
Attachment: daemon (application/shellscript)
Attachment: qemu (application/shellscript)
Attachment: static-hugepages.sh (application/shellscript)
Attachment: dynamic-hugepages.sh (application/shellscript)
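
As an illustration of the raw sysfs approach mentioned above, here is a minimal sketch of a node-specific allocation. It is not the attached static-hugepages.sh. The function name set_node_hugepages and the overridable NODE_ROOT variable are hypothetical; the per-node nr_hugepages file is the standard kernel interface. The read-back is there because a write to nr_hugepages typically succeeds even when the kernel could only allocate part of the request.

```shell
#!/bin/sh
# Sketch only: per-node hugepage allocation through raw sysfs.
# NODE_ROOT is a hypothetical override, useful for dry runs and testing.
NODE_ROOT=${NODE_ROOT:-/sys/devices/system/node}

# set_node_hugepages NODE SIZE_KB COUNT
set_node_hugepages() {
    knob="$NODE_ROOT/node$1/hugepages/hugepages-${2}kB/nr_hugepages"
    [ -w "$knob" ] || { echo "no such pool: $knob" >&2; return 1; }

    echo "$3" > "$knob" || return 1

    # Read the value back to see how many pages were actually granted;
    # a shortfall is not reported as a write error.
    got=$(cat "$knob")
    if [ "$got" -lt "$3" ]; then
        echo "node$1: wanted $3 x ${2}kB pages, got $got" >&2
        return 1
    fi
}
```

For example, `set_node_hugepages 1 2048 512` would ask for 512 2M pages (1 GiB) on node 1, and a 1G pool would use a size of 1048576. The sketch writes an absolute count, so it suits static setup; a dynamic hook that adds to or subtracts from the current value would have to read, compute, and write, which is where the races mentioned above come in.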