On Tue, Dec 10, 2019 at 02:54:22PM +0100, Andrea Bolognani wrote:
> This patch is intended to start a slightly larger discussion about
> our plans for the CentOS CI environment going forward.
>
> At the moment, we have active builders for
>
>   CentOS 7
>   Debian 9
>   Debian 10
>   Fedora 30
>   Fedora 31
>   Fedora Rawhide
>   FreeBSD 11
>   FreeBSD 12
>
> but we don't have builders for
>
>   Debian sid
>   FreeBSD -CURRENT
>   Ubuntu 16.04
>   Ubuntu 18.04
>
> despite them being fully supported in the libvirt-jenkins-ci
> repository.
>
> This makes sense for sid and -CURRENT, since the former covers the
> same "freshest Linux packages" angle that Rawhide already takes care
> of and the latter is often broken and not trivial to keep updated;
> both Ubuntu targets, however, should IMHO be part of the CentOS CI
> environment. Hence this series :)
>
> Moreover, we're in the process of adding
>
>   CentOS 8
>   openSUSE Leap 15.1
>   openSUSE Tumbleweed
>
> as targets, of which the first two should also IMHO be added, as
> they would provide useful additional coverage.
>
> The only reason why I'm even questioning whether this should be done
> is capacity on the hypervisor host: the machine we're running all
> builders on has
>
>   CPUs:    8
>   Memory:  32 GiB
>   Storage: 450 GiB
>
> and each of the guests is configured to use
>
>   CPUs:    2
>   Memory:  2 GiB
>   Storage: 20 GiB
>
> So while we're good, and actually have plenty of room to grow, on
> the memory and storage front, we're already overcommitting our CPUs
> pretty significantly, which I guess is at least part of the reason
> why builds take so long.

NB the memory that's free is not really free - it is being useful as
I/O cache for the VM disks, so more VMs will reduce the I/O cache.
Whether that will actually impact us I don't know though.

More importantly though, AFAICT those are not 8 real CPUs. virsh
nodeinfo reports 8 cores, but virsh capabilities reports it as a
1 socket, 4 core, 2 thread CPU. IOW we haven't really got 8 CPUs,
more like the equivalent of 5: HT only really gives about a 1.3x
boost in the best case (4 cores x 1.3 ~= 5), and I suspect builds
are not likely to be hitting the best case.

> Can we afford to add 50% more load on the machine without making it
> unusable? I don't know. But I think it would be worthwhile to at
> least try and see how it handles an additional 25%, which is exactly
> what this series does.

Giving it a try is OK, I guess. I expect there's probably more we can
do to optimize the setup too.

For example, what actual features of qcow2 are we using? We're not
snapshotting VMs, and we don't need grow-on-demand allocation. AFAICT
we're paying the performance cost of qcow2 (L1/L2 table lookups &
metadata caching) for no reason. Switching the VMs to fully
pre-allocated raw files may improve I/O performance. Raw LVM volumes
would be even better, but that would be painful to set up given the
host install setup.

I also wonder if we have the optimal aio setting for the disks, as
there's nothing in the XML.

We could consider using cache=unsafe for the VMs, though for that I
think we'd want to split off a separate disk for /home/jenkins, so
that if there was a host OS crash we wouldn't have to rebuild the
entire VMs - just throw away the data disk & recreate it.

Since we've got plenty of RAM, another obvious thing would be to turn
on huge pages and use them for all guest RAM. This may well give a
very significant performance boost by reducing CPU overhead, which is
our biggest bottleneck.
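
To make the above concrete, here are some rough sketches. All image
names and paths below are made up for illustration - they are not the
actual CentOS CI configuration. The qcow2 -> raw conversion for a
single builder could look roughly like this:

  # shut the guest down before touching its image
  virsh shutdown libvirt-fedora-31

  # convert qcow2 to raw; the output file starts out sparse
  qemu-img convert -f qcow2 -O raw \
      /var/lib/libvirt/images/libvirt-fedora-31.qcow2 \
      /var/lib/libvirt/images/libvirt-fedora-31.raw

  # fully pre-allocate the raw file so guest writes never stall
  # waiting for block allocation on the host filesystem
  fallocate -l 20G /var/lib/libvirt/images/libvirt-fedora-31.raw

After that, the domain XML needs its <driver> element switched from
type='qcow2' to type='raw' (via virsh edit) before starting the guest
again.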
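
On the aio question: for pre-allocated raw files on a local
filesystem, the usual recommendation is io='native', which requires
cache='none' (O_DIRECT). An explicit <driver> line would look
something like this (disk path again hypothetical):

  <disk type='file' device='disk'>
    <driver name='qemu' type='raw' cache='none' io='native'/>
    <source file='/var/lib/libvirt/images/libvirt-fedora-31.raw'/>
    <target dev='vda' bus='virtio'/>
  </disk>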
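
And a sketch of the separate /home/jenkins data disk idea - the OS
disk keeps a safe cache mode while only the throwaway data disk runs
cache='unsafe' (file name hypothetical):

  <disk type='file' device='disk'>
    <driver name='qemu' type='raw' cache='unsafe'/>
    <source file='/var/lib/libvirt/images/libvirt-fedora-31-data.raw'/>
    <target dev='vdb' bus='virtio'/>
  </disk>

Inside the guest, vdb would get a filesystem and an fstab entry
mounting it on /home/jenkins; after a host crash the file can simply
be recreated and re-formatted instead of rebuilding the whole VM.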
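
For huge pages, a minimal sketch assuming 2 MiB pages and the current
8 guests x 2 GiB of RAM (i.e. 16 GiB = 8192 pages reserved on the
host):

  # reserve 8192 x 2 MiB huge pages = 16 GiB on the host
  echo 8192 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

and in each guest's domain XML:

  <memoryBacking>
    <hugepages/>
  </memoryBacking>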
Regards,
Daniel

-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|