On Mon, Sep 04, 2017 at 04:14:00PM +0200, Martin Kletzander wrote:
> * The current design (finally something libvirt-related, right?)
>
> The discussion ended with a conclusion of the following (with my best
> knowledge, there were so many discussions about so many things that I
> would spend too much time looking up all of them):
>
>  - Users should not need to specify bit masks, such complexity should be
>    abstracted. We'll use sizes (e.g. 4MB)
>
>  - Multiple vCPUs might need to share the same allocation.
>
>  - Exclusivity of allocations is to be assumed, that is only unoccupied
>    cache should be used for new allocations.
>
> The last point seems trivial but it's actually very specific condition
> that, if removed, can cause several problems. If it's hard to grasp the
> last point together with the second one, you're on the right track. If
> not, then I'll try to make a point for why the last point should be
> removed in 3... 2... 1...
>
> * Design flaws
>
> 1) Users cannot specify any allocation that would share only part with
>    some other allocation of the domain or the default group.
>
> 2) It was not specified what to do with the default resource group.
>    There might be several ways to approach this, with varying pros and
>    cons:
>
>    a) Treat it as any other group. That is any bit set for this group
>       will be excluded from usable bits when creating new allocation
>       for a domain.
>
>        - Very predictable behaviour
>
>        - You will not be able to allocate any amount of cache without
>          previous setting for the default group as that will have all
>          the bits set which will make all the cache unusable
>
>    b) Automatically remove the appropriate amount of bits that are
>       needed for new domains.
>
>        - No need to do any change to the system settings in order to
>          use this new feature
>
>        - We would have to change system settings, which is generally
>          frowned upon when done "automatically" as a side effect of
>          starting a domain, especially for such scarce resource as
>          cache
>
>        - The change to system settings would not be entirely
>          predictable
>
>    c) Act like it doesn't exist and don't remove its allocations from
>       consideration
>
>        - Doesn't really make sense as system processes might be
>          trashing the cache as any VM, moreover when all VM processes
>          without allocations will be based in the default group as
>          well
>
> 3) There is no way for users to know what the particular settings are
>    for any running domain.
>
> The first point was deemed a corner case. Fair enough on its own, but
> considering point 2 and its solutions, it is rather difficult for me to
> justify it. Also, let's say you have domain with 4 vCPUs out of which
> you know 1 might be trashing the cache, but you don't want to restrict
> it completely, but others will utilize it very nicely. Sensible
> allocations for such domain's vCPUs might be:
>
>  vCPU  0:   000f
>  vCPUs 1-3: ffff
>
> as you want vCPUs 1-3 to utilize even the part of cache that might get
> trashed by vCPU 0. Or they might share some data (especially
> guest-memory-related).
>
> The case above is not possible to set up with only per-vcpu(s) scalar
> setting. And there are more as you might imagine now. For example how
> do we behave with iothreads and emulator threads?

Ok, I see what you're getting at. I've actually forgotten what
our current design looks like though :-)

What level of granularity were we allowing within a guest?
Do all vCPUs use cache regions separate from each other, or do all
vCPUs use one shared cache region that is separate from other guests,
or a mix?

> * My suggestion:
>
>  - Provide an API for querying and changing the allocation of the
>    default resource group.
>    This would be similar to setting and querying hugepage
>    allocations (see virsh's freepages/allocpages commands).

Reasonable

>  - Let users specify the starting position in addition to the size, i.e.
>    not only specifying "size", but also "from". If "from" is not
>    specified, the whole allocation must be exclusive. If "from" is
>    specified it will be set without checking for collisions. The latter
>    needs them to query the system or know what settings are applied
>    (this should be the case all the time), but is better than adding
>    non-specific and/or meaningless exclusivity settings (how do you
>    specify part-exclusivity of the cache as in the example above)

I'm concerned about the idea of not checking 'from' for collisions,
if a mix of guests with and without 'from' is allowed.

eg consider

 * Initially 24 MB of cache is free, starting at 8MB
 * run guest A   from=8M, size=8M
 * run guest B   size=8M
     => libvirt sets from=16M, so doesn't clash with A
 * stop guest A
 * run guest C   size=8M
     => libvirt sets from=8M, so doesn't clash with B
 * restart guest A
     => now clashes with guest C, whereas if you had left guest A
        running, then C would have got from=24MB and avoided the clash

IOW, if we're to allow users to set 'from', I think we need to have an
explicit flag to indicate whether this is an exclusive or shared
allocation. That way guest A would set 'exclusive', and so at least see
an error when it got a clash with guest C in the example.

>  - After starting a domain, fill in any missing information about the
>    allocation (I'm generalizing here, but for now it would only be the
>    optional "from" attribute)
>
>  - Add settings not only for vCPUs, but also for other threads as we do
>    with pinning, schedulers, etc.
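
To make the clash scenario in the example above concrete, here is a
rough Python sketch. It is only an illustration, not libvirt code: the
CacheAllocator class, its run/stop methods and the from_mb/exclusive
parameters are hypothetical names, and first-fit placement is an
assumption about how 'from' would be picked when it is omitted.

```python
class CacheAllocator:
    """Toy model: cache allocations are (from, size) ranges in MB."""

    def __init__(self, base, size):
        self.base = base          # first usable MB (8 in the example)
        self.end = base + size    # one past the last usable MB
        self.running = {}         # guest name -> (from, size)

    def overlaps(self, from_mb, size):
        """Names of running guests whose range intersects the request."""
        return sorted(name for name, (s, n) in self.running.items()
                      if from_mb < s + n and s < from_mb + size)

    def first_fit(self, size):
        """Lowest offset where `size` MB is unoccupied, or None."""
        candidate = self.base
        for s, n in sorted(self.running.values()):
            if candidate + size <= s:
                break                      # gap before this range fits
            candidate = max(candidate, s + n)
        return candidate if candidate + size <= self.end else None

    def run(self, name, size, from_mb=None, exclusive=False):
        """Start a guest and return the 'from' it ended up with."""
        if from_mb is None:
            # No 'from' given: only unoccupied cache may be used.
            from_mb = self.first_fit(size)
            if from_mb is None:
                raise RuntimeError(f"no free cache for {name}")
        elif exclusive:
            # Hypothetical 'exclusive' flag: explicit 'from' is checked.
            clashes = self.overlaps(from_mb, size)
            if clashes:
                raise RuntimeError(f"{name} clashes with {clashes}")
        # Explicit 'from' without 'exclusive' is applied unchecked.
        self.running[name] = (from_mb, size)
        return from_mb

    def stop(self, name):
        del self.running[name]


alloc = CacheAllocator(base=8, size=24)
alloc.run("A", 8, from_mb=8)
b_from = alloc.run("B", 8)        # placed after A
alloc.stop("A")
c_from = alloc.run("C", 8)        # reuses the range A just vacated
silent_clash = alloc.overlaps(8, 8)   # what a restarted A would hit
```

Restarting A with from_mb=8 and no flag succeeds silently on top of C,
while passing exclusive=True makes the same call raise, which is the
error I would want guest A to see.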

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|