2017-04-18 14:29+0200, Cornelia Huck:
> On Tue, 18 Apr 2017 13:11:55 +0200
> David Hildenbrand <david@xxxxxxxxxx> wrote:
>> On 13.04.2017 22:19, Radim Krčmář wrote:
>> > new KVM_MAX_CONFIGURABLE_VCPUS, probably directly to INT_MAX/KVM_VCPU_ID, so we
>> > don't have to worry about it for a while.
>> >
>> > PPC should be interested in this as they set KVM_MAX_VCPUS to NR_CPUS
>> > and probably waste few pages for every guest this way.
>>
>> As we just store pointers, this should be a maximum of 4 pages for ppc
>> (4k pages). Is this really worth yet another VM creation ioctl? Is there
>> not a nicer way to handle this internally?
>>
>> An alternative might be to simply realloc the array when it reaches a
>> certain size (on VCPU creation, maybe protecting the pointer via rcu).
>> But not sure if something like that could work.
>
> I like that idea better, if it does work (I think it should be doable).
> If we just double the array size every time we run out of space, we
> should be able to do this with few reallocations. That has also the
> advantage of being transparent to user space (other than increased
> number of vcpus).

Yes, relocating would work with protection against use-after-free and
RCU fits well.  (Readers don't have any lock we could piggyback on.)

I didn't go for it because of C: the kvm_for_each_vcpu macro would be
less robust if it included the locking around its body -- nested fors
are susceptible to return/goto errors inside the loop body + we'd need
to obfuscate several existing users of that pattern.
And open-coding the protection everywhere is polluting the code too,
IMO.

Lock-less list would solve those problems, but we are accessing the
VCPUs by index, which makes it suboptimal in other direction ... using
the list for kvm_for_each_vcpu and adding RCU protected array for
kvm_get_vcpu and kvm_get_vcpu_by_id looks like over-engineering as we
wouldn't save memory, performance, nor lines of code by doing that.
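
To make the return/goto concern concrete, here is a minimal userspace
sketch of a for-each macro that wraps the locking around its body with
nested fors.  Everything in it is hypothetical: toy_vm, toy_read_lock,
toy_read_unlock and toy_for_each_locked are illustration-only names, not
KVM code, and a plain counter stands in for an RCU read-side critical
section.

```c
#include <assert.h>

/* Hypothetical stand-in for struct kvm; not real KVM code. */
struct toy_vm {
	int lock_depth;		/* models read-side lock nesting */
	int nr;			/* number of valid entries in ids[] */
	int ids[4];
};

static void toy_read_lock(struct toy_vm *vm)
{
	vm->lock_depth++;
}

static void toy_read_unlock(struct toy_vm *vm)
{
	assert(vm->lock_depth > 0);
	vm->lock_depth--;
}

/*
 * The outer for runs exactly once: it takes the lock in its init clause
 * and drops it in its increment clause.  The inner for walks the array.
 * A break in the body only leaves the inner loop, so the unlock still
 * runs; a return or goto out of the body skips it.
 */
#define toy_for_each_locked(i, id, vm)					\
	for (int once_ = (toy_read_lock(vm), 1); once_;			\
	     once_ = (toy_read_unlock(vm), 0))				\
		for ((i) = 0;						\
		     (i) < (vm)->nr && ((id) = (vm)->ids[i], 1);	\
		     (i)++)

/* Full walk: the lock is released when the outer loop finishes. */
static int toy_count(struct toy_vm *vm)
{
	int i, id, n = 0;

	toy_for_each_locked(i, id, vm)
		n += (id >= 0);
	return n;
}

/* Early return from the body: the unlock is never reached. */
static int toy_find_leaky(struct toy_vm *vm, int wanted)
{
	int i, id;

	toy_for_each_locked(i, id, vm)
		if (id == wanted)
			return i;	/* BUG: lock_depth stays raised */
	return -1;
}
```

The call site of toy_find_leaky looks like any ordinary loop, which is
the robustness problem: nothing at the point of the return hints that a
lock is being held.
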
I didn't see a way to untangle kvm->vcpu that would allow a nice
runtime-dynamic variant.

We currently don't need to pass more information at VM creation time
either, so I was also thinking of hijacking the parameter to
KVM_CREATE_VM for factor-of-2 VCPU count (20 bits would last a while),
but that is already a new interface and new IOCTL to do a superset of
another one seemed much better.

I agree that the idea is questionable.  I'll redo the series and bump
KVM_MAX_VCPUS unless you think that the dynamic variant could be done
nicely.  (The memory saving is a minuscule fraction of a VM size and if
we do big increments in KVM_MAX_VCPUS, then the motivation is gone.)

Thanks.