On 1/20/2022 9:01 AM, Sean Christopherson wrote:
On Wed, Jan 19, 2022, Zeng Guang wrote:
It's a self-adaptive, standalone function module in KVM, with no extra
limitation introduced
I disagree. Its failure mode on OOM is to degrade guest performance, _that_ is
a limitation. OOM is absolutely something that should be immediately communicated
to userspace in a way that userspace can take action.
If memory allocation fails, the PID-pointer table stops updating and KVM
keeps using the old one. All IPIs from other vcpus to the newly created
vcpu will go through APIC-Write VM-exits and won't get the performance
improvement from IPI virtualization. Right, it's a limitation, though it
doesn't impact the effectiveness of IPI virtualization among existing vcpus.
and scalable even for a future extension of KVM_MAX_VCPU_IDS or a new APIC ID
implementation.
What do you think? :)
Heh, I think I've made it quite clear that I think it's unnecessary complexity in
KVM. It's not a hill I'll die on, e.g. if Paolo and others feel it's the right
approach then so be it, but I really, really dislike the idea of dynamically
changing the table, KVM has a long and sordid history of botching those types
of flows/features.
To follow your proposal, we are considering a feasible implementation as
follows:
1. Define a new field apic_id_limit in struct kvm_arch, initialized to
KVM_MAX_VCPU_IDS by default.
2. Add a new VM ioctl, KVM_SET_APICID_LIMIT, to allow user space to set the
max APIC ID required in the VM session before vcpu creation. Currently
QEMU calculates the CPU APIC ID limit from the max cpus assigned, to cover
hotpluggable cpus. It simply uses the package/die/core/smt model to get the
bit width of the id field at each level (not fully compliant with CPUID
1f/0b) and builds the APIC ID for a specific vcpu index. We can notify KVM
of this APIC ID limit to ensure enough memory is allocated for the PID table.
3. Check that the id is less than min(apic_id_limit, KVM_MAX_VCPU_IDS)
at vcpu creation; otherwise return an error.
4. Allocate memory for the PID table, covering vcpus with ids up to
apic_id_limit, during the first vcpu creation. A proper lock is still
needed to protect PID table setup from race conditions. If OOM happens,
the current vcpu creation fails as well and returns an error to user space.
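
To make steps 1-4 concrete, here is a rough userspace model of the flow, not
actual KVM code. struct vm_model, vm_init(), vm_set_apic_id_limit() and
vm_create_vcpu() are hypothetical names for illustration only; the value used
for KVM_MAX_VCPU_IDS is a placeholder, the -EBUSY/-EINVAL return codes are our
assumption, and the locking mentioned in step 4 is only marked by a comment.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define KVM_MAX_VCPU_IDS 4096u	/* placeholder value for this sketch */

/* Hypothetical stand-in for the relevant fields of struct kvm_arch. */
struct vm_model {
	uint32_t apic_id_limit;		/* step 1: defaults to KVM_MAX_VCPU_IDS */
	uint64_t *pid_table;		/* one 64-bit entry per possible APIC ID */
	unsigned int created_vcpus;
};

static void vm_init(struct vm_model *vm)
{
	vm->apic_id_limit = KVM_MAX_VCPU_IDS;
	vm->pid_table = NULL;
	vm->created_vcpus = 0;
}

/* Step 2: model of a KVM_SET_APICID_LIMIT handler (error codes assumed). */
static int vm_set_apic_id_limit(struct vm_model *vm, uint32_t limit)
{
	if (vm->created_vcpus)
		return -EBUSY;	/* only allowed before the first vcpu exists */
	if (limit == 0 || limit > KVM_MAX_VCPU_IDS)
		return -EINVAL;
	vm->apic_id_limit = limit;
	return 0;
}

/* Steps 3 and 4: model of vcpu creation. */
static int vm_create_vcpu(struct vm_model *vm, uint32_t id)
{
	uint32_t limit = vm->apic_id_limit < KVM_MAX_VCPU_IDS ?
			 vm->apic_id_limit : KVM_MAX_VCPU_IDS;

	/* Step 3: reject ids outside min(apic_id_limit, KVM_MAX_VCPU_IDS). */
	if (id >= limit)
		return -EINVAL;

	/*
	 * Step 4: size the table once, at first vcpu creation. Real code
	 * would take a lock around this check-and-allocate sequence.
	 */
	if (!vm->pid_table) {
		vm->pid_table = calloc(limit, sizeof(*vm->pid_table));
		if (!vm->pid_table)
			return -ENOMEM;	/* OOM fails this vcpu creation */
	}

	vm->created_vcpus++;
	return 0;
}
```

With a limit of 8, creating a vcpu with id 7 succeeds and allocates the
table, id 8 is rejected, and changing the limit afterwards is refused. The
point is that OOM surfaces as an error from vcpu creation instead of a silent
performance degradation.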
Please let us know whether we can go further with this solution. Thanks.