On Sun, Oct 20, 2024, Paolo Bonzini wrote: > On 10/10/24 19:48, Sean Christopherson wrote: > > On Thu, Oct 10, 2024, Paolo Bonzini wrote: ... > > > > @@ -4298,12 +4299,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) > > > > kvm_get_kvm(kvm); > > > > r = create_vcpu_fd(vcpu); > > > > if (r < 0) > > > > - goto kvm_put_xa_release; > > > > - > > > > - if (KVM_BUG_ON(xa_store(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, 0), kvm)) { > > > > - r = -EINVAL; > > > > - goto kvm_put_xa_release; > > > > - } > > > > + goto kvm_put_xa_erase; > > > > > > I also find it a bit jarring though that we have to undo the insertion. This > > > is a chicken-and-egg situation where you are pick one operation B that will > > > have to undo operation A if it fails. But what xa_store is doing, is > > > breaking this deadlock. > > > > > > The code is a bit longer, sure, but I don't see the point in complicating > > > the vcpu_array invariants and letting an entry disappear. > > > > But we only need one rule: vcpu_array[x] is valid if and only if 'x' is less than > > online_vcpus. And that rule is necessary regardless of whether or not vcpu_array[x] > > is filled before success is guaranteed. > > Even if the invariant is explainable I still find xa_erase to be uglier than > xa_release, but maybe it's just me. It's uglier, but has the distinct advantage of not being broken :-D And while uglier, IMO there's value in having a way for fuzzers to test KVM's online_vcpus logic. As evidenced by patches 1-3, accessing vcpu_array[] without first checking online_vcpus is dangerous regardless of how vcpu_array[] is populated. Allowing fuzzers to trigger removal vcpu_array[] in KASAN kernels provides meaningful coverage for that code (see Michal's original bug report). While well-intentioned, Michal's change in afb2acb2e3a3 simply blamed the wrong code. Denying ourselves that coverage and carrying flawed code just because the correct code isn't the prettiest doesn't seem like a good tradeoff. > The reason I'm not fully convinced by the explanation is that... > > > I'm not concerned about the code length, it's that we need to do _something_ if > > xa_store() fails. Yeah, it should never happen, but knowingly doing nothing feels > > all kinds of wrong. > > ... it seems to me that this is not just an issue in KVM code; it should > apply to other uses of xa_reserve()/xa_store() as well. If xa_store() fails > after xa_reserve(), you're pretty much using the xarray API incorrectly... > and then, just make it a BUG()? I know that BUG() is frowned upon, but if > the API causes invalid memory accesses when used incorrectly, one might as > well fail as early as possible and before the invalid memory access becomes > exploitable. > > > I don't like BUG(), because it's obviously very doable to > > gracefully handle failure. > > Yes, you can by using a different API. But the point is that in the > reserve/store case the insert failure becomes a reserve failure, never a > store failure. > > Maybe there should be an xa_store_reserved() that BUGs on failure, I don't > know. I agree a version of xa_store() that guarantees success would be nice to have, but I'm not exactly eager to potentially start a fight Willy *and* Linus :-)