Re: [PATCH 1/2] KVM: Reject overly excessive IDs in KVM_CREATE_VCPU

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06.06.24 16:55, Sean Christopherson wrote:
> On Thu, Jun 06, 2024, Mathias Krause wrote:
>> On 06.06.24 00:31, Sean Christopherson wrote:
>>> On Thu, Jun 06, 2024, Mathias Krause wrote:
>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>> index 14841acb8b95..9f18fc42f018 100644
>>>> --- a/virt/kvm/kvm_main.c
>>>> +++ b/virt/kvm/kvm_main.c
>>>> @@ -4200,7 +4200,7 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
>>>>  /*
>>>>   * Creates some virtual cpus.  Good luck creating more than one.
>>>>   */
>>>> -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>>>> +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id)
>>>
>>> Hmm, I don't love that KVM subtly relies on the KVM_MAX_VCPU_IDS check to guard
>>> against truncation when passing @id to kvm_arch_vcpu_precreate(), kvm_vcpu_init(),
>>> etc.  I doubt that it will ever be problematic, but it _looks_ like a bug.
>>
>> It's not subtle but very explicit. KVM_MAX_VCPU_IDS is a small positive
>> number, depending on some arch specific #define, but with x86 allowing
>> for the largest value of 4 * 4096. That value, for sure, cannot exceed
>> U32_MAX, so an explicit truncation isn't needed as the upper bits will
>> already be zero if the limit check passes.
>>
>> While subtile integer truncation is the bug that my patch is actually
>> fixing, it is for the *userland* facing part of it, as in clarifying the
>> ABI to work on "machine-sized words", i.e. a ulong, and doing the limit
>> checks on these.
>>
>> *In-kernel* APIs truncate / sign extend / mix signed/unsigned values all
>> the time. The kernel is full of these. Trying to "fix" them all is an
>> uphill battle not worth fighting, imho.
> 
> Oh, I'm not worry about something going wrong with the actual truncation.

Well, I do ;)

> 
> What I don't like is the primary in-kernal API, kvm_vm_ioctl_create_vcpu(), taking
> an unsigned long, but everything underneath converting that to an unsigned int,
> without much of anything to give the reader a clue that the truncation is
> deliberate.

Well, again, it's clear to me, at least. kvm_vm_ioctl_create_vcpu() is
the barrier from converting a userland provided "raw" value and checking
it to be within bounds that are sensible for in-kernel use. After that
check it's fine to use a more narrow type that still fits these bounds
and use that for in-kernel use.

The first part is _completely_ handled by the 'id >= KVM_MAX_VCPU_IDS'
test, as 'id' is still the "raw" value userland provided. Testing it
against KVM_MAX_VCPU_IDS does the "sensible for in-kernel use" check
and, on passing that check, allows to limit the storage type for 'id' to
the bounds of [0,KVM_MAX_VCPU_IDS) which fits within u32, unsigned int
or, as actually used for the vcpu_id member, an int.¹

> 
> Similarly, without the context of the changelog, it's not at all obvious why
> kvm_vm_ioctl_create_vcpu() takes an unsigned long.

Well, it's the ioctl() entry path. Passing on the UABI value shouldn't
be all that surprising. But I agree, a comment explaining the type- and
value handling to avoid early truncation thereof may not be such a bad idea.

> 
> E.g. x86 has another potentially more restrictive check on @id, and it looks
> quite odd to check @id against KVM_MAX_VCPU_IDS as an "unsigned long" in flow
> flow, but as an "unsigned int" in another.

Again, that's two distinct things, even if looking similar. The first
check against KVM_MAX_VCPU_IDS does actually two things:

1/ Ensure the full user ABI provided value (ulong) is sane and
2/ ensure it's within the hard limits KVM expects (fits unsigned int).

Now we do both with only a single compare and maybe that's what's so
hard to grasp -- that a single check can do both things. But why not
make use of simple things when we can do so?

Regarding x86's kvm_arch_vcpu_precreate(), it can use an 'unsigned int'
because the KVM_MAX_VCPU_IDS check ensured that the value of 'id' cannot
exceed that (limited) type. It's after the "UABI raw value" to "fits
kernel-internal type" conversion has happened, so using the narrower
type is just fine.

> 
> int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
> {
> 	if (kvm_check_tsc_unstable() && kvm->created_vcpus)
> 		pr_warn_once("SMP vm created on host with unstable TSC; "
> 			     "guest TSC will not be reliable\n");
> 
> 	if (!kvm->arch.max_vcpu_ids)
> 		kvm->arch.max_vcpu_ids = KVM_MAX_VCPU_IDS;
> 
> 	if (id >= kvm->arch.max_vcpu_ids)
> 		return -EINVAL;
> 

Above is completely fine, as this code only executes after the narrower
type bounds check.

>> I'd rather suggest to add a build time assert instead, as the existing
>> runtime check is sufficient (with my u32->ulong change). Something like
>> this:
>>
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -4200,12 +4200,13 @@ static void kvm_create_vcpu_debugfs(struct
>> kvm_vcpu *vcpu)
>>  /*
>>   * Creates some virtual cpus.  Good luck creating more than one.
>>   */
>> -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>> +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id)
>>  {
>>         int r;
>>         struct kvm_vcpu *vcpu;
>>         struct page *page;
>>
>> +       BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX);
> 
> This should be UINT_MAX, no?

No, I chose INT_MAX very intentional, as the underlying type of
'vcpu_id' is actually an int. Trying to store a value that's greater
than INT_MAX (but smaller than UINT_MAX) will make it negative and
that's definitely something we need to prevent as otherwise array
indexed accesses like in the IOAPIC code would try to access memory
before the allocated storage. Not good!

>                               Regardless, the "need" for an explicit BUILD_BUG_ON()
> is another reason I dislike relying on the KVM_MAX_VCPU_IDS check to detect
> truncation.

There's no "need" for the BUILD_BUG_ON(). It's just a cheap (compile
time only, no runtime "overhead") assert that the code won't allow
truncated values which may lead to follow-up bugs because of unintended
truncation. And, after all, you suggested something like that (a
truncation check) yourself. I just tried to provide it as something that
doesn't need the odd '__id' argument and an explicit truncation check
which would do the wrong thing if we would like to push KVM_MAX_VCPU_IDS
above UINT_MAX (failing only at runtime, not at compile time).

>              If @id is checked as a 32-bit value, and we somehow screw up and
> define KVM_MAX_VCPU_IDS to be a 64-bit value, clang will rightly complain that
> the check is useless, e.g. given "#define KVM_MAX_VCPU_ID_TEST	BIT(32)"
> 
> arch/x86/kvm/x86.c:12171:9: error: result of comparison of constant 4294967296 with
> expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
>         if (id > KVM_MAX_VCPU_ID_TEST)
>             ~~ ^ ~~~~~~~~~~~~~~~~~~~~
> 1 error generated.
  ^^^^^^^^^^^^^^^^^^
Perfect! So this breaks the build. How much better can we prevent this
bug from going unnoticed?

Same for the BUILD_BUG_ON(). I don't see how it is somehow hiding things
or making future changes silently fail. They won't. Neither will the x86
specific code compile, nor will the BUILD_BUG_ON() in
kvm_vm_ioctl_create_vcpu(). So what you're trying to say?...

> 
> 
>>         if (id >= KVM_MAX_VCPU_IDS)
>>                 return -EINVAL;
> 
> What if we do an explicit check before calling kvm_vm_ioctl_create_vcpu()?  That
> would avoid the weird __id param, and provide a convenient location to document
> exactly why KVM checks for truncation.

My version has no "weird __id param" ;) And, honestly, the current u32
argument type makes no sense either. It's neither the final type used
for 'vcpu_id' nor does it have to be explicitly 32 bits. So, IMHO, the
argument type should be fixed in any case and using the type that
preserves the UABI value just seems like a natural fit.

kvm_vm_ioctl() is "only" the command multiplexer, deferring concrete
implementation to the individual functions. It currently does no input
value sanity checks itself, beside some user copy error checks. So why
should KVM_CREATE_VCPU be special and have its truncation check be
outside of the handler function? It just scatters the code and hurts
readability, IMHO.

> 
> We could also move the "if (id >= KVM_MAX_VCPU_IDS)" check to kvm_vm_ioctl(),
> but I don't love that, because again IMO it makes the code less readable overall,
> loses clang's tuautological constant check, and the cost of the extra check against
> UINT_MAX is completely negligible.  
> 
> Though if I had to choose, I'd prefer moving the check to kvm_vm_ioctl() over
> taking an "unsigned long" in kvm_vm_ioctl_create_vcpu().

Looks like we disagree again. I'd always choose the ulong argument over
scattering the checks over multiple functions and possibly missing it if
a new call site appears.

> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 4965196cad58..8155146b16cd 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5083,6 +5083,13 @@ static long kvm_vm_ioctl(struct file *filp,
>                 return -EIO;
>         switch (ioctl) {
>         case KVM_CREATE_VCPU:
> +               /*
> +                * KVM tracks vCPU ID as a 32-bit value, be kind to userspace
> +                * and reject too-large values instead of silently truncating.
> +                */
> +               if (arg > UINT_MAX)
> +                       return -EINVAL;

Unfortunately, that check is a tautology for 32 bit systems and wrong
for reasons I explained in the beginning of my Email. But we can move
the comment to kvm_vm_ioctl_create_vcpu() and explain the rational over
there (with taking an 'unsigned long id' argument):

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 14841acb8b95..b04e87f6568f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4200,12 +4200,20 @@ static void kvm_create_vcpu_debugfs(struct
kvm_vcpu *vcpu)
 /*
  * Creates some virtual cpus.  Good luck creating more than one.
  */
-static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
+static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id)
 {
    int r;
    struct kvm_vcpu *vcpu;
    struct page *page;

+   /*
+    * KVM tracks vCPU IDs as 'int', be kind to userspace and reject
+    * too-large values instead of silently truncating.
+    *
+    * Also ensure we're not breaking this assumption by accidentally
+    * pushing KVM_MAX_VCPU_IDS above INT_MAX.
+    */
+   BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX);
    if (id >= KVM_MAX_VCPU_IDS)
        return -EINVAL;

> +
>                 r = kvm_vm_ioctl_create_vcpu(kvm, arg);
>                 break;
>         case KVM_ENABLE_CAP: {

Cheers,
Mathias

¹ IMHO, using 'int' for vcpu_id is actually *very* *wrong*, as it's used
as an index in certain constructs and having a signed type doesn't feel
right at all. But that's just a side matter, as, according to the checks
on the ioctl() path, the actual value of vcpu_id can never be negative.
So lets not distract.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux