On 20.02.23 11:40, Mathias Krause wrote:
> On 14.02.23 11:33, Greg Kroah-Hartman wrote:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index da4bbd043a7b..50a95c8082fa 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5254,12 +5254,11 @@ static void kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu,
>> {
>> unsigned long val;
>>
>> + memset(dbgregs, 0, sizeof(*dbgregs));
>> memcpy(dbgregs->db, vcpu->arch.db, sizeof(vcpu->arch.db));
>> kvm_get_dr(vcpu, 6, &val);
>> dbgregs->dr6 = val;
>> dbgregs->dr7 = vcpu->arch.dr7;
>> - dbgregs->flags = 0;
>> - memset(&dbgregs->reserved, 0, sizeof(dbgregs->reserved));
>> }
>>
>> static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
>
> While this change handles the info leak for 32 bit kernels just fine, it
> completely ignores that the ABI is broken for such kernels. The bug
> (existing since the introduction of the API) effectively makes using
> DR1..3 impossible. The memcpy() will only copy half of dbgregs->db and
> effectively only allows setting DR0 to its intended value. The remaining
> registers get shuffled around (lower half of db[1] will end up in DR2,
> not DR1) or completely ignored (db[2..3] which should end up in DR3 and
> DR4). Now, this broken ABI might be considdered "API," so I gave it a
> look...
>
> A Debian code search gave only three real users of these ioctl()s:
> - VirtualBox ([1], lines 1735 ff.),
> - QEMU ([2], in kvm_put_debugregs(): lines 4491 ff. and
> kvm_get_debugregs(): lines 4515 ff.) and
> - Linux's KVM selftests ([3], lines 722 ff., used in vcpu_load_state()
> and vcpu_save_state()).
>
> Linux's selftest uses the API only to read and bounce back the state --
> doesn't do any sanity checks on it.
>
> VirtualBox and QEMU, OTOH, assume that the array is properly filled,
> i.e. indices 0..3 map to DR0..3. This means, these users are currently
> (and *always* have been) broken when trying to set DR1..3. Time to get
> them fixed before x86-32 vanishes into irrelevance.
>
> [1] https://www.virtualbox.org/browser/vbox/trunk/src/VBox/VMM/VMMR3/NEMR3Native-linux.cpp?rev=98193#L1735
> [2] https://gitlab.com/qemu-project/qemu/-/blob/v7.2.0/target/i386/kvm/kvm.c#L4480-4522
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/kvm/include/x86_64/processor.h?h=v6.2#n722
>
> An ABI-breaking^Wfixing change like below might be worth to apply on top
> to get that long standing bug fixed:
>
> -- >8 --
> Subject: [PATCH] KVM: x86: Fix broken debugregs ABI for 32 bit kernels
>
> The ioctl()s to get and set KVM's debug registers are broken for 32 bit
> kernels as they'd only copy half of the user register state because of
> the UAPI and in-kernel type mismatch (__u64 vs. unsigned long; 8 vs. 4
> bytes).
>
> This makes it impossible for userland to set anything but DR0 without
> resorting to bit folding tricks.
>
> Switch to a loop for copying debug registers that'll implicitly do the
> type conversion for us, if needed.
>
> This ABI breaking change actually fixes known users [1,2] that have been
> broken since the API's introduction in commit a1efbe77c1fd ("KVM: x86:
> Add support for saving&restoring debug registers").
>
> Also take 'dr6' from the arch part directly, as we do for 'dr7'. There's
> no need to take the clunky route via kvm_get_dr().
>
> [1] https://www.virtualbox.org/browser/vbox/trunk/src/VBox/VMM/VMMR3/NEMR3Native-linux.cpp?rev=98193#L1735
> [2] https://gitlab.com/qemu-project/qemu/-/blob/v7.2.0/target/i386/kvm/kvm.c#L4480-4522
>
> Fixes: a1efbe77c1fd ("KVM: x86: Add support for saving&restoring debug registers")
> Cc: stable <stable@xxxxxxxxxx> # needs 2c10b61421a2
> Signed-off-by: Mathias Krause <minipli@xxxxxxxxxxxxxx>
> ---
> arch/x86/kvm/x86.c | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a2c299d47e69..db3967de7958 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5261,18 +5261,23 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> static void kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu,
> struct kvm_debugregs *dbgregs)
> {
> - unsigned long val;
> + unsigned int i;
>
> memset(dbgregs, 0, sizeof(*dbgregs));
> - memcpy(dbgregs->db, vcpu->arch.db, sizeof(vcpu->arch.db));
> - kvm_get_dr(vcpu, 6, &val);
> - dbgregs->dr6 = val;
> +
> + BUILD_BUG_ON(ARRAY_SIZE(vcpu->arch.db) != ARRAY_SIZE(dbgregs->db));
> + for (i = 0; i < ARRAY_SIZE(vcpu->arch.db); i++)
> + dbgregs->db[i] = vcpu->arch.db[i];
> +
> + dbgregs->dr6 = vcpu->arch.dr6;
> dbgregs->dr7 = vcpu->arch.dr7;
> }
>
> static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
> struct kvm_debugregs *dbgregs)
> {
> + unsigned int i;
> +
> if (dbgregs->flags)
> return -EINVAL;
>
> @@ -5281,7 +5286,9 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
> if (!kvm_dr7_valid(dbgregs->dr7))
> return -EINVAL;
>
> - memcpy(vcpu->arch.db, dbgregs->db, sizeof(vcpu->arch.db));
> + for (i = 0; i < ARRAY_SIZE(vcpu->arch.db); i++)
> + vcpu->arch.db[i] = dbgregs->db[i];
> +
> kvm_update_dr0123(vcpu);
> vcpu->arch.dr6 = dbgregs->dr6;
> vcpu->arch.dr7 = dbgregs->dr7;
Ping? No interest in fixing this for i386?
I've attached a PoC showing the bug. It needs to be run on a 32-bit x86
kernel to actually trigger it. It does, however, only show half of the
bug, as the in-VM state differs from KVM's state, too. However, I was
too lazy to implement a full guest code execution setup (and the in-tree
selftests don't support i386), so you have to believe me or try to
follow my argumentation along with KVM's code.
Thanks,
Mathias
/*
* Test for KVM's KVM_{GET,SET}_DEBUGREGS ioctl() being broken for i386.
* It only copies half of the registers and interleaves them even.
*
* This test only shows half of the bug, as there's no way to retrieve the
* in-VM effective debug registers without actually running code in the VM
* itself (I was too lazy to implement :P). But if you would do, you'd see
* that on the first iteration already, where we want to set DR0 only, we
* actually do set DR1 too -- DR0 to 0xc0ffee0, DR1 to 0xbadc0de0. For the
* second iteration we fail to set DR1 to 0xc0ffee1 but instead set DR2 to
* 0xc0ffee1 and DR3 to 0xbadc0de1.
*
* Compile as:
* $ gcc -o dr_kvm dr_kvm.c
*
* Run as:
* $ ./dr_kvm
*
* If you're runnining a buggy 32-bit x86 kernel, the output will look like
* this:
* $ ./dr_kvm
* dr_kvm: unexpected value of DR0 (0xbadc0de00c0ffee0, should be 0xc0ffee0) after setting DR0
* dr_kvm: unexpected value of DR1 (0xbadc0de10c0ffee1, should be 0xc0ffee1) after setting DR1
* dr_kvm: unexpected value of DR2 (0x0, should be 0xc0ffee2) after setting DR2
* dr_kvm: unexpected value of DR3 (0x0, should be 0xc0ffee3) after setting DR3
*
* On a 64 bit kernel or a kernel shiping the fix[1], no output should be
* generated.
*
* [1] https://lore.kernel.org/kvm/20230220104050.419438-1-minipli@xxxxxxxxxxxxxx/
*
* - minipli
*/
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <err.h>
int main(void) {
static const char dev_kvm[] = "/dev/kvm";
struct kvm_debugregs dbg_regs;
int kvm, vm, cpu;
int res = 0;
int i, j;
kvm = open(dev_kvm, O_RDWR);
if (kvm < 0)
err(1, "open(%s)", dev_kvm);
vm = ioctl(kvm, KVM_CREATE_VM, NULL);
if (vm < 0)
err(1, "ioctl(KVM_CREATE_VM)");
if (ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_DEBUGREGS) == 0)
errx(1, "KVM_CAP_DEBUGREGS not supported!");
cpu = ioctl(vm, KVM_CREATE_VCPU, 0);
if (vm < 0)
err(1, "ioctl(KVM_CREATE_VCPU)");
for (i = 0; i < 4; i++) {
unsigned long db_val;
memset(&dbg_regs, 0, sizeof(dbg_regs));
dbg_regs.db[i] = 0xbadc0de0 | i;
dbg_regs.db[i] <<= 32;
dbg_regs.db[i] |= 0xc0ffee0 | i;
db_val = dbg_regs.db[i];
if (ioctl(cpu, KVM_SET_DEBUGREGS, &dbg_regs) != 0)
err(1, "ioctl(KVM_SET_DEBUGREGS)");
memset(&dbg_regs, 0, sizeof(dbg_regs));
if (ioctl(cpu, KVM_GET_DEBUGREGS, &dbg_regs) != 0)
err(1, "ioctl(KVM_GET_DEBUGREGS)");
for (j = 0; j < 4; j++) {
unsigned long expect = j == i ? db_val : 0;
if (dbg_regs.db[j] != expect) {
warnx("unexpected value of DR%d (0x%llx, should be 0x%lx) after setting DR%d",
j, dbg_regs.db[j], expect, i);
res++;
}
}
}
close(kvm);
close(cpu);
close(vm);
return res;
}