Re: [RFC 0/5] Making KVM_GET_ONE_REG/KVM_SET_ONE_REG generic.

Avi Kivity <avi@xxxxxxxxxx> · Thu, 06 Sep 2012 18:16:06 +0300

On 09/06/2012 06:08 PM, Alexander Graf wrote:
> 
> 
> On 06.09.2012, at 10:48, Avi Kivity <avi@xxxxxxxxxx> wrote:
> 
>> On 09/05/2012 09:48 AM, Rusty Russell wrote:
>>> Peter Maydell <peter.maydell@xxxxxxxxxx> writes:
>>>> On 1 September 2012 13:28, Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:
>>>>> Rusty Russell (8):
>>>>>      KVM: ARM: Fix walk_msrs()
>>>>>      KVM: Move KVM_SET_ONE_REG/KVM_GET_ONE_REG to generic code.
>>>>>      KVM: Add KVM_REG_SIZE() helper.
>>>>>      KVM: ARM: use KVM_SET_ONE_REG/KVM_GET_ONE_REG.
>>>>>      KVM: Add KVM_VCPU_GET_REG_LIST.
>>>>>      KVM: ARM: Use KVM_VCPU_GET_REG_LIST.
>>>>>      KVM: ARM: Access all registers via KVM_GET_ONE_REG/KVM_SET_ONE_REG.
>>>>>      KVM ARM: Update api.txt
>>>> 
>>>> So I was thinking about this, and I remembered that the SET_ONE_REG/
>>>> GET_ONE_REG API has userspace pass a pointer to the variable the
>>>> kernel should read/write (unlike the _MSR x86 ioctls, where the
>>>> actual data value is sent back and forth in the struct). Further,
>>>> the kernel only writes a data value of the size of the register
>>>> (rather than always reading/writing a uint64_t).
>>>> 
>>>> This is a problem because it means userspace needs to know the
>>>> size of each register, and the kernel doesn't provide any way
>>>> to determine the size. This defeats the idea that userspace should
>>>> be able to migrate kernel register state without having to know
>>>> the semantics of all the registers involved.
>>> 
>>> It's there.  There are bits in the id which indicate the size:
>>> 
>>> #define KVM_REG_SIZE_SHIFT    52
>>> #define KVM_REG_SIZE_MASK    0x00f0000000000000ULL
>>> #define KVM_REG_SIZE_U8        0x0000000000000000ULL
>>> #define KVM_REG_SIZE_U16    0x0010000000000000ULL
>>> #define KVM_REG_SIZE_U32    0x0020000000000000ULL
>>> #define KVM_REG_SIZE_U64    0x0030000000000000ULL
>>> #define KVM_REG_SIZE_U128    0x0040000000000000ULL
>>> #define KVM_REG_SIZE_U256    0x0050000000000000ULL
>>> #define KVM_REG_SIZE_U512    0x0060000000000000ULL
>>> #define KVM_REG_SIZE_U1024    0x0070000000000000ULL
>>> 
>> 
>> Assumes power-of-two registers.  On x86 IDTR is 10 bytes long (2 byte
>> limit, 8 byte address).  We could split it into two registers, or add
>> padding, but it's unnatural.

(and the APIC, if treated as one large register, is 4k)
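
To make the power-of-two point concrete, here is a minimal sketch of how the size field quoted above decodes. The two constants are copied from the quoted defines; reg_size_bytes() and size_is_pow2() are hypothetical helper names used only for illustration, not taken from the patch series.

```c
#include <stdint.h>

/* Size-field constants, copied from the defines quoted in this thread. */
#define KVM_REG_SIZE_SHIFT 52
#define KVM_REG_SIZE_MASK  0x00f0000000000000ULL

/* Hypothetical helper: byte size encoded in a register id.
 * The field holds log2(size in bytes): U8 -> 0, U16 -> 1, U64 -> 3, ... */
static inline uint64_t reg_size_bytes(uint64_t id)
{
    return 1ULL << ((id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT);
}

/* Hypothetical helper: nonzero if `bytes` is a power of two, i.e. if a
 * log2 encoding could describe it exactly. */
static inline int size_is_pow2(uint64_t bytes)
{
    return bytes != 0 && (bytes & (bytes - 1)) == 0;
}
```

Under this encoding a 10-byte IDTR has no exact representation, since size_is_pow2(10) is 0, and the largest constant in the quoted list (KVM_REG_SIZE_U1024) decodes to 128 bytes.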

> 
> Why is padding bad?

Where does it go?  Between the 2-byte and the 8-byte element?  After
the 10 bytes?

It means that users must either include the padding in their internal
data structures, or copy to a temporary.
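
A rough sketch of the "copy to a temporary" case, assuming a hypothetical 16-byte padded layout with the limit at offset 0 and the base at offset 8 (one of the possible answers to the question above, not something the proposed API defines). struct my_idtr, PADDED_IDTR_SIZE and both functions are made-up names for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical internal representation chosen by userspace; its layout
 * need not match the padded register buffer. */
struct my_idtr {
    uint16_t limit;
    uint64_t base;
};

/* Assumed padded layout: 2-byte limit, 6 bytes of padding, 8-byte base. */
#define PADDED_IDTR_SIZE 16

/* Pack the internal fields into a temporary padded buffer. */
static void idtr_to_padded(const struct my_idtr *in,
                           uint8_t out[PADDED_IDTR_SIZE])
{
    memset(out, 0, PADDED_IDTR_SIZE);            /* zero the padding */
    memcpy(out, &in->limit, sizeof(in->limit));  /* bytes 0..1  */
    memcpy(out + 8, &in->base, sizeof(in->base)); /* bytes 8..15 */
}

/* Unpack a padded buffer back into the internal representation. */
static void idtr_from_padded(const uint8_t in[PADDED_IDTR_SIZE],
                             struct my_idtr *out)
{
    memcpy(&out->limit, in, sizeof(out->limit));
    memcpy(&out->base, in + 8, sizeof(out->base));
}
```

Every caller that does not store the padding itself ends up with a pair of helpers like these around each get/set.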

> How do you model IDTR throughout the stack today? 

struct kvm_dtable {
	__u64 base;
	__u16 limit;
	__u16 padding[3];
};

:p

Internally, it's held in hardware registers.
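
For reference, a small sketch of the layout that struct implies. dtable_mirror is a hypothetical userspace mirror of struct kvm_dtable that uses <stdint.h> types in place of __u64/__u16 so that it builds standalone; the checks assume a C11 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct kvm_dtable with fixed-width stdint types. */
struct dtable_mirror {
    uint64_t base;       /* 8-byte address comes first   */
    uint16_t limit;      /* 2-byte limit follows         */
    uint16_t padding[3]; /* 6 bytes of trailing padding  */
};

/* 10 bytes of payload, rounded up to 16 by the explicit padding. */
_Static_assert(offsetof(struct dtable_mirror, base) == 0, "base at offset 0");
_Static_assert(offsetof(struct dtable_mirror, limit) == 8, "limit at offset 8");
_Static_assert(sizeof(struct dtable_mirror) == 16, "padded to 16 bytes");
```

So today's answer is "base first, padding after the 10 bytes", which happens to give a power-of-two size.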

> How does QEMU's savevm serialize it?

Two separate fields (actually four, of which two are ignored).

-- 
error compiling committee.c: too many arguments to function