Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25.01.2014, at 02:58, Scott Wood <scottwood@xxxxxxxxxxxxx> wrote:

> On Sat, 2014-01-25 at 00:24 +0000, Peter Maydell wrote:
>> On 24 January 2014 23:51, Scott Wood <scottwood@xxxxxxxxxxxxx> wrote:
>>> On Fri, 2014-01-24 at 15:39 -0800, Christoffer Dall wrote:
>>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>>>> index 366bf4b..6dbd68c 100644
>>>> --- a/Documentation/virtual/kvm/api.txt
>>>> +++ b/Documentation/virtual/kvm/api.txt
>>>> @@ -2565,6 +2565,11 @@ executed a memory-mapped I/O instruction which could not be satisfied
>>>> by kvm.  The 'data' member contains the written data if 'is_write' is
>>>> true, and should be filled by application code otherwise.
>>>> 
>>>> +The 'data' member byte order is host kernel native endianness, regardless of
>>>> +the endianness of the guest, and represents the the value as it would go on the
>>>> +bus in real hardware.  The host user space should always be able to do:
>>>> +<type> val = *((<type> *)mmio.data).
>>> 
>>> Host userspace should be able to do that with what results?  It would
>>> only produce a directly usable value if host endianness is the same as
>>> the emulated device's endianness.
>> 
>> With the result that it gets the value the CPU has sent out on
>> the bus as the memory transaction.
> 
> Doesn't that assume the host kernel endianness is the same as the bus
> (or rather, that the host CPU would not swap such an access before it
> hits the bus)?
> 
> If you take the same hardware and boot a little endian host kernel one
> day, and a big endian host kernel the next, the bus doesn't change, and
> neither should the bytewise (assuming address invariance) contents of
> data[].  How data[] would look when read as a larger integer would of
> course change -- but that's due to how you're reading it.
> 
> It's clear to say that a value in memory has been stored there in host
> endianness when the value is as you would want to see it in a CPU
> register, but it's less clear when you talk about it relative to values
> on a bus.  It's harder to correlate that to something that is software
> visible.
> 
> I don't think there's any actual technical difference between your
> wording and mine when each wording is properly interpreted, but I
> suspect my wording is less likely to be misinterpreted (I could be
> wrong).
> 
>> Obviously if what userspace
>> is emulating is a bus which has a byteswapping bridge or if it's
>> being helpful to device emulation by providing "here's the value
>> even though you think you're wired up backwards" then it needs
>> to byteswap.
> 
> Whether the emulated bus has "a byteswapping bridge" doesn't sound like
> something that depends on the endianness that the host CPU is currently
> running in.
> 
>>> How about a wording like this:
>>> 
>>>  The 'data' member contains, in its first 'len' bytes, the value as it
>>>  would appear if the guest had accessed memory rather than I/O.
>> 
>> I think this is confusing, because now userspace authors have
>> to figure out how to get back to "value X of size Y at address Z"
>> by interpreting this text... Can you write out the equivalent of
>> Christoffer's text "here's how you get the memory transaction
>> value" for what you want?
> 
> Userspace swaps the value if and only if userspace's endianness differs
> from the endianness with which the device interprets the data
> (regardless of whether said interpretation is considered natural or
> swapped relative to the way the bus is documented).  It's similar to how
> userspace would handle emulating DMA.
> 
> KVM swaps the value if and only if the endianness of the guest access
> differs from that of the host, i.e. if it would have done swapping when
> emulating an ordinary memory access.
> 
>> (Also, value as it would appear to who?)
> 
> As it would appear to anyone.  It works because data[] actually is
> memory.  Any difference in how data appears based on the reader's
> context would already be reflected when the reader performs the load.
> 
>> I think your wording implies that the order of bytes in data[] depend
>> on the guest CPU "usual byte order", ie the order which the CPU
>> does not do a byte-lane-swap for (LE for ARM, BE for PPC),
>> and it would mean it would come out differently from
>> my/Alex/Christoffer's proposal if the host kernel was the opposite
>> endianness from that "usual" order.
> 
> It doesn't depend on "usual" anything.  The only thing it implicitly
> says about guest byte order is that it's KVM's job to implement any
> swapping if the endianness of the guest access is different from the
> endianness of the host kernel access (whether it's due to the guest's
> mode, the way a page is mapped, the instruction used, etc).
> 
>> Finally, I think it's a bit confusing in that "as if the guest had
>> accessed memory" is assigning implicit semantics to memory
>> in the emulated system, when memory is actually kind of outside
>> KVM's purview because it's not part of the CPU.
> 
> That's sort of the point.  It defines it in a way that is independent of
> the CPU, and thus independent of what endianness the CPU operates in.

Ok, let's go through the combinations for a 32-bit write of 0x01020304 on PPC and what data[] looks like

your proposal:

  BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
  LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
  BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
  LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }

-> ldw_p() will give us the correct value to work with

current proposal:

  BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
  LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
  BE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
  LE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }

-> *(uint32_t*)data will give us the correct value to work with


There are pros and cons for both approaches.

Pro approach 1 is that it fits the way data[] is read today, so no QEMU changes are required. However, it means that user space needs to have awareness of the "default endianness".
With approach 2 you don't care about endianness at all anymore - you just get a payload that the host process can read in.

Obviously both approaches would work as long as they're properly defined :).


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux