Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

On Sat, Jan 25, 2014 at 03:15:35AM +0100, Alexander Graf wrote:
> 
> On 25.01.2014, at 02:58, Scott Wood <scottwood@xxxxxxxxxxxxx> wrote:
> 
> > On Sat, 2014-01-25 at 00:24 +0000, Peter Maydell wrote:
> >> On 24 January 2014 23:51, Scott Wood <scottwood@xxxxxxxxxxxxx> wrote:
> >>> On Fri, 2014-01-24 at 15:39 -0800, Christoffer Dall wrote:
> >>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >>>> index 366bf4b..6dbd68c 100644
> >>>> --- a/Documentation/virtual/kvm/api.txt
> >>>> +++ b/Documentation/virtual/kvm/api.txt
> >>>> @@ -2565,6 +2565,11 @@ executed a memory-mapped I/O instruction which could not be satisfied
> >>>> by kvm.  The 'data' member contains the written data if 'is_write' is
> >>>> true, and should be filled by application code otherwise.
> >>>> 
> >>>> +The 'data' member byte order is host kernel native endianness, regardless of
> >>>> +the endianness of the guest, and represents the value as it would go on the
> >>>> +bus in real hardware.  The host user space should always be able to do:
> >>>> +<type> val = *((<type> *)mmio.data).
> >>> 
> >>> Host userspace should be able to do that with what results?  It would
> >>> only produce a directly usable value if host endianness is the same as
> >>> the emulated device's endianness.
> >> 
> >> With the result that it gets the value the CPU has sent out on
> >> the bus as the memory transaction.
> > 
> > Doesn't that assume the host kernel endianness is the same as the bus
> > (or rather, that the host CPU would not swap such an access before it
> > hits the bus)?
> > 
> > If you take the same hardware and boot a little endian host kernel one
> > day, and a big endian host kernel the next, the bus doesn't change, and
> > neither should the bytewise (assuming address invariance) contents of
> > data[].  How data[] would look when read as a larger integer would of
> > course change -- but that's due to how you're reading it.
> > 
> > It's clear to say that a value in memory has been stored there in host
> > endianness when the value is as you would want to see it in a CPU
> > register, but it's less clear when you talk about it relative to values
> > on a bus.  It's harder to correlate that to something that is software
> > visible.
> > 
> > I don't think there's any actual technical difference between your
> > wording and mine when each wording is properly interpreted, but I
> > suspect my wording is less likely to be misinterpreted (I could be
> > wrong).
> > 
> >> Obviously if what userspace
> >> is emulating is a bus which has a byteswapping bridge or if it's
> >> being helpful to device emulation by providing "here's the value
> >> even though you think you're wired up backwards" then it needs
> >> to byteswap.
> > 
> > Whether the emulated bus has "a byteswapping bridge" doesn't sound like
> > something that depends on the endianness that the host CPU is currently
> > running in.
> > 
> >>> How about a wording like this:
> >>> 
> >>>  The 'data' member contains, in its first 'len' bytes, the value as it
> >>>  would appear if the guest had accessed memory rather than I/O.
> >> 
> >> I think this is confusing, because now userspace authors have
> >> to figure out how to get back to "value X of size Y at address Z"
> >> by interpreting this text... Can you write out the equivalent of
> >> Christoffer's text "here's how you get the memory transaction
> >> value" for what you want?
> > 
> > Userspace swaps the value if and only if userspace's endianness differs
> > from the endianness with which the device interprets the data
> > (regardless of whether said interpretation is considered natural or
> > swapped relative to the way the bus is documented).  It's similar to how
> > userspace would handle emulating DMA.
> > 
> > KVM swaps the value if and only if the endianness of the guest access
> > differs from that of the host, i.e. if it would have done swapping when
> > emulating an ordinary memory access.
> > 
> >> (Also, value as it would appear to who?)
> > 
> > As it would appear to anyone.  It works because data[] actually is
> > memory.  Any difference in how data appears based on the reader's
> > context would already be reflected when the reader performs the load.
> > 
> >> I think your wording implies that the order of bytes in data[] depends
> >> on the guest CPU "usual byte order", ie the order which the CPU
> >> does not do a byte-lane-swap for (LE for ARM, BE for PPC),
> >> and it would mean it would come out differently from
> >> my/Alex/Christoffer's proposal if the host kernel was the opposite
> >> endianness from that "usual" order.
> > 
> > It doesn't depend on "usual" anything.  The only thing it implicitly
> > says about guest byte order is that it's KVM's job to implement any
> > swapping if the endianness of the guest access is different from the
> > endianness of the host kernel access (whether it's due to the guest's
> > mode, the way a page is mapped, the instruction used, etc).
> > 
> >> Finally, I think it's a bit confusing in that "as if the guest had
> >> accessed memory" is assigning implicit semantics to memory
> >> in the emulated system, when memory is actually kind of outside
> >> KVM's purview because it's not part of the CPU.
> > 
> > That's sort of the point.  It defines it in a way that is independent of
> > the CPU, and thus independent of what endianness the CPU operates in.
> 
> Ok, let's go through the combinations for a 32-bit write of 0x01020304 on PPC and what data[] looks like:
> 
> your proposal:
> 
>   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>   BE guest, LE host: { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, LE host: { 0x04, 0x03, 0x02, 0x01 }
> 
> -> ldw_p() will give us the correct value to work with
> 
> current proposal:
> 
>   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>   BE guest, LE host: { 0x04, 0x03, 0x02, 0x01 }
>   LE guest, LE host: { 0x01, 0x02, 0x03, 0x04 }
> 
> -> *(uint32_t*)data will give us the correct value to work with
> 
> 
> There are pros and cons for both approaches.
> 
> The advantage of approach 1 is that it matches the way data[] is read today, so no QEMU changes are required. However, it means that user space needs to be aware of the "default endianness".
> With approach 2 you no longer care about endianness at all: you just get a payload that the host process can read directly.
> 
> Obviously both approaches would work as long as they're properly defined :).
> 
Just to clarify: with approach 2, the existing supported QEMU configurations
of BE/LE on both ARM and PPC still work, and we only need to modify QEMU for
future mixed-endian support, right?
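
For reference, here is a rough sketch that reproduces the two tables above
for the 32-bit PPC write of 0x01020304.  It is only a model, not KVM or QEMU
code: every function name is made up for illustration, host endianness is
passed in as a parameter so that both hosts can be simulated on one machine,
and the "bus value" rule (a LE guest access shows up byte-reversed) is an
assumption chosen to match the tables.  load_native32() is the memcpy
equivalent of the *(uint32_t *)data read from the patch text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Store v most-significant byte first (big endian). */
static void store_be32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Store v least-significant byte first (little endian). */
static void store_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)v;
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Approach 1: data[] looks like ordinary memory after the guest store,
 * so it depends only on the endianness of the guest access. */
static void fill_as_guest_memory(uint8_t data[4], uint32_t reg, bool guest_be)
{
    if (guest_be)
        store_be32(data, reg);
    else
        store_le32(data, reg);
}

/* Approach 2: data[] holds the bus value in host byte order.
 * Assumption (PPC, per the tables): a LE guest access is byte-reversed
 * on the bus.  host_be simulates the host kernel's endianness. */
static void fill_host_native(uint8_t data[4], uint32_t reg,
                             bool guest_be, bool host_be)
{
    uint32_t bus = guest_be ? reg : swap32(reg);

    if (host_be)
        store_be32(data, bus);
    else
        store_le32(data, bus);
}

/* True if the machine running this code is big endian. */
static bool host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Same result as *(uint32_t *)data, without alignment/aliasing issues. */
static uint32_t load_native32(const uint8_t data[4])
{
    uint32_t v;

    memcpy(&v, data, sizeof(v));
    return v;
}

/* With approach 2, a native load on the real host yields the bus value. */
static void self_check(void)
{
    uint8_t data[4];

    fill_host_native(data, 0x01020304u, true, host_is_big_endian());
    assert(load_native32(data) == 0x01020304u);
    fill_host_native(data, 0x01020304u, false, host_is_big_endian());
    assert(load_native32(data) == 0x04030201u);
}
```

In this model the two approaches fill data[] identically on a BE host and
differ only on a LE host, which is what the tables show for PPC.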

-Christoffer