On Sat, Jan 25, 2014 at 03:15:35AM +0100, Alexander Graf wrote:
>
> On 25.01.2014, at 02:58, Scott Wood <scottwood@xxxxxxxxxxxxx> wrote:
>
> > On Sat, 2014-01-25 at 00:24 +0000, Peter Maydell wrote:
> >> On 24 January 2014 23:51, Scott Wood <scottwood@xxxxxxxxxxxxx> wrote:
> >>> On Fri, 2014-01-24 at 15:39 -0800, Christoffer Dall wrote:
> >>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >>>> index 366bf4b..6dbd68c 100644
> >>>> --- a/Documentation/virtual/kvm/api.txt
> >>>> +++ b/Documentation/virtual/kvm/api.txt
> >>>> @@ -2565,6 +2565,11 @@ executed a memory-mapped I/O instruction which could not be satisfied
> >>>> by kvm. The 'data' member contains the written data if 'is_write' is
> >>>> true, and should be filled by application code otherwise.
> >>>>
> >>>> +The 'data' member byte order is host kernel native endianness, regardless of
> >>>> +the endianness of the guest, and represents the the value as it would go on the
> >>>> +bus in real hardware. The host user space should always be able to do:
> >>>> +<type> val = *((<type> *)mmio.data).
> >>>
> >>> Host userspace should be able to do that with what results? It would
> >>> only produce a directly usable value if host endianness is the same as
> >>> the emulated device's endianness.
> >>
> >> With the result that it gets the value the CPU has sent out on
> >> the bus as the memory transaction.
> >
> > Doesn't that assume the host kernel endianness is the same as the bus
> > (or rather, that the host CPU would not swap such an access before it
> > hits the bus)?
> >
> > If you take the same hardware and boot a little endian host kernel one
> > day, and a big endian host kernel the next, the bus doesn't change, and
> > neither should the bytewise (assuming address invariance) contents of
> > data[]. How data[] would look when read as a larger integer would of
> > course change -- but that's due to how you're reading it.
> >
> > It's clear to say that a value in memory has been stored there in host
> > endianness when the value is as you would want to see it in a CPU
> > register, but it's less clear when you talk about it relative to values
> > on a bus. It's harder to correlate that to something that is software
> > visible.
> >
> > I don't think there's any actual technical difference between your
> > wording and mine when each wording is properly interpreted, but I
> > suspect my wording is less likely to be misinterpreted (I could be
> > wrong).
> >
> >> Obviously if what userspace
> >> is emulating is a bus which has a byteswapping bridge or if it's
> >> being helpful to device emulation by providing "here's the value
> >> even though you think you're wired up backwards" then it needs
> >> to byteswap.
> >
> > Whether the emulated bus has "a byteswapping bridge" doesn't sound like
> > something that depends on the endianness that the host CPU is currently
> > running in.
> >
> >>> How about a wording like this:
> >>>
> >>> The 'data' member contains, in its first 'len' bytes, the value as it
> >>> would appear if the guest had accessed memory rather than I/O.
> >>
> >> I think this is confusing, because now userspace authors have
> >> to figure out how to get back to "value X of size Y at address Z"
> >> by interpreting this text... Can you write out the equivalent of
> >> Christoffer's text "here's how you get the memory transaction
> >> value" for what you want?
> >
> > Userspace swaps the value if and only if userspace's endianness differs
> > from the endianness with which the device interprets the data
> > (regardless of whether said interpretation is considered natural or
> > swapped relative to the way the bus is documented). It's similar to how
> > userspace would handle emulating DMA.
> >
> > KVM swaps the value if and only if the endianness of the guest access
> > differs from that of the host, i.e. if it would have done swapping when
> > emulating an ordinary memory access.
> >
> >> (Also, value as it would appear to who?)
> >
> > As it would appear to anyone. It works because data[] actually is
> > memory. Any difference in how data appears based on the reader's
> > context would already be reflected when the reader performs the load.
> >
> >> I think your wording implies that the order of bytes in data[] depend
> >> on the guest CPU "usual byte order", ie the order which the CPU
> >> does not do a byte-lane-swap for (LE for ARM, BE for PPC),
> >> and it would mean it would come out differently from
> >> my/Alex/Christoffer's proposal if the host kernel was the opposite
> >> endianness from that "usual" order.
> >
> > It doesn't depend on "usual" anything. The only thing it implicitly
> > says about guest byte order is that it's KVM's job to implement any
> > swapping if the endianness of the guest access is different from the
> > endianness of the host kernel access (whether it's due to the guest's
> > mode, the way a page is mapped, the instruction used, etc).
> >
> >> Finally, I think it's a bit confusing in that "as if the guest had
> >> accessed memory" is assigning implicit semantics to memory
> >> in the emulated system, when memory is actually kind of outside
> >> KVM's purview because it's not part of the CPU.
> >
> > That's sort of the point. It defines it in a way that is independent of
> > the CPU, and thus independent of what endianness the CPU operates in.
>
> Ok, let's go through the combinations for a 32-bit write of 0x01020304 on PPC and what data[] looks like
>
> your proposal:
>
>   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>   BE guest, LE host: { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, LE host: { 0x04, 0x03, 0x02, 0x01 }
>
> -> ldw_p() will give us the correct value to work with
>
> current proposal:
>
>   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>   BE guest, LE host: { 0x04, 0x03, 0x02, 0x01 }
>   LE guest, LE host: { 0x01, 0x02, 0x03, 0x04 }
>
> -> *(uint32_t*)data will give us the correct value to work with
>
>
> There are pros and cons for both approaches.
>
> Pro approach 1 is that it fits the way data[] is read today, so no QEMU changes are required. However, it means that user space needs to have awareness of the "default endianness".
> With approach 2 you don't care about endianness at all anymore - you just get a payload that the host process can read in.
>
> Obviously both approaches would work as long as they're properly defined :).
>

Just to clarify, with approach 2 existing supported QEMU configurations
of BE/LE on both ARM and PPC still work - it is only for future mixed
endian support we need to modify QEMU, right?

-Christoffer
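
To make the two tables above easier to check, here is a small C sketch
that reproduces them for the PPC case. It is only a model of the two
wordings as tabulated, not KVM or QEMU code: the names (enum endian,
store32, load32, fill_approach1, fill_approach2, check_tables) are all
made up for illustration, and it assumes that the "bus" value for a
little endian PPC guest is the byte-swapped register value, which is how
the second table reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; this models the tables, not KVM itself. */
enum endian { ENDIAN_LE, ENDIAN_BE };

/* Store a 32-bit value into out[0..3] in the given byte order. */
static void store32(uint8_t out[4], uint32_t val, enum endian e)
{
    for (int i = 0; i < 4; i++) {
        int shift = (e == ENDIAN_BE) ? 8 * (3 - i) : 8 * i;
        out[i] = (uint8_t)(val >> shift);
    }
}

/* Load a 32-bit value from in[0..3], interpreting it in the given order. */
static uint32_t load32(const uint8_t in[4], enum endian e)
{
    uint32_t val = 0;
    for (int i = 0; i < 4; i++) {
        int shift = (e == ENDIAN_BE) ? 8 * (3 - i) : 8 * i;
        val |= (uint32_t)in[i] << shift;
    }
    return val;
}

static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Approach 1 ("your proposal"): data[] looks like guest memory would,
 * so its bytes do not depend on the host's endianness at all. */
static void fill_approach1(uint8_t data[4], uint32_t val,
                           enum endian guest, enum endian host)
{
    (void)host;
    store32(data, val, guest);
}

/* Approach 2 ("current proposal"): data[] holds the bus value in host
 * byte order. Assumption: on PPC the bus value equals the register
 * value for a BE guest and its byte swap for an LE guest. */
static void fill_approach2(uint8_t data[4], uint32_t val,
                           enum endian guest, enum endian host)
{
    uint32_t bus = (guest == ENDIAN_BE) ? val : bswap32(val);
    store32(data, bus, host);
}

/* Walk all four guest/host combinations and check how each approach
 * is meant to be read back. */
static void check_tables(void)
{
    const enum endian all[2] = { ENDIAN_BE, ENDIAN_LE };
    const uint32_t val = 0x01020304u;
    uint8_t data[4];

    for (int g = 0; g < 2; g++) {
        for (int h = 0; h < 2; h++) {
            uint32_t bus = (all[g] == ENDIAN_BE) ? val : bswap32(val);

            /* Approach 1: a load in guest order recovers the register value. */
            fill_approach1(data, val, all[g], all[h]);
            assert(load32(data, all[g]) == val);

            /* Approach 2: a load in host order (the model's stand-in for
             * *(uint32_t *)data) recovers the bus value. */
            fill_approach2(data, val, all[g], all[h]);
            assert(load32(data, all[h]) == bus);
        }
    }
}
```

Calling check_tables() runs through the four rows of each table. The
point it illustrates is the trade-off described above: with approach 1
the reader has to know the guest's byte order, while with approach 2 a
plain host-order load is enough, at the cost of the bytes in data[]
changing with the host's endianness.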