Re: KVM and variable-endianness guest CPUs

On 23 January 2014 12:45, Christoffer Dall <christoffer.dall@xxxxxxxxxx> wrote:
> On Thu, Jan 23, 2014 at 08:25:35AM -0800, Victor Kamensky wrote:
>> On 23 January 2014 07:33, Peter Maydell <peter.maydell@xxxxxxxxxx> wrote:
>> > On 23 January 2014 15:06, Victor Kamensky <victor.kamensky@xxxxxxxxxx> wrote:
>> >> In [1] I wrote
>> >>
>> >> "I don't see why you so attached to desire to describe
>> >> data part of memory transaction as just one of int
>> >> types. If we are talking about bunch of hypothetical
>> >> cases imagine such bus that allow transaction with
>> >> size of 6 bytes. How do you describe such data in
>> >> your ints speak? What endianity you can assign to
>> >> sequence of 6 bytes? While note that description of
>> >> such transaction as set of 6 byte values at address
>> >> $whatever makes perfect sense."
>> >>
>> >> But notice that in your next reply [2] you just dropped it
>> >
>> > Yes. This is because it was one of the places where
>> > I would have just had to repeat "no, I'm afraid you're wrong
>> > about how hardware works". I think in general it's going
>> > to be better if I don't try to reply point by point to this
>> > email; I think you should go back and reread the emails I've
>> > sent. Key points:
>> >  (1) hardware is not doing anything involving arrays
>> >      of bytes
>>
>> Array of bytes or integers is just a way to describe data lines
>> on the bus. Did you look at this document?
>>
>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0290g/ch06s05s01.html
>>
>> A0, A1, ,,, A7 byte values are the same for both LE and BE-8
>> case (first two columns in the table) and they unambiguously
>> describe data bus signals
>>
>
> The point is simple, and Peter has made it over and over:
> Any consumer of a memory operation sees "value, len, address".

and the "endianness" of the operation.

Here is a memory operation:

*(int *) (0x1000) = 0x01020304;

Can you tell what memory at address 0x1000 will look like? You can't:
in LE it will look one way, and in BE it will be byteswapped.
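
As a minimal sketch of that point (not code from this thread; the helper
names store32_le and store32_be are hypothetical), the same 32-bit value
can be written into a byte buffer under each convention, with the buffer
standing in for the memory at 0x1000:

```c
#include <stdint.h>

/* Hypothetical helper: lay out v the way a little-endian store would,
 * least significant byte at the lowest address. */
static void store32_le(uint8_t *mem, uint32_t v)
{
    mem[0] = (uint8_t)(v & 0xff);
    mem[1] = (uint8_t)((v >> 8) & 0xff);
    mem[2] = (uint8_t)((v >> 16) & 0xff);
    mem[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Hypothetical helper: lay out v the way a big-endian store would,
 * most significant byte at the lowest address. */
static void store32_be(uint8_t *mem, uint32_t v)
{
    mem[0] = (uint8_t)((v >> 24) & 0xff);
    mem[1] = (uint8_t)((v >> 16) & 0xff);
    mem[2] = (uint8_t)((v >> 8) & 0xff);
    mem[3] = (uint8_t)(v & 0xff);
}
```

For the value 0x01020304, store32_le leaves the bytes 04 03 02 01 in the
buffer and store32_be leaves 01 02 03 04, so "value, len, address" alone
does not fix the byte layout.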

> This is what KVM_EXIT_MMIO emulates.  So just by knowing the ABI
> definition and having a pointer to the structure you need to be able to
> tell me "value, len, address".
>
>> >  (2) the API between kernel and userspace needs to define
>> >      the semantics of mmio.data, ie how to map between
>> >      "x byte wide transaction with value v" and the array,
>> >      and that is primarily what this conversation is about
>> >  (3) the only choice which is both (a) sensible and (b)
>> >      not breaking existing usage is to say "the array is
>> >      in host-kernel-byte-order"
>> >  (4) PPC CPUs in BE mode and ARM CPUs in BE mode are not
>> >      the same, because in the ARM case it is doing an
>> >      internal-to-CPU byteswap, and in the PPC case it is not
>>
>> That is one of the key disconnects. I'll go find real examples
>> in ARM LE, ARM BE, and PPC BE Linux kernel. Just for
>> everybody sake's here is summary of the disconnect:
>>
>> If we have the same h/w connected to memory bus in ARM
>> and PPC systems and we have the following three pieces
>> of code that work with r0 having same device same
>> register address:
>>
>> 1. ARM LE word write of  0x04030201:
>> setend le
>> mov r1, #0x04030201
>> str r1, [r0]
>>
>> 2. ARM BE word write of 0x01020304:
>> setend be
>> mov r1, #0x01020304
>> str r1, [r0]
>>
>> 3. PPC BE word write of 0x01020304:
>> lis     r1,0x102
>> ori     r1,r1,0x304
>> stw    r1,0(r0)
>>
>> I claim that h/w will see the same data on bus lines in all
>> three cases, and h/w would acts the same in all three
>> cases. Peter says that ARM BE and PPC BE case h/w
>> would act differently.
>>
>> If anyone else can offer opinion on that while I am looking
>> for real examples that would be great.
>>
>
> I really don't think listing all these examples help.

I think Peter is wrong in his understanding of how real
BE PPC kernel drivers work with memory-mapped h/w devices.
Starting from that misunderstanding to suggest how info should
be held in the emulated mmio case is quite strange.

> You need to focus
> on the key points that Peter listed in his previous mail.
>
> I tried in our chat to ask you this questions:
>
> vcpu_data_host_to_guest() is handling a read from an emulated device.
> All the info you have is:
> (1) len of memory access
> (2) mmio.data pointer
> (3) destination register
> (4) host CPU endianness
> (5) guest CPU endianness
>
> Based on this information alone, you need to decide whether you do a
> byteswap or not before loading the hardware register upon returning to
> the guest.
>
> You will find it impossible to answer, because you don't know the layout
> of mmio.data, and that is the thing we are trying to solve.

Actually I am not arguing with the above. I agree that
the meaning of mmio.data should be better clarified.

I propose my clarification: an array of bytes at the
phys_addr address on a BE-8, byte-invariant memory bus.
That unambiguously describes the data bus signals in the
case of a BE-8 memory bus. Please look at

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0290g/ch06s05s01.html

The first two columns are LE and BE-8. If one specifies all the
values for A0, A1, ... A7, that defines all the bus signals.
Note that this is the only endian-agnostic way to describe the
data bus signals. If one tried to describe them as half-words,
words, or a double word, one would need to say what the
endianness of those integers is (the other columns in the
document's table).
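
A small sketch of why the integer description needs an endianness
attached (the helper names load32_le and load32_be are hypothetical and
are not part of KVM or qemu): the same four byte values read back as two
different 32-bit integers depending on which convention is assumed.

```c
#include <stdint.h>

/* Hypothetical helper: interpret four bytes as a little-endian word,
 * i.e. bytes[0] is the least significant byte. */
static uint32_t load32_le(const uint8_t *bytes)
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Hypothetical helper: interpret four bytes as a big-endian word,
 * i.e. bytes[0] is the most significant byte. */
static uint32_t load32_be(const uint8_t *bytes)
{
    return ((uint32_t)bytes[0] << 24)
         | ((uint32_t)bytes[1] << 16)
         | ((uint32_t)bytes[2] << 8)
         | (uint32_t)bytes[3];
}
```

With the bytes 01 02 03 04 at increasing addresses, load32_le gives
0x04030201 and load32_be gives 0x01020304. The byte sequence is the same
in both cases; only the integer reading of it changes.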

Peter claims that I don't understand how the h/w bus works.
I disagree with that. I gave a pointer to a document that describes
how a BE-8, byte-invariant memory bus works. I would
appreciate a pointer to the document, section, and page that
describe Peter's understanding of memory bus operation.

I pointed out that Peter's proposal would have the following issue:
a BE qemu would have to act differently depending on CPU
type while emulating the same device. If Peter's proposal is
accepted, qemu code would have to do something like:

#ifdef WORD_BIGENDIAN
#ifdef __PPC__
   /* do one thing */
#else
   /* do another */
#endif
#endif

The reason is that the same device write will look like this
in mmio.data:

ARM LE mmio.data[] = { 0x01, 0x02, 0x03, 0x04 }
ARM BE mmio.data[] = { 0x04, 0x03, 0x02, 0x01 }
PPC LE mmio.data[] = { 0x04, 0x03, 0x02, 0x01 }
PPC BE mmio.data[] = { 0x01, 0x02, 0x03, 0x04 }

For ARM BE and PPC BE the arrays are different even though
both are just the BE case, so the code would need an '#if ARM'
kind of check.

If you follow my proposal for clarifying the meaning of mmio.data[],
the mmio.data[] array will look the same in all 4 cases, and it
stays compatible with current usage.

If Peter's proposal is adopted, the ARM BE and PPC LE cases
would be penalized with excessive back-and-forth
byteswaps. That can be avoided with my proposal.
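
For reference, a sketch of the kind of 32-bit swap being counted here
(bswap32_sketch is a hypothetical name, not a kernel or qemu function).
Whether and where such swaps actually occur under either proposal is
what this thread is about, and this example does not settle it; it only
shows that two swaps in a row cancel out, which is why a back-and-forth
pair is pure overhead.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 32-bit value. */
static uint32_t bswap32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24)
         | ((v & 0x0000ff00u) << 8)
         | ((v & 0x00ff0000u) >> 8)
         | ((v & 0xff000000u) >> 24);
}
```

Applying bswap32_sketch once to 0x01020304 gives 0x04030201; applying it
a second time returns the original value.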

Thanks,
Victor

> If you cannot reply to this point in less than 50 lines or mention
> anything about devices being LE or BE or come with examples, I am
> probably not going to read your reply, sorry.
>
> -Christoffer