Re: [Qemu-ppc] KVM and variable-endianness guest CPUs

On 23.01.2014, at 05:25, Victor Kamensky <victor.kamensky@xxxxxxxxxx> wrote:

> Hi Alex,
> 
> Sorry, for delayed reply, I was focusing on discussion
> with Peter. Hope you and other folks may get something
> out of it :).
> 
> Please see responses inline
> 
> On 22 January 2014 02:52, Alexander Graf <agraf@xxxxxxx> wrote:
>> 
>> On 22.01.2014, at 08:26, Victor Kamensky <victor.kamensky@xxxxxxxxxx> wrote:
>> 
>>> On 21 January 2014 22:41, Alexander Graf <agraf@xxxxxxx> wrote:
>>>> 
>>>> 
>>>> "Native endian" really is just a shortcut for "target endian"
>>>> which is LE for ARM and BE for PPC. There shouldn't be
>>>> a qemu-system-armeb or qemu-system-ppc64le.
>>> 
>>> I disagree. Fully functional ARM BE system is what we've
>>> been working on for last few months. 'We' is Linaro
>>> Networking Group, Endian subteam and some other guys
>>> in ARM and across community. Why we do that is a bit
>>> beyond of this discussion.
>>> 
>>> ARM BE patches for both V7 and V8 are already in mainline
>>> kernel. But ARM BE KVM host is broken now. It is known
>>> deficiency that I am trying to fix. Please look at [1]. Patches
>>> for V7 BE KVM were proposed and currently under active
>>> discussion. Currently I work on ARM V8 BE KVM changes.
>>> 
>>> So "native endian" in ARM is value of CPSR register E bit.
>>> If it is off native endian is LE, if it is on it is BE.
>>> 
>>> Once and if we agree on ARM BE KVM host changes, the
>>> next step would be patches in qemu one of which introduces
>>> qemu-system-armeb. Please see [2].
>> 
>> I think we're facing an ideology conflict here. Yes, there
>> should be a qemu-system-arm that is BE capable.
> 
> Maybe it is not an ideology conflict but rather a terminology clarity
> issue :). I am not sure what you mean by "qemu-system-arm
> that is BE capable". In qemu build system there is just target
> name 'arm', which is ARM V7 cpu in LE mode, and 'armeb'
> target which is ARM V7 cpu in BE mode. That is true for a lot
> of open source packages. You could check [1] patch that
> introduces armeb target into qemu. Build for
> arm target produces qemu-system-arm executable that is
> marked 'ELF 32-bit LSB executable' and it could run on LE
> traditional ARM Linux. Build for armeb target produces
> qemu-system-armeb executable that is marked 'ELF 32-bit
> MSB executable' that can run on BE ARM Linux. armeb is
> nothing special here, just build option for qemu that should run
> on BE ARM Linux.

But why should it be called armeb then? What actual difference does the model have compared to the qemu-system-arm model?

> 
> Both qemu-system-arm and qemu-system-armeb should
> be BE/LE capable. I.e either of them along with KVM could
> either run LE or BE guest. MarcZ demonstrated that this
> is possible. I've tested both LE and BE guests with
> qemu-system-arm running on traditional LE ARM Linux,
> effectively repeating Marc's setup but with qemu.
> And I did test with my patches both BE and LE guests with
> qemu-system-armeb running on BE ARM Linux.
> 
>> There
>> should also be a qemu-system-ppc64 that is LE capable.
>> But there is no point in changing the "default endianness"
>> for the virtual CPUs that we plug in there. Both CPUs are
>> perfectly capable of running in LE or BE mode, the
>> question is just what we declare the "default".
> 
> I am not sure, what you mean by "default"? Is it initial
> setting of CPSR E bit and 'cp15 c1, c0, 0' EE bit? Yes,
> the way it is currently implemented by committed
> qemu-system-arm, and proposed qemu-system-armeb
> patches, they are both off. I.e. even qemu-system-armeb
> starts running the vcpu in LE mode, for a reason very similar
> to the one described in your next paragraph.
> qemu-system-armeb has a tiny bootloader that starts
> in LE mode and jumps to the kernel; the kernel switches the cpu to
> run in BE mode with 'setend be', and the EE bit is set just
> before the mmu is enabled.

You're proving my point even more. If both targets are LE/BE capable and both targets start execution in LE mode, then why do we need a qemu-system-armeb at all? Just use qemu-system-arm.
> 

>> Think about the PPC bootstrap. We start off with a
>> BE firmware, then boot into the Linux kernel which
>> calls a hypercall to set the LE bit on every interrupt.
> 
> We have very similar situation with BE ARM Linux.
> When we run ARM BE Linux we start with bootloader
> which is LE and then CPU issues 'setend be' very
> soon as it starts executing kernel code, all secondary
> CPUs issue 'setend be' when they go out of reset pen
> or bootmonitor sleep.
> 
>> But there's no reason this little endian kernel
>> couldn't theoretically have big endian user space running
>> with access to emulated device registers.
> 
> I don't want to go there, it is very very messy ...
> 
> ------ Just a side note: ------
> Interestingly, half a year before I joined Linaro, while at Cisco, my
> colleague and I implemented a kernel patch that allowed running
> BE user-space processes as a sort of separate personality on
> top of an LE ARM kernel ... treated as a kind of multi-abi system.
> Effectively we had to do byteswaps on all non-trivial
> system calls and ioctls inside the kernel. We converted
> around 30 system calls and around 10 ioctls. Our target process
> was just using those and it was working, but the patch was
> very intrusive and unnatural. I think in Linaro there was
> some public version of my presentation circulated that
> explained all this mess. I don't want to seriously consider it.
> 
> The only robust mixed mode, as MarcZ demonstrated,
> could be done only on VM boundaries. I.e LE host can
> run BE guest fine. And BE host can run LE guest fine.
> Everything else would be a huge mess. If we want to
> discuss the pros and cons of different mixed modes we need to
> start a separate thread.
> ------ End of side note ------------

Just because we don't do it on Linux doesn't mean some random guest can't do it. What if my RTOS of choice decides it wants to run half of its user space in little and the other half in big endian? What if my guest is actually an AMP system with an LE and a BE OS running side by side?

We shouldn't design virtualization just for the single use case we have in mind.

> 
>> As Peter already pointed out, the actual breakage behind
>> this is that we have a "default endianness" at all. But that's
>> a very difficult thing to resolve and I don't think should be
>> our primary goal. Just live with the fact that we declare
>> ARM little endian in QEMU and swap things
>> accordingly - then everyone's happy.
> 
> I disagree with Peter's point of view as you saw from our
> long thread :). I strongly believe that current mmio.data[]
> describes data on the bus perfectly fine with array of bytes.
> data[0] goes into phys_addr, data[1] goes into phys_addr + 1,
> etc.

mmio.data[] is really just a transport between KVM and QEMU (or kvmtool if you can't work on ARM instruction set simulators). There is no point in overengineering anything here. We should do what's the most natural fit for everything.
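
For reference, the byte-array view described in the quoted text above (data[0] goes to phys_addr, data[1] to phys_addr + 1, and so on) can be sketched as follows. This is only an illustration under stated assumptions: store_le32(), store_be32() and bus_write() are hypothetical names made up for this example and are not KVM or QEMU functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bytes of a 32-bit store as a little-endian
 * access would place them on a byte-invariant bus. */
static void store_le32(uint8_t data[8], uint32_t value)
{
    data[0] = (uint8_t)(value);
    data[1] = (uint8_t)(value >> 8);
    data[2] = (uint8_t)(value >> 16);
    data[3] = (uint8_t)(value >> 24);
}

/* Hypothetical helper: the same store issued as a big-endian access. */
static void store_be32(uint8_t data[8], uint32_t value)
{
    data[0] = (uint8_t)(value >> 24);
    data[1] = (uint8_t)(value >> 16);
    data[2] = (uint8_t)(value >> 8);
    data[3] = (uint8_t)(value);
}

/* Byte-array view: data[i] lands at phys_addr + i, no matter which
 * endianness produced the bytes. */
static void bus_write(uint8_t *mem, size_t mem_size, uint64_t phys_addr,
                      const uint8_t *data, unsigned len)
{
    assert(len <= 8 && phys_addr + len <= mem_size);
    memcpy(mem + phys_addr, data, len);
}
```

In this view the same 32-bit value produces different byte sequences depending on the access endianness, and bus_write() never needs to know which one was used.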

> Please check the "Differences between BE-32 and BE-8 buses"
> section in [2]. In modern ARM CPUs the memory bus is byte invariant (BE-8).
> As far as the byte view of the data lines is concerned, it is the same
> between LE and BE-8, which is why IMHO the array of bytes view is a very
> good choice. PPC and MIPS CPU memory buses are also byte invariant; they
> have always been that way. I don't think we care about BE-32. So
> for all practical purposes, the mmio structure is BE-8 bus emulation,
> where data signals can be defined by an array of bytes. If one
> tried to define it as a set of other, bigger integers,
> one would need an endianness attribute associated with it. If
> such an attribute is implied by default just through the CPU type, then in
> order to work with existing cases it has to be different for different CPU
> types, which means qemu running in the same endianness but
> on different CPU types would act differently when it emulates
> the same device, and that is bad IMHO. So I don't see any
> value in departing from the bytes array view of data on the bus.

The bus comes after QEMU is involved. From a semantic perspective, the KVM ioctl interface sits between the core and the bus. KVM implements the core and some bits of the bus (to emulate in-kernel devices) and QEMU implements the actual bus topology.

What happens on the way from core -> device is bus specific, so QEMU should take care of this. The way the QEMU internal bus representation gets MMIO data from the CPU is by an ( address, value, len ) tuple. So that's the interface we should design for. And that means we need to transfer full "values", not arrays of data.
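
A minimal sketch of what turning the transported bytes into such an ( address, value, len ) tuple could look like, assuming the bytes are interpreted in the byte order of the host the code is compiled for. struct mmio_access and mmio_bytes_to_access() are hypothetical names used only for illustration; this is not QEMU's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tuple type, named here for illustration only. */
struct mmio_access {
    uint64_t addr;
    uint64_t value;
    unsigned len;
};

/* Interpret the first len bytes of data[] as one host-endian integer.
 * Returns 0 on success, -1 if len is not 1, 2, 4 or 8. */
static int mmio_bytes_to_access(uint64_t addr, const uint8_t *data,
                                unsigned len, struct mmio_access *out)
{
    uint16_t v16;
    uint32_t v32;
    uint64_t v64;

    assert(data != NULL && out != NULL);

    switch (len) {
    case 1:
        out->value = data[0];
        break;
    case 2:
        memcpy(&v16, data, sizeof v16);   /* host byte order */
        out->value = v16;
        break;
    case 4:
        memcpy(&v32, data, sizeof v32);
        out->value = v32;
        break;
    case 8:
        memcpy(&v64, data, sizeof v64);
        out->value = v64;
        break;
    default:
        return -1;
    }
    out->addr = addr;
    out->len = len;
    return 0;
}
```

Any swap needed for a particular bus or device would then happen after this step, on the value, rather than on the transport bytes.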

The fact that we have a data array is really just because it was easy to write and access. In reality and for all intents and purposes this is a union of u8, u16, u32, u64 that we can't change to be a union anymore because we need to stay backwards compatible. And as with any normal kernel interface you design that one with the endianness of the kernel ABI.
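
To illustrate the union reading of the data array, here is a small sketch. union mmio_value and both helpers are hypothetical names invented for this example; the real interface only exposes the byte array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union view of the 8 transport bytes. */
union mmio_value {
    uint8_t  data[8];
    uint8_t  u8;
    uint16_t u16;
    uint32_t u32;
    uint64_t u64;
};

/* Detect host byte order by looking at where the low byte ends up. */
static int host_is_little_endian(void)
{
    union mmio_value v;

    v.u64 = 0;
    v.u16 = 1;
    return v.data[0] == 1;
}

/* Store a 32-bit value through the union; the resulting layout of
 * data[] follows the byte order of the host. */
static union mmio_value mmio_value_from_u32(uint32_t x)
{
    union mmio_value v;

    assert(sizeof v == 8);   /* same size as the plain byte array */
    v.u64 = 0;
    v.u32 = x;
    return v;
}
```

With this view, which byte of data[] holds the low byte of a value is decided by the endianness the code is built for, not by the mode the guest CPU happens to be in.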


Alex
