On Thu, Oct 03, 2013 at 04:29:52PM +0200, Greg Kurz wrote:
> Hi,
>
> There has been some work on the topic lately but no agreement has
> been reached yet. I want to consolidate the facts in a single thread of
> mail and re-start the discussion. Please find below a recap of what we
> have as of today:
>
> From a virtio POV, guest endianness is reflected by the endianness of
> the interrupt vectors (ILE bit in the LPCR register). The guest kernel
> relies on the H_SET_MODE_RESOURCE_LE hcall to set this bit, early in the
> boot process.
>
> Rusty sent a patchset on qemu-devel@ to provide the necessary bits to
> perform byteswap in QEMU:
>
> http://patchwork.ozlabs.org/patch/266451/
> http://patchwork.ozlabs.org/patch/266452/
> http://patchwork.ozlabs.org/patch/266450/
> (plus other enablement patches for virtio drivers, not essential for
> the discussion).
>
> In non-KVM mode, QEMU implements H_SET_MODE_RESOURCE_LE and updates
> its internal value for LPCR when the guest requests it. Rusty's patchset
> works out-of-the-box in this mode: I could successfully set up and use a
> 9p share over virtio transport (broader virtio testing still to be done
> though).
>
> When using KVM, the story is different: QEMU is no longer on this
> endianness change flow, provided KVM has the following patch from
> Anton:
>
> http://patchwork.ozlabs.org/patch/277079/
>
> There are *at least* two approaches to bring back endianness knowledge
> to QEMU: polling (1) and propagation (2).
>
> (1) QEMU must retrieve LPCR from the kernel using the following API:
>
> http://patchwork.ozlabs.org/patch/273029/
>
> (2) KVM can resume execution to the host and thus propagate
> H_SET_MODE_RESOURCE_LE to QEMU. Laurent came up with a patch on
> linuxppc-dev@ to do this:
>
> http://patchwork.ozlabs.org/patch/278590/
>
> I would say (1) is a standard and sane way of addressing the issue:
> since the LPCR register value is held by KVM, it makes sense to
> introduce an API to get/set it.
> Then, it is up to QEMU to use this API.
>
> We can dumbly do the polling in all the places where byteswapping
> matters: it is clearly suboptimal, especially since the LPCR_ILE bit
> doesn't change so often. Rusty suggested we can retrieve it at virtio
> device reset time and cache it, since an endianness change after the
> devices have started to be used is nonsensical.
>
> I have searched for an appropriate place to add the polling and I must
> admit I did not find any... I am no QEMU expert but I suspect we would
> need some kind of arch-specific hook to be called from the virtio code
> to do this... :-\ I hope I am wrong, please correct me if so.
>
> On the other hand, (2) looks a bit hacky: KVM usually returns to the
> host when it cannot fully handle the hcall. Propagating may look like
> a useless path to follow from a KVM POV. From a QEMU POV, things are
> different: propagation will trigger the fallback code in QEMU, already
> working in non-KVM mode. Nothing more to be done.

I don't particularly mind whether H_SET_MODE for the endianness
setting gets handled in the kernel or in QEMU, but I don't think it
should be handled in both. If you want QEMU to know about the
endianness setting immediately, make the kernel version do nothing and
get QEMU to handle it, which, if KVM is enabled, will mean iterating
over all vcpus and getting each of them to send the new LPCR setting
to the kernel via the SET_ONE_REG ioctl.

However, I want the setting of breakpoint registers (CIABR and DAWR/X)
via H_SET_MODE to happen in the kernel, preferably in real mode, since
that can happen on context switch and thus needs to be quick.

Regards,
Paul.
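
To make the "iterate over all vcpus and send the new LPCR via SET_ONE_REG" idea concrete, here is a rough sketch of what the QEMU-side loop could look like. It is not taken from any posted patch: `struct vcpu`, `set_lpcr_fn`, `set_interrupt_endianness()` and `stub_push()` are hypothetical names made up for illustration. In a real KVM build the callback would presumably wrap `ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg)` with a `struct kvm_one_reg` naming the LPCR register; here it is a function pointer so the sketch compiles without the powerpc KVM headers. A NULL callback stands for the non-KVM case, where only the emulator's own copy is updated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ILE bit of the LPCR: bit 38 in IBM numbering, i.e. 1 << 25. */
#define LPCR_ILE (1ULL << 25)

/* Hypothetical per-vcpu state kept by the emulator. */
struct vcpu {
    int fd;         /* would be the KVM vcpu file descriptor */
    uint64_t lpcr;  /* emulator-side copy of LPCR */
};

/*
 * Hypothetical callback that pushes one LPCR value to the kernel.
 * Assumption: with KVM this would issue KVM_SET_ONE_REG on vcpu_fd.
 * Returns 0 on success, non-zero on failure.
 */
typedef int (*set_lpcr_fn)(int vcpu_fd, uint64_t lpcr);

/*
 * Handle the endianness part of H_SET_MODE in the emulator: update
 * LPCR[ILE] on every vcpu and, if a push callback is given (KVM
 * enabled), send the new value to the kernel for each of them.
 */
int set_interrupt_endianness(struct vcpu *vcpus, size_t n,
                             bool little_endian, set_lpcr_fn push)
{
    assert(vcpus != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        uint64_t lpcr = vcpus[i].lpcr;

        if (little_endian) {
            lpcr |= LPCR_ILE;
        } else {
            lpcr &= ~LPCR_ILE;
        }

        if (push != NULL) {
            int ret = push(vcpus[i].fd, lpcr);
            if (ret != 0) {
                return ret;  /* stop at the first vcpu that fails */
            }
        }
        vcpus[i].lpcr = lpcr;  /* keep the local copy in sync */
    }
    return 0;
}

/* Stand-in for the kernel side, for demonstration only. */
static int pushed_count;
static uint64_t last_pushed;

static int stub_push(int vcpu_fd, uint64_t lpcr)
{
    (void)vcpu_fd;
    pushed_count++;
    last_pushed = lpcr;
    return 0;
}
```

With something along these lines the virtio code could read the emulator's own LPCR copy directly, since QEMU would see every endianness change as it happens.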