On 06/24/2014 08:41 PM, Alexander Graf wrote: > > On 24.06.14 12:11, Alexey Kardashevskiy wrote: >> On 06/21/2014 09:12 AM, Benjamin Herrenschmidt wrote: >>> On Thu, 2014-06-19 at 21:21 -0600, Alex Williamson wrote: >>> >>>> Working on big endian being an accident may be a matter of perspective >>> :-) >>> >>>> The comment remains that this patch doesn't actually fix anything except >>>> the overhead on big endian systems doing redundant byte swapping and >>>> maybe the philosophy that vfio regions are little endian. >>> Yes, that works by accident because technically VFIO is a transport and >>> thus shouldn't perform any endian swapping of any sort, which remains >>> the responsibility of the end driver which is the only one to know >>> whether a given BAR location is a a register or some streaming data >>> and in the former case whether it's LE or BE (some PCI devices are BE >>> even ! :-) >>> >>> But yes, in the end, it works with the dual "cancelling" swaps and the >>> overhead of those swaps is probably drowned in the noise of the syscall >>> overhead. >>> >>>> I'm still not a fan of iowrite vs iowritebe, there must be something we >>>> can use that doesn't have an implicit swap. >>> Sadly there isn't ... In the old day we didn't even have the "be" >>> variant and readl/writel style accessors still don't have them either >>> for all archs. >>> >>> There is __raw_readl/writel but here the semantics are much more than >>> just "don't swap", they also don't have memory barriers (which means >>> they are essentially useless to most drivers unless those are platform >>> specific drivers which know exactly what they are doing, or in the rare >>> cases such as accessing a framebuffer which we know never have side >>> effects). >>> >>>> Calling it iowrite*_native is also an abuse of the namespace. >>> >>>> Next thing we know some common code >>>> will legitimately use that name. >>> I might make sense to those definitions into a common header. There have >>> been a handful of cases in the past that wanted that sort of "native >>> byte order" MMIOs iirc (though don't ask me for examples, I can't really >>> remember). >>> >>>> If we do need to define an alias >>>> (which I'd like to avoid) it should be something like vfio_iowrite32. >> >> Ping? >> >> We need to make a decision whether to move those xxx_native() helpers >> somewhere (where?) or leave the patch as is (as we figured out that >> iowriteXX functions implement barriers and we cannot just use raw >> accessors) and fix commit log to explain everything. > > Is there actually any difference in generated code with this patch applied > and without? I would hope that iowrite..() is inlined and cancels out the > cpu_to_le..() calls that are also inlined? iowrite32 is a non-inline function so conversions take place so are the others. And sorry but I fail to see why this matters. We are not trying to accelerate things, we are removing redundant operations which confuse people who read the code. -- Alexey -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html