On Wed, Oct 05, 2016 at 11:33:59AM -0600, Jason Gunthorpe wrote:
> Right, stated differently, the kernel requires that writel()/etc
> always produce the same PCI-E packet on the wire. (eg writel(1)
> produces a TLP with bit 0 of the data payload set)

Exactly.

> Not going through the kernel's writel is the whole problem. The writel
> helper generates the arch-specific instruction sequence required to
> issue the required PCI-E packet.

All the architectures (including the ARM example you quoted above) seem
to do the byte swap in software. That seems to be important for accesses
like memcpy_{to,from}io, which would be painful to handle. But yes, in
theory an architecture could do it any way it wants.

> Today (at best, some drivers do not even do this) our userspace
> assumes all archs implement writel as:
>
>  *(u32 *)reg = cpu_to_le32(val);
>
> Which is a good start, but not portable to every arch the kernel
> supports.

It should do the right thing for every architecture that matters. That
being said, having an iomem abstraction certainly makes sense for various
reasons, and handling any oddball architecture (or rather PCI host bridge
implementation, I would not expect something this broken to be universal)
would come as a bonus.

> Yeah, it would be nice to get that working too. I guess we need to
> standardize on the cpu_to_xx macro style as a first step?

Any style will work as long as it separates the swap directions, but for
a low-level Linux project using the kernel style certainly makes sense.

> The __iomem annotation would be nice as well.

I can look into that as well. Usually the first step before adding
sparse annotations is fixing all the misc sparse warnings, as typical
userspace projects have less than stellar code quality.
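
To make the "separate the swap directions" point concrete, here is a
rough sketch of what a userspace iomem helper in kernel style could look
like. Everything in it is an assumption for illustration: le32_t,
mmio_write32() and mmio_read32() are hypothetical names, not an existing
library API. The swaps are done byte by byte so the result does not
depend on host endianness, and barriers and write-combining are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper type: a 32-bit value already in little-endian
 * (PCI-E wire) byte order. Wrapping it in a struct makes the compiler
 * reject accidental mixing with CPU-order integers, loosely mimicking
 * what sparse checks for the kernel's __le32. */
typedef struct { uint32_t raw; } le32_t;

/* CPU order -> little-endian, independent of host endianness. */
static inline le32_t cpu_to_le32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v & 0xff),         (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff), (uint8_t)((v >> 24) & 0xff)
    };
    le32_t out;

    memcpy(&out.raw, b, sizeof(b));
    return out;
}

/* Little-endian -> CPU order: the opposite direction, kept separate. */
static inline uint32_t le32_to_cpu(le32_t v)
{
    uint8_t b[4];

    memcpy(b, &v.raw, sizeof(b));
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical stand-in for writel(): stores val so that its least
 * significant byte lands at the lowest address. Only byte order is
 * handled here; no memory barriers are issued. */
static inline void mmio_write32(volatile void *reg, uint32_t val)
{
    assert(((uintptr_t)reg & 3) == 0); /* registers are 4-byte aligned */
    *(volatile uint32_t *)reg = cpu_to_le32(val).raw;
}

/* Hypothetical stand-in for readl(). */
static inline uint32_t mmio_read32(const volatile void *reg)
{
    le32_t v = { *(const volatile uint32_t *)reg };

    return le32_to_cpu(v);
}
```

With this, mmio_write32(reg, 1) always puts 0x01 in the first byte of
the register, and passing a le32_t where a plain u32 is expected (or the
other way around) fails to compile instead of silently swapping twice.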