> From: Russell King - ARM Linux [mailto:linux@xxxxxxxxxxxxxxxx]
> Sent: Tuesday, October 13, 2015 7:24 AM

<snip>

> If you implement drivers using nothing but writel() and readl(), then your
> performance _will_ suck, but that's entirely the driver's fault.

Your analysis above seems correct.

Perhaps it was wrongheaded, but part of the rationale I used in the past
was that many of the ARM SW drivers evolved on low-performance bus
architectures, which didn't punish drivers for forgetting to use a barrier
or a read back. Many driver porters assumed what was there was good and
built atop that.

The result was a lot of hidden issues surfacing during production ramps:
a mix of errata, missing read backs, and wrong macro choices (alongside
many valid macro usages). A couple of SoCs I sampled in the past just used
StronglyOrdered and didn't buffer. This created a lot of 'it works for me,
not sure what your problem is' inputs.

In that environment the question was realized performance vs. correctness.
Promoting some of the valid macro usages to heavier variants tended not to
be an issue, as those usages are sparse. The cases where they were not
sparse were in places where a DMA engine should have been in use anyway.
The end result of the promotion was the ability to work around many of the
bad cases with one knob.

I recall consulting Linux PowerPC folks (production users of a weak memory
model) in that time frame, and they indicated they over-synchronized also.
I don't know what they do today.

Today, maybe the code has been refactored/evolved enough that the older
issues have been boiled away, but this seems a bit optimistic given
history.

Regards,
Richard W.