Nick Piggin wrote:
> Right, but probably the large majority of wmb() callers actually just want io_wmb(). This would relieve much of the performance problem, I'd say. Of those that really want a wmb() and cannot be converted to io_wmb(), I don't think it is a good option to actually just weaken wmb() because we deem that doing what the caller asked for is too expensive.
Hi Nick, I believe there's a fair number of places where wmb() is used for memory ordering not related to IO.
> I guess with the ia64_mf(), Altix probably does not reorder PCI stores past earlier cacheable stores, so _some_ wmb()s actually do not require the full mmiowb case (if we only need to order an earlier RAM store with a later PCI store). However, again, weakening wmb() is not a good option because it really requires an audit of the entire tree to do that.
Nope, unfortunately not. ia64_mf() isn't strong enough to prevent the reordering; it's done in the PCI controller, so in order to guarantee the ordering you have to go all the way out to the PCI controller, which is very slow.
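For anyone following along, the case in question (an earlier RAM store that has to be ordered before a later PCI store) looks roughly like the sketch below. It is a userspace illustration only: post_desc(), ring and fake_doorbell are made-up names, and wmb() is defined here as a C11 release fence purely as a stand-in, not as the kernel's definition. It shows where the barrier sits in the code and does not model what the PCI controller does.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: userspace stand-in for illustration, NOT the kernel's wmb(). */
#define wmb() atomic_thread_fence(memory_order_release)

/* Hypothetical descriptor layout for a made-up device. */
struct desc {
	uint32_t addr;
	uint32_t len;
};

static struct desc ring[4];             /* stands in for a descriptor ring in RAM */
static volatile uint32_t fake_doorbell; /* stands in for a PCI MMIO register */

/* Hypothetical helper: fill in a descriptor, then tell the "device" about it. */
static void post_desc(unsigned int idx, uint32_t addr, uint32_t len)
{
	ring[idx].addr = addr;   /* earlier cacheable RAM stores */
	ring[idx].len = len;

	wmb();                   /* descriptor must be visible before the doorbell */

	fake_doorbell = idx + 1; /* later store; a real driver would use writel() */
}
```

A caller would do something like post_desc(2, 0x1000, 64). The whole debate is about how strong the barrier in the middle has to be for this pattern on a given platform.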
> We _could_ introduce partial barriers like store/iostore iostore/store, but really, I think the io_wmb is a pretty good first step, and I haven't actually seen any numbers indicating it would be a performance problem.
I must admit I am not 100% up to speed on the entire discussion, but I think io_wmb() and friends did go around in the past and got shot down.

Cheers,
Jes