> We'd also have to make sure that the comparison is between the linux-omap
> kernel and the OMAPZoom kernel, rather than o-z PIO vs. o-z DMA. The
> OMAPZoom kernel doesn't post any device register writes. That should
> cause any driver using PIO to drag, compared to the l-o kernel.

There are effectively a few levels of posting. The patch stops _ARM_ PIO
to peripheral control registers from hanging out in the interconnect.
The other levels are still there. The big winner from the interconnect
on the ARM side is DDR operations, which still get posted at all levels.

The buffering in the interconnect is only about a couple of cache lines,
so it fills up pretty fast. DMA and other initiators are not impacted,
as they don't use ARM MMU attributes.

Really, if you think about it, "big" block writes may not be impacted
either, as you will back up at the slower device's speed. Cache is a
familiar example. If you write out 100M quickly, you very soon get to a
point where you are bottlenecked on main memory speed (every write is a
miss or a cast-out at some point). The fact that you have a cache in the
way doesn't matter.

It can make some difference if you're intermixing some small PIOs with
other work. In general benchmarks I've yet to see the hit on the system
stand out against all the bigger noise. I can probably construct a case
where ~10% is lost.

I recall some tests which have DMA + prefetch working on NAND. I'll see
if I can dig them up.

I was at several meetings a year back where different memory vendors
came in and showed that with some tweaks they could get 2x l-o on flash.
Then if you also take their device-optimized file system you get like
5x. Hopefully at least the in-tree 2x is gone now that it's a year
later.

Regards,
Richard W.
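
P.S. A toy model of the posting argument, in case it helps. This is only a sketch with made-up cycle counts, not a measurement of any OMAP part: the function name, the one-write-at-a-time device, and every number below are assumptions for illustration. It compares non-posted writes (the CPU waits for each write to land) against posted writes through a small buffer, for a back-to-back block write and for small writes mixed with other work.

```python
from collections import deque


def total_cycles(n_writes, work_cycles, device_cycles, buffer_entries):
    """Cycle at which the last of n_writes has reached the device.

    Hypothetical model: the CPU does work_cycles of other work, then
    issues one write; the device takes device_cycles per write and
    handles one write at a time.
    buffer_entries == 0 models non-posted writes (CPU waits for each).
    buffer_entries  > 0 models posted writes (CPU stalls only when the
    buffer is full).
    """
    pending = deque()   # completion times of writes still buffered
    now = 0             # CPU time
    device_free = 0     # time at which the device is next idle
    for _ in range(n_writes):
        now += work_cycles
        # Drop buffered writes the device has already finished.
        while pending and pending[0] <= now:
            pending.popleft()
        if buffer_entries == 0:
            # Non-posted: the CPU blocks until this write completes.
            device_free = max(now, device_free) + device_cycles
            now = device_free
        else:
            if len(pending) == buffer_entries:
                # Buffer full: stall until the oldest entry drains.
                now = pending.popleft()
            device_free = max(now, device_free) + device_cycles
            pending.append(device_free)
    return max(now, device_free)


if __name__ == "__main__":
    # Back-to-back block write: the device speed dominates either way.
    print("block, non-posted:", total_cycles(10000, 1, 10, 0))
    print("block, posted x8 :", total_cycles(10000, 1, 10, 8))
    print("block, posted x64:", total_cycles(10000, 1, 10, 64))
    # Small writes mixed with other work: posting hides the device time.
    print("mixed, non-posted:", total_cycles(1000, 50, 10, 0))
    print("mixed, posted x8 :", total_cycles(1000, 50, 10, 8))
```

In this model the block write ends up device-bound with or without posting, and a deeper buffer changes nothing once it has filled. Posting helps most in the mixed case, where each write can drain while the CPU is busy with something else.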