I've been thinking of the case where a descriptor ring has to be in non-coherent memory (e.g. because that is all there is).

The receive ring processing isn't actually that difficult. The driver has to fill a cache line full of new buffer descriptors in memory, but without assigning the first buffer to the hardware. Then it has to do a cache line write of just that line. Then it can assign ownership of the first buffer and finally do a second cache line write. (The first explicit write can be skipped if the cache writes are known to be atomic.) After that it must not dirty that cache line.

To check for new frames it must invalidate the cache line that contains the 'next to be filled' descriptor and then read that cache line. This will contain info about one or more receive frames, but the hardware is still doing updates.

However, both of these operations can be happening at the same time on different parts of the buffer. So you need to know a 'cache line size' for the mapping and be able to do writebacks and invalidates for parts of the buffer, not just all of it.

The transmit side is harder. It requires either waiting for all pending transmits to finish, or splitting a single transmit into enough fragments that its descriptors end on a cache line boundary. But again, if the interface is busy, you want the cpu to be able to update one cache line of transmit descriptors while the device is writing transmit completion status to the previous cache line.

I don't think that is materially different for non-coherent memory or bounce buffers. But partial flush/invalidate is needed.

David
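
For what it's worth, here is a rough user-space sketch of the receive-side sequence described above. It only models the idea: the "cache" is a second copy of the ring, and a writeback or invalidate is a copy of one line between the two. Everything in it is an assumption made for illustration, not a real driver or kernel API: the 64-byte line size, the 16-byte descriptor layout, the ownership bit, and the helper names (ring_writeback_line, ring_invalidate_line, rx_refill_line, rx_poll, dev_receive).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor layout: 16 bytes, so 4 fit in a 64-byte line. */
struct rx_desc {
    uint32_t status;    /* bit 31 = owned by hardware, low bits = frame length */
    uint32_t reserved;
    uint64_t buf_addr;
};
_Static_assert(sizeof(struct rx_desc) == 16, "sketch assumes 16-byte descriptors");

#define LINE_SIZE      64u   /* assumed cache line size for the mapping */
#define DESC_PER_LINE  (LINE_SIZE / sizeof(struct rx_desc))
#define RING_LINES     4u
#define RING_SIZE      (RING_LINES * DESC_PER_LINE)
#define DESC_HW_OWNED  0x80000000u

static struct rx_desc cpu_view[RING_SIZE]; /* what the cpu sees through its cache */
static struct rx_desc dev_mem[RING_SIZE];  /* what is really in memory (device view) */

/* Partial writeback: push exactly one cache line of the ring to memory. */
static void ring_writeback_line(size_t line)
{
    assert(line < RING_LINES);
    memcpy(&dev_mem[line * DESC_PER_LINE], &cpu_view[line * DESC_PER_LINE], LINE_SIZE);
}

/* Partial invalidate: drop one cached line so the next read comes from memory. */
static void ring_invalidate_line(size_t line)
{
    assert(line < RING_LINES);
    memcpy(&cpu_view[line * DESC_PER_LINE], &dev_mem[line * DESC_PER_LINE], LINE_SIZE);
}

/* Refill one line: every descriptor except the first is handed to the
 * hardware, the line is written, then the first is handed over and the
 * line is written again. The cpu must not dirty the line afterwards. */
static void rx_refill_line(size_t line, const uint64_t *bufs)
{
    struct rx_desc *d = &cpu_view[line * DESC_PER_LINE];

    for (size_t i = 0; i < DESC_PER_LINE; i++) {
        d[i].buf_addr = bufs[i];
        d[i].reserved = 0;
        d[i].status = i ? DESC_HW_OWNED : 0;
    }
    ring_writeback_line(line);   /* hardware still stops at the first descriptor */
    d[0].status = DESC_HW_OWNED;
    ring_writeback_line(line);   /* now the whole line belongs to the hardware */
}

/* Check for new frames: invalidate the line holding descriptor 'next',
 * then count completed descriptors from 'next' to the end of that line. */
static size_t rx_poll(size_t next)
{
    size_t line = next / DESC_PER_LINE;
    size_t done = 0;

    ring_invalidate_line(line);
    for (size_t i = next; i < (line + 1) * DESC_PER_LINE; i++) {
        if (cpu_view[i].status & DESC_HW_OWNED)
            break;
        done++;
    }
    return done;
}

/* Stand-in for the device: complete descriptor 'idx' directly in memory.
 * Returns false if the device does not own that descriptor. */
static bool dev_receive(size_t idx, uint32_t len)
{
    if (!(dev_mem[idx].status & DESC_HW_OWNED))
        return false;
    dev_mem[idx].status = len;   /* clears the ownership bit */
    return true;
}
```

The point of the sketch is only that both sync helpers take a line index rather than acting on the whole ring, so a refill of one line and a poll of another do not interfere with each other.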