On Wed, May 12, 2010 at 1:36 AM, David Howells <dhowells@xxxxxxxxxx> wrote:
>
> Out of interest, does it make the code smaller if you mark
> ioat2_get_ring_ent() and ioat2_ring_mask() with __attribute_const__?
>
> I'm not sure whether it'll affect how long gcc is willing to cache these, but
> once computed, I would guess they won't change within the calling function.

Unfortunately, it does not make a difference, but I'll keep this in
mind if ioat2_get_ring_ent() ever gets more complicated (which it
might in the future).

> Also, is the device you're driving watching the ring and its indices?  If so,
> does it modify the indices?  If that is the case, you might need to use
> read_barrier_depends() rather than smp_read_barrier_depends().

The device does not observe the indices directly.  Instead we
increment a free running 'count' register by the distance between
ioat->pending and ioat->head.

>
>> +             prefetch(ioat2_get_ring_ent(ioat, idx + i + 1));
>> +             desc = ioat2_get_ring_ent(ioat, idx + i);
>>               dump_desc_dbg(ioat, desc);
>>               tx = &desc->txd;
>>               if (tx->cookie) {
>
> Is this right, I wonder?  You're prefetching [i+1] before reading [i]?  Doesn't
> this mean that you might have to wait for [i+1] to be retrieved from RAM before
> [i] can be read?  Should you instead read tx->cookie before issuing the
> prefetch?  Admittedly, this is only likely to affect the reading of the head of
> the queue - subsequent reads in the same loop will, of course, have been
> prefetched.

Yes, it should be the other way around.  Thanks!

--
Dan
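
For reference, a minimal userspace sketch of the two points discussed above: the distance between the pending and head indices, and reading entry [i] before prefetching [i+1]. This is not the driver code. The names (struct ring, ring_pending_count(), complete_entries()), the 16-entry ring size and the 16-bit free-running indices are assumptions made for illustration, and __builtin_prefetch() (a GCC/Clang builtin) stands in for the kernel's prefetch().

```c
#include <assert.h>
#include <stdint.h>

#define RING_ORDER 4                    /* hypothetical: 16-entry ring */
#define RING_SIZE  (1u << RING_ORDER)

struct ring_ent {
	uint32_t cookie;                /* non-zero while a transaction is outstanding */
};

struct ring {
	struct ring_ent ent[RING_SIZE];
	uint16_t head;                  /* next slot to allocate (free running) */
	uint16_t pending;               /* first slot not yet submitted to the device */
};

/* Mask that folds a free-running index into the ring; needs a power-of-two size. */
static inline uint16_t ring_mask(void)
{
	assert((RING_SIZE & (RING_SIZE - 1)) == 0);
	return RING_SIZE - 1;
}

static inline struct ring_ent *get_ring_ent(struct ring *r, uint16_t idx)
{
	return &r->ent[idx & ring_mask()];
}

/*
 * Distance from pending to head.  The unsigned subtraction stays correct
 * when head has wrapped past 65535; this sketch assumes fewer than
 * RING_SIZE entries are outstanding.
 */
static inline uint16_t ring_pending_count(const struct ring *r)
{
	return (uint16_t)(r->head - r->pending) & ring_mask();
}

/*
 * Walk 'active' entries starting at idx.  Entry [i] is read before the
 * prefetch for [i+1] is issued, so the load the loop depends on is not
 * queued behind the speculative fetch.  The prefetch is only a hint and
 * the compiler may still reorder it.
 */
static unsigned int complete_entries(struct ring *r, uint16_t idx,
				     unsigned int active)
{
	unsigned int completed = 0;

	for (unsigned int i = 0; i < active; i++) {
		struct ring_ent *desc = get_ring_ent(r, (uint16_t)(idx + i));
		uint32_t cookie = desc->cookie;

		__builtin_prefetch(get_ring_ent(r, (uint16_t)(idx + i + 1)));

		if (cookie) {
			desc->cookie = 0;
			completed++;
		}
	}
	return completed;
}
```

With head = 1 and pending = 65534, ring_pending_count() gives 3, which is the amount that would be added to the device's 'count' register in this model.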