Maciej W. Rozycki (macro@xxxxxxxxxxxxxx) writes:

> Besides new CPUs more often than not
> require changes to kernel-level software anyway.

Making sure that isn't so is the reason why there's a MIPS32/64 spec
(with all the privileged operations defined). Which also avoids the
undesirable development step of new hardware combined with new kernel
software...

> > How did you measure the high throughput? Have you got a
> > machine with DMA-coherency you can turn on and off?
>
> I just disabled invalidations. ;-)

Ouch. So the effect could have come from a variety of sources.

> That was an R4400 with 1MB of S-cache.

With an R4400 S-cache, any difference between "would write it back but
it's clean" and "just invalidate" is likely to be small, since in
either case the time will be dominated by the (external) cache tag
memory RMW operation.

> Eventually I should benchmark both invalidation variations against each
> other with the system in question and see if it makes any difference.

Indeed. And it might also be a good idea to test a more modern system,
too, to see how big an effect this might be.

> Ironically this is where the write-back cache of the R4k gives loss
> rather than gain as compared to the write-through cache of the R3k
> (the system supports daughtercards with either CPU, so useful
> comparison is possible)...

Maybe. But remember, on the R3K every write was a write through, and
they all had a cost in bus congestion, which may have delayed a
following read and held up the CPU (or the write buffer may have filled
and stalled the CPU).

I think up to about 33MHz write-through remained a tolerable policy for
1988-era memory systems; any faster than that and you just sank under a
flood of writes. 2005-era memory systems are much faster when bursting,
but the time they take to process a single write cycle has improved by
less than 2x. So write-through is still a really bad idea for 100MHz
CPUs using off-chip memory.
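
To put a shape on that argument, here is a back-of-envelope sketch in Python. It is only an illustration: the function names are hypothetical, and every number in it (100ns per single write cycle, 200ns per 8-word line burst, 1024 stores) is a made-up placeholder rather than a measurement of any real system. The model also assumes sequential word stores, so that each cache line is dirtied once and written back once.

```python
# Back-of-envelope bus-time model for write-through vs. write-back.
# All timings are hypothetical placeholders, not measurements.

def write_through_bus_ns(n_stores, single_write_ns):
    """Write-through: every store reaches memory as its own single write cycle."""
    return n_stores * single_write_ns


def write_back_bus_ns(n_stores, words_per_line, burst_line_ns):
    """Write-back: sequential stores fill cache lines, and each dirty
    line is later written to memory once, as a single burst."""
    dirty_lines = -(-n_stores // words_per_line)  # ceiling division
    return dirty_lines * burst_line_ns


if __name__ == "__main__":
    stores = 1024          # assumed: sequential word stores into a buffer
    single_write_ns = 100  # assumed cost of one isolated write cycle
    words_per_line = 8     # assumed cache line size, in words
    burst_line_ns = 200    # assumed cost of bursting one full line

    wt = write_through_bus_ns(stores, single_write_ns)
    wb = write_back_bus_ns(stores, words_per_line, burst_line_ns)
    print("write-through bus time (ns):", wt)
    print("write-back bus time (ns):   ", wb)
```

With these placeholder values the burst path occupies the bus for a quarter of the time, but the ratio depends entirely on the numbers you plug in, so substitute figures from your own memory system before drawing conclusions.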

Even when your device requires you to push out all the data it can be
more efficient to write data to the cache and then force writeback to
memory: at least that way the data goes to the memory in efficient
burst cycles.

--
Dominic