On Tue, Sep 23, 2003 at 12:35:44PM +0100, Dominic Sweetman wrote:

> As usual, I guess the first thing is to try doing it the standard way
> and then try to measure how much time is being spent in extra TLB misses
> generated by your application. Some MIPS CPUs have "performance
> counters" which might be able to count TLB misses, but you'll more
> likely have to instrument the TLB miss code.
>
> If it does turn out that TLB replacement is a big drain:
>
> Most MIPS CPU hardware allows you to map large chunks of memory with a
> single TLB entry: often up to 16Mbytes at a time. But I don't know
> how you'd persuade Linux how to do that.

As an indication of how effective large pagesize support can be for
applications, take a look at the two USENIX 98 papers: "General Purpose
Operating System Support for Multiple Page Sizes" by SGI, about IRIX,
and "Implementation of Multiple Page Size support in HP-UX", presented
at the same conference. Given that we have what QED once called the
slowest TLB reload handler they've ever seen, the impact could be even
stronger than demonstrated in these two papers.

The implementation described has been condemned by Linus as stupid and
unacceptable. I expect a conceptually different optimization on MIPS
late this year. In any case the papers show how costly TLB exception
handlers can be; that is the reason why I yell at just about everybody
who mentions the phrase "wired tlb entries".

For the time being Linux has large page support for the kernel - read
KSEG0 / KSEGX. Another optimization is the use of the global bit for
all kernel mappings, and for 2.6, support for hugetlbfs on MIPS should
also be fairly easy.

Btw, again and again the MIPS r4k-style TLBs are a bit of a pain because
each entry maps a pair of pages which share some of their attributes ...

  Ralf
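
To make the page pairing concrete, here is a minimal sketch in C of how
an r4k-style entry splits a virtual address. The helper names
(pagemask_for, pair_base, is_odd_page) are made up for illustration and
are not kernel functions. The sketch assumes the usual encoding with a
4KB base page and page sizes growing in powers of four up to 16MB.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 * page_size is assumed to be 4KB, 16KB, 64KB, ... 16MB.
 */

/* PageMask register value for a page size: 4KB -> 0, 16KB -> 0x6000, ... */
static inline uint32_t pagemask_for(uint32_t page_size)
{
    return (page_size - 4096u) << 1;
}

/* Virtual base address of the even/odd pair that one entry covers. */
static inline uint32_t pair_base(uint32_t vaddr, uint32_t page_size)
{
    return vaddr & ~(2u * page_size - 1u);
}

/* 0 selects the even page (EntryLo0), 1 the odd page (EntryLo1). */
static inline int is_odd_page(uint32_t vaddr, uint32_t page_size)
{
    return (vaddr & page_size) != 0;
}
```

With 16MB pages one entry covers 32MB of virtual space, and bit 24 of
the address picks EntryLo0 or EntryLo1. VPN2, the ASID and the page mask
are shared by both halves; only the physical frame and the cache, dirty
and valid bits are per page.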