>> I don't wonder if other IDT CPUs also require this, including those that
>> conform MIPS32.
>> Basically, requirement of uncached run makes hardware logic much simpler
>> and allows to save silicon a bit.
>
>That could be true, but then again I suggest making specific cache routines for those
>CPUs.
>It would be a real performance hit for the rest of us, if we have to operate from
>uncached space.

I pulled together the relevant code into a module to test this problem, and
it looks like the CPU always misses 1 instruction following the end of the
cache loop. If I add some nops to change the alignment of the code, it
doesn't seem to make any difference. The same thing seems to happen even if
I change the cache flush to a 'Hit_Invalidate' of some completely different
memory region.

One thing I thought might happen is the CPU ending the loop early, as soon
as it invalidates the cacheline containing the current instructions, but
this doesn't seem to be the case; the 'end' address is always correct.
Perhaps this really is a hardware problem.

The test module below does a blast_icache, then executes a few well-known
instructions and reports whether any of them were missed. I typically get
the following on our board:

    Cacheop skipped 1 instructions, end = 0x80004000

The end address is correct, so the cache flush completes, but 1 instruction
is missed. I would be interested to know if someone can test this on another
MIPS32 processor, since I don't have any others available.

Simply adding an extra nop after the cache loop might be a good workaround
for this board.

Module compiled with:

/tmp/crossdev/mips/bin/mips-linux-gcc -G 0 -mips2 -mno-abicalls -fno-pic \
	-mlong-calls -fno-common -O2 -fno-strict-aliasing \
	-I/usr/src/linux/include -Wall -DMODULE -D__KERNEL__ -fno-common \
	-c -o test_tmp.o test.c
/tmp/crossdev/mips/bin/mips-linux-ld -r -G0 -o test.o test_tmp.o

#include <linux/module.h>
#include <linux/init.h>
#include <linux/sysctl.h>
#include <asm/cacheops.h>
#include <asm/bootinfo.h>
#include <asm/cpu.h>
#include <asm/bcache.h>
#include <asm/page.h>
#include <asm/system.h>
#include <asm/addrspace.h>

#define icache_size	(16 * 1024)
#define ic_lsize	(16)

/* Issue a single cache operation 'op' on the line addressed by 'base'. */
#define cache_unroll(base,op)			\
	__asm__ __volatile__("			\
		.set noreorder;			\
		.set mips3;			\
		cache %1, (%0);			\
		.set mips0;			\
		.set reorder"			\
		:				\
		: "r" (base),			\
		  "i" (op));

/* Index-invalidate every icache line; returns the final address reached. */
static inline unsigned test_blast_icache(void)
{
	unsigned long start = KSEG0;
	unsigned long end = (start + icache_size);

	while(start < end) {
		cache_unroll(start,Index_Invalidate_I);
		start += ic_lsize;
	}
	return start;
}

static int __init init(void)
{
	int i = 4;
	unsigned int end;

	end = test_blast_icache();

	/*
	 * Four decrements of a counter that starts at 4: whatever is left
	 * in i afterwards is the number of instructions that were skipped.
	 * The "0" constraint ties the input to the same register as the
	 * output, so %0 is guaranteed to start out holding i.
	 */
	__asm__(
		".set push      \n"
		".set noreorder \n"
		"  addu %0,-1   \n"
		"  addu %0,-1   \n"
		"  addu %0,-1   \n"
		"  addu %0,-1   \n"
		".set pop       \n"
		: "=r" (i)
		: "0" (i));

	printk("Cacheop skipped %u instructions, end = 0x%x\n", i, end);

	return 0;
}

static void __exit fini(void)
{
}

module_init(init);
module_exit(fini);
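
For anyone who wants to sanity-check the numbers in the report before loading the module, here is a rough user-space sketch of the same arithmetic. It issues no cache operations and cannot reproduce the hardware behaviour; it only models the address range the loop walks and the countdown used to detect skipped instructions. The names walk_icache_range() and remaining_after() are made up for illustration, and KSEG0_BASE is assumed to be the usual 32-bit KSEG0 base of 0x80000000.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes taken from the test module; KSEG0_BASE is an assumption (standard
 * 32-bit MIPS KSEG0 base). */
#define KSEG0_BASE	0x80000000u
#define ICACHE_SIZE	(16u * 1024u)
#define IC_LSIZE	16u

/*
 * Walk the same range as test_blast_icache(), one cache line at a time,
 * without touching the cache. Returns the final address and stores the
 * number of lines visited in *lines_visited (if non-NULL).
 */
static uint32_t walk_icache_range(uint32_t base, uint32_t size,
				  uint32_t line_size, uint32_t *lines_visited)
{
	uint32_t start = base;
	uint32_t end = base + size;
	uint32_t count = 0;

	assert(line_size != 0);
	while (start < end) {
		/* the module issues "cache Index_Invalidate_I, (start)" here */
		start += line_size;
		count++;
	}
	if (lines_visited)
		*lines_visited = count;
	return start;
}

/*
 * Model of the countdown: the counter starts at 'total' and every executed
 * decrement subtracts 1. If the first 'skipped' decrements never execute,
 * the value left over equals 'skipped'.
 */
static int remaining_after(int total, int skipped)
{
	int i = total;

	for (int n = skipped; n < total; n++)
		i -= 1;
	return i;
}
```

With the module's values, walk_icache_range(KSEG0_BASE, ICACHE_SIZE, IC_LSIZE, &lines) visits 1024 lines and ends at 0x80004000, which matches the 'end' value printed on the board, and remaining_after(4, 1) leaves 1, which is the "skipped 1 instructions" case.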