> > There're two possible optimization:
> > 1. (Requires only the instruction that swaps caches must run uncached)
> > CPU may skip implementation of double check of cache hit on loads.
> > Scenario: mtc0 with cache swapping with ensuring next instructions are
> > in cache (pipelining here!); swap occurs; must check again the
> > instructions are in the cache because the same cacheline in the data
> > cache may have valid bit set and CPU will get data instead of code.
>
> I can't really see a problem here for proper implementations. The CPU
> may have fetched a few instructions beyond the mtc0 doing a cache swap.

Load from memory into the I-cache, setting the valid bit.

> It's OK since we didn't modify the code. As long as the swap doesn't
> complete, the CPU is using the real I-cache. Once it's completed, it uses
> the D-cache. Since the new cache is used in the normal mode of operation,
> now tag matches and line replacements occur here as if it was the real
> I-cache. No need to do any extra checks at any stage.

The CPU has to check the cache line at the given address again. The
D-cache may have the valid bit set for the cache line at the same address.
"Address" here means a location in the cache, not in memory. A check at an
address takes one extra tick, as opposed to just checking the bit.

Please note that a CPU isn't a monolithic program but rather a set of
functional blocks, so a "proper implementation" may require additional
signals on wires, and delays.

> > It's possible they broke something, simply.

My guess is they implemented No. 1, more or less.
Anybody from IDT here willing to break an NDA? :-)

Regards,
Gleb.