On Thu, Apr 07, 2005 at 08:14:06AM -0400, Greg Weeks wrote:

> What's the performance hit for doing a pref on a cache line that is
> already pref'd?

A wasted instruction.  (More complicated on certain multi-issue in-order
processors such as the SB1 CPU core.  Mentioning this for completeness;
we shouldn't worry about it here.)

> Does it turn into a nop, or do we get some horrible
> degenerate case? Are 64 bit processors always at least 32 byte cache
> line size?

The smallest D-cache line I know of is 16 bytes.

> I don't really expect anyone to know the answers right now. I
> expect I'll need to time code to tell. This makes generating them at run
> time look better and better.

Indeed.  When we first started doing such things, some people felt it
might be really bad to debug, but in practice it's been a relatively
minor problem, so I guess the resistance to yet another group of
run-time generated functions is diminishing.

One interesting issue to solve: memcpy, memmove and copy_user are
combined into a single big function, so the fixups for userspace
accesses need to be handled at runtime as well.

  Ralf