Unfortunately, operating system virtual memory and filesystem caching code often does exactly the opposite of what a database application would like. For some reason the kernel guys don't see it that way ;)

Over the years various kernel features have been added with the overall goal of solving problems in this area: O_DIRECT, ckrm, flags to mmap() and so on. So far I'm not sure any of them has really succeeded. Hence the real-world wisdom to 'let the filesystem cache do its thing' and configure a small shared memory cache in the application.

Ideally one would want to use O_DIRECT or equivalent to bypass the OS's cleverness and manage the filesystem caching oneself. However, it turns out that enabling O_DIRECT makes things much worse, not better (YMMV obviously). From user mode it's hard to achieve the level of disk I/O concurrency that the kernel can get.

Another approach is to make the application cache size dynamic, with the goal that it can grow and shrink to reach the size that provides the best overall performance. I've seen attempts to drive the sizing using memory access latency measurements taken from user mode inside the application. However, I'm not sure that anyone has taken this approach beyond the science project stage. So AFAIK this is still a generally unsolved problem.

NT (Windows) is particularly interesting because it drives the filesystem cache sizing with a signal that it measures from the VM pages evicted per second counter. In order to keep its feedback loop stable, the OS wants to see a non-zero value for this signal at all times. So you will see that even under ideal conditions the system will still page a little. (Unless that code has changed in Win2003; it's been a while since I checked.) So don't drive yourself crazy trying to get it to stop paging ;)
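
For what it's worth, here is a rough sketch of what the dynamic cache sizing idea might look like. It is not taken from any real product: the names ResizableLRU and HillClimbSizer are made up for illustration, and I'm assuming the application can hand the controller an average access latency figure every so often. The controller is a plain hill climb: keep moving the cache size in the same direction while latency improves, and reverse when it gets worse.

```python
from collections import OrderedDict


class ResizableLRU:
    """LRU cache whose capacity can be changed at run time (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, load):
        """Return the cached value, calling load(key) on a miss."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        value = load(key)
        self._items[key] = value
        self._evict()
        return value

    def resize(self, capacity):
        """Change the capacity, evicting entries if the cache shrank."""
        self.capacity = max(1, capacity)
        self._evict()

    def _evict(self):
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop least recently used

    def __len__(self):
        return len(self._items)


class HillClimbSizer:
    """Hypothetical controller driven by latency samples from the caller."""

    def __init__(self, cache, step, min_size, max_size):
        self.cache = cache
        self.step = step
        self.min_size = min_size
        self.max_size = max_size
        self.direction = 1        # +1 means grow, -1 means shrink
        self.last_latency = None  # no sample seen yet

    def update(self, avg_latency):
        """Feed one latency sample (any consistent unit); return new capacity."""
        if self.last_latency is not None and avg_latency > self.last_latency:
            self.direction = -self.direction  # last move hurt, so back off
        self.last_latency = avg_latency
        target = self.cache.capacity + self.direction * self.step
        target = max(self.min_size, min(self.max_size, target))
        self.cache.resize(target)
        return self.cache.capacity
```

You would call update() on a timer with whatever latency figure you trust. A real system would need smoothing and some protection against noisy samples, which is presumably where the science projects get stuck.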