On Thu, Mar 30, 2017 at 11:48 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> This is not going into the tree - it's just a "let's check your
> theory about might_fault() overhead being the source of slowdown
> you are seeing" quick-and-dirty patch.

Note that for cached hdparm reads, I suspect a *much* bigger effect
than the fairly cheap might_fault() tests is just the random layout
of the data in the page cache. Memory is just more expensive than
CPU is.

The precise physical addresses that get allocated for the page cache
entries end up mattering, and they are obviously fairly "sticky"
within one reboot (unless you have a huge working set that flushes
them, or you use something like

    echo 3 > /proc/sys/vm/drop_caches

to flush the filesystem caches manually).

The reason things like page allocation matter for performance testing
is simply that the CPU caches are physically indexed (the L1 might not
be, but the outer levels definitely are), so page allocation ends up
affecting caching unless you have very high associativity.

And even if your workload doesn't fit in your CPU caches (I'd hope
that the "cached" hdparm test is still covering a fairly big area),
you'll still see memory performance depend on physical addresses.

Doing kernel performance testing without rebooting several times is
generally very hard.

                Linus
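
For reference, here is a minimal user-space sketch of the kind of
cached-read measurement being discussed (roughly what "hdparm -T"
approximates, not its actual implementation). It re-reads the same file
region repeatedly so that, after the first pass, everything is served
from the page cache; run-to-run variation after a drop_caches or a
reboot then largely reflects which physical pages the cache entries
happened to land in. The file path, chunk size, and total size are
arbitrary placeholders.

/*
 * Sketch of a cached-read micro-benchmark.  Run it a few times, then
 * do "echo 3 > /proc/sys/vm/drop_caches" (or reboot) as root and run
 * it again; compare the reported throughput across cycles.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define CHUNK   (1 << 20)          /* 1 MiB per read() call */
#define TOTAL   (256UL << 20)      /* 256 MiB read per pass */
#define PASSES  5

static double now_sec(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(int argc, char **argv)
{
        const char *path = argc > 1 ? argv[1] : "/tmp/testfile";
        char *buf = malloc(CHUNK);
        int fd = open(path, O_RDONLY);
        int pass;

        if (fd < 0 || !buf) {
                perror("setup");
                return 1;
        }

        for (pass = 0; pass < PASSES; pass++) {
                unsigned long done = 0;
                double t0 = now_sec();

                lseek(fd, 0, SEEK_SET);
                while (done < TOTAL) {
                        ssize_t n = read(fd, buf, CHUNK);

                        if (n < 0) {
                                perror("read");
                                return 1;
                        }
                        if (n == 0) {           /* EOF: wrap around */
                                lseek(fd, 0, SEEK_SET);
                                continue;
                        }
                        done += n;
                }
                printf("pass %d: %.1f MB/s\n", pass,
                       (TOTAL >> 20) / (now_sec() - t0));
        }
        free(buf);
        close(fd);
        return 0;
}

The first pass pulls the file into the page cache (so it is partly a
disk measurement); later passes are page-cache-bound, which is where
the physical-placement effects described above show up.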