Hi!

On 12:16 Fri 15 Feb, Kevin Wilson wrote:
...
> AFAIK, what prefetch does is get a variable from memory and put it in
> cache (L2 cache I believe).

Yes, this is true. See:
http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Other-Builtins.html
I am not so sure about the cache level it is fetched to.

> Is the prefetch operation synchronous? I mean, after calling it, are
> we guaranteed that the variable is
> indeed in the cache?

No, the variable is definitely not guaranteed to be in the cache. That
would not make any sense: the purpose of the prefetch is to fetch data in
the background while something else is executing.

Actually, it is not guaranteed to fetch anything at all. The target CPU
might not support the feature. Even if it does, there are cases where the
data will not be prefetched, e.g. when the access would trigger a page
fault. The CPU itself might also decide not to do the prefetch, e.g. when
the cache line is present (and locked by cache coherency) in the cache of
a different CPU/core.

> So this is probably for improving performance, assuming that you will
> need this variable in the near
> future.
> The comment there says:
> /* prefetch skb_end_pointer() to speedup skb_shinfo(skb) */
>
> According to this logic, anywhere that we want to call skb_shinfo(skb)
> we better do a prefetch before.
>
> In fact, if we prefetch any variable that we want to use then we end up
> with performance boost.
>
> So - any hints, what are the guidelines for using prefetch()?

You really should *not* prefetch() every variable you want to use. The
prefetch itself generates code which needs CPU cycles, so it can quickly
make your program slower. Use it only in places where
- the data is very unlikely to be in the cache of either the current or
  any other CPU in the system *and*
- you can issue the prefetch instruction at least 100ns before the actual
  use

Also, if you access a reasonably large memory array sequentially (either
forward or backward), you should not use prefetch() at all.
The CPUs have hardware prefetchers, which are faster in this case.

A general piece of advice for performance optimisation: run benchmarks.

	-Michi
-- 
programming a layer 3+4 network protocol for mesh networks
see http://michaelblizek.twilightparadox.com

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
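
To illustrate the guidelines above, here is a minimal userspace sketch
(not kernel code) of a case where a manual prefetch can pay off. The loop
reads table[idx[i]] with scattered indices, so the hardware prefetcher
cannot predict the next address, and the hint is issued a few iterations
before the data is needed. The names sum_indirect, sum_indirect_plain and
PREFETCH_DISTANCE are made up for this example, and the distance of 8 is
only a guess that has to be tuned by benchmarking. __builtin_prefetch() is
the GCC builtin from the link above; as far as I know, the kernel's
prefetch() maps to it unless the architecture provides its own version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many iterations ahead to prefetch.
 * The value 8 is an assumption; find the right one by benchmarking. */
#define PREFETCH_DISTANCE 8

/* Sum table[idx[i]] for i in [0, n). The indices are assumed to be
 * scattered, so the accesses to table[] are hard to predict. */
long sum_indirect(const long *table, const size_t *idx, size_t n)
{
    long sum = 0;
    size_t i;

    assert(n == 0 || (table != NULL && idx != NULL));

    for (i = 0; i < n; i++) {
        /* Only a hint: rw = 0 (read), locality = 3 (keep in cache).
         * The CPU is free to ignore it. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&table[idx[i + PREFETCH_DISTANCE]], 0, 3);

        sum += table[idx[i]];
    }
    return sum;
}

/* Same computation without the hint: the baseline to benchmark against. */
long sum_indirect_plain(const long *table, const size_t *idx, size_t n)
{
    long sum = 0;
    size_t i;

    assert(n == 0 || (table != NULL && idx != NULL));

    for (i = 0; i < n; i++)
        sum += table[idx[i]];
    return sum;
}
```

Both functions return the same value; only the timing may differ. Measure
both on the real workload, and if sum_indirect() is not measurably faster,
drop the prefetch.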