Thanks!
KW

On Fri, Feb 15, 2013 at 6:42 PM, <michi1@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hi!
>
> On 12:16 Fri 15 Feb , Kevin Wilson wrote:
> ...
>> AFAIK, what prefetch does is get a variable from memory and put it in
>> cache (L2 cache I believe).
>
> Yes, this is true. See:
> http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Other-Builtins.html
> I am not so sure about the cache level it is fetched to.
>
>> Is the prefetch operation synchronous? I mean, after calling it, are
>> we guaranteed that the variable is indeed in the cache?
>
> No, the variable is definitely not guaranteed to be in the cache. That would
> not make any sense. The purpose of the prefetch is to fetch data in the
> background while executing something else.
>
> Actually it is not guaranteed to fetch anything at all. The target cpu might
> not support the feature at all. Even if it does, there are cases where the
> data will not be prefetched, e.g. when the access would trigger a page fault.
> Also the cpu itself might decide not to do the prefetch, e.g. when the cache
> line is present (and locked by cache coherency) in the cache of a different
> cpu/core.
>
>> So this is probably for improving performance, assuming that you will
>> need this variable in the near future.
>> The comment there says:
>> /* prefetch skb_end_pointer() to speedup skb_shinfo(skb) */
>>
>> According to this logic, anywhere that we want to call skb_shinfo(skb)
>> we had better do a prefetch first.
>>
>> In fact, if we prefetch any variable that we want to use then we end up
>> with a performance boost.
>>
>> So - any hints, what are the guidelines for using prefetch()?
>
> You really should *not* prefetch() all variables you want to use. Prefetch
> itself generates code which needs cpu cycles. It can quickly make your program
> slower.
> Use it only in places where
> - the data is very unlikely to be in the cache of either the current or any
>   other cpu in the system *and*
> - you can issue the prefetch at least 100ns before the actual use
>
> Also, if you access a reasonably large memory array sequentially (either
> forward or backward), you should not use prefetch() at all. The cpus have
> hardware prefetchers which are faster in this case.
>
>
> A general piece of advice for performance optimisation: run benchmarks
>
> -Michi
> --
> programing a layer 3+4 network protocol for mesh networks
> see http://michaelblizek.twilightparadox.com
>
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies@xxxxxxxxxxxxxxxxx
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
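
To make the guidelines above a bit more concrete, here is a minimal user-space sketch, not kernel code. It uses GCC's __builtin_prefetch() (the builtin documented at the URL quoted above) on a pointer-chasing walk, which is the kind of access pattern the hardware prefetchers cannot predict. The names struct node, sum_list() and build_strided_list() are made up for this illustration, and the list is far too small to show a real effect. Whether the hint helps at all on a given machine is something only a benchmark can tell.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type, for illustration only. */
struct node {
	struct node *next;
	long payload;
};

/*
 * Walk the list and sum the payloads. While working on the current
 * node, hint that the next one will be read soon. The prefetch is
 * only a hint: it may be ignored, and it does not fault when the
 * address is NULL or otherwise invalid.
 */
static long sum_list(const struct node *head)
{
	long sum = 0;

	for (const struct node *n = head; n != NULL; n = n->next) {
		/* args: address, 0 = read access, 1 = low temporal locality */
		__builtin_prefetch(n->next, 0, 1);
		sum += n->payload;
	}
	return sum;
}

/*
 * Link n nodes of one array in strided order, so that neighbours in
 * the list are not neighbours in memory. stride must be coprime to n,
 * otherwise not every node is visited. The head is element 0, so the
 * caller can release everything with free(head).
 */
static struct node *build_strided_list(size_t n, size_t stride)
{
	struct node *nodes;
	size_t idx = 0;

	if (n == 0)
		return NULL;
	nodes = calloc(n, sizeof(*nodes));
	if (!nodes)
		return NULL;

	for (size_t k = 0; k < n; k++) {
		size_t next = (idx + stride) % n;

		nodes[idx].payload = (long)idx;
		nodes[idx].next = (k + 1 < n) ? &nodes[next] : NULL;
		idx = next;
	}
	return nodes;
}
```

A caller would build a list with e.g. build_strided_list(1000, 389), pass the result to sum_list(), and free() it afterwards. In this sketch the work between the prefetch and the next dereference is a single addition, far less than the 100ns mentioned above, so a real use needs much more work per node before the hint can pay off.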