Hi,

Mulyadi Santosa wrote:

> From the C code you have written to illustrate likely() and the result
> of objdump -S, I got conclusion that the optimization is done by
> arranging the code (the assembler output of we-do-something-if-true)
> directly following the "cmp" and "jne" instruction. This way, the
> processor will execute the next instruction faster because it is
> already prefetched at L1 cache and "jump" is avoided since it is a bit
> costly. I am not sure on what you mean by "pipeline", so I guess you
> mean CPU pipeline. Please CMIIW and I am sorry if I bring confusion
> here.

Yes, I meant the CPU instruction pipeline. I'm not a CPU expert, but I
think the code layout can help in two ways:

 - by improving the locality of the code path, it lets the likely path
   stay in the same I-cache line, see
   http://en.wikipedia.org/wiki/CPU_cache ;

 - by keeping the likely path sequential (the common case falls through
   instead of taking the jump), it helps avoid a CPU pipeline flush, see
   http://en.wikipedia.org/wiki/Instruction_pipeline.

Sincerely,

Thomas
-- 
Thomas Petazzoni
thomas.petazzoni@xxxxxxxx
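
For reference, here is a minimal sketch of the kind of code being discussed. It is not the original example from this thread: sum_buffer() is a hypothetical helper made up for illustration, and the two macros are modelled on the kernel's definitions, assuming GCC or a compatible compiler that provides __builtin_expect().

```c
#include <stddef.h>

/* Branch hints modelled on the kernel's likely()/unlikely() macros.
 * Assumes a compiler that provides __builtin_expect() (GCC, Clang).
 * The !!(x) normalises any non-zero value to 1. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper, for illustration only: sums a buffer and
 * treats a NULL pointer as a rare error case. */
long sum_buffer(const int *buf, size_t len)
{
    long total = 0;
    size_t i;

    /* Hint: this test is almost never true, so the compiler is free
     * to move the error return out of the straight-line path. */
    if (unlikely(buf == NULL))
        return -1;

    /* Common case: meant to follow the test without a taken jump. */
    for (i = 0; i < len; i++)
        total += buf[i];

    return total;
}
```

The hints only influence code layout, never the result: sum_buffer() returns the same values with or without them. To see the effect, one can build with optimization enabled (for example gcc -O2 -g) and look at objdump -S, as described in the quoted message. The exact layout depends on the compiler version and the architecture, so the hinted path is not guaranteed to end up as the fall-through.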