Glynn Clements wrote:

> In particular, branch instructions are dealt with by dedicated logic
> circuitry which does nothing but process branch instructions. This
> enables speculative execution to handle branches even when the
> calculation of the branch condition hasn't completed.

Yep, I understand all that. It doesn't even have to be a particularly
modern CPU: even IBM's S/390 architecture did this in the early '90s.
Speculative execution and branch prediction have both been around for
a while, just not in the Intel world. Regardless, I still wouldn't say
"branches don't normally take any CPU cycles", but maybe that's
splitting hairs.

> The actual cost of a code cache miss varies depending upon the
> relative speed of the CPU and RAM, but 400 cycles is typical. You
> would need to have a lot of additional instructions before their cost
> outweighs that of a cache miss.

Very true, but given the size of L1 caches these days, you also have a
lot more leeway than you did in, e.g., the '90s. A typical memcmp() on
short strings is unrolled by gcc by default (at least I'm fairly
certain it does that).

/Per
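
To make the memcmp() remark concrete, here is a minimal sketch of what
"unrolled" means for a short, fixed-length comparison. The function names
and the 4-byte length are hypothetical, chosen only for illustration.
Whether gcc actually expands a given memcmp() call inline depends on the
compiler version, the target, the optimization flags, and whether the
length is a compile-time constant, so treat the third variant as something
the compiler may do, not something it is guaranteed to do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare two 4-byte tags with a plain loop.
 * As written, every iteration carries a loop counter update and a
 * loop-back branch in addition to the byte comparison. */
static int tag_equal_loop(const unsigned char *a, const unsigned char *b)
{
    for (size_t i = 0; i < 4; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}

/* The same comparison unrolled by hand: no loop counter and no
 * loop-back branch, at the price of more instructions in the code. */
static int tag_equal_unrolled(const unsigned char *a, const unsigned char *b)
{
    return a[0] == b[0] && a[1] == b[1] && a[2] == b[2] && a[3] == b[3];
}

/* With a constant length, the compiler is free to replace the call
 * with inline code (assumption: this depends on compiler and flags). */
static int tag_equal_memcmp(const void *a, const void *b)
{
    return memcmp(a, b, 4) == 0;
}
```

All three return 1 for equal tags and 0 otherwise. Comparing the output
of gcc -S for each variant at different -O levels is the simplest way to
see what the compiler really emits on a given machine.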