Btw., here's the cost analysis of cr2 reading and writing (in a tight
loop). I've executed cr2 read+write instructions 1 billion times on a
Nehalem box:

	static long cr2_test(void)
	{
		unsigned long tmp = 0;
		int i;

		for (i = 0; i < 1000000000; i++)
			asm("movq %0, %%cr2; movq %%cr2, %0" : : "r" (tmp));

		return 0;
	}

Which gave these overall stats:

 Performance counter stats for './prctl 0 0':

   28414.696319  task-clock-msecs         #      0.997 CPUs
              3  context-switches         #      0.000 M/sec
              1  CPU-migrations           #      0.000 M/sec
            149  page-faults              #      0.000 M/sec
    87254432334  cycles                   #   3070.750 M/sec
     5078691161  instructions             #      0.058 IPC
         304144  cache-references         #      0.011 M/sec
          28760  cache-misses             #      0.001 M/sec

   28.501962853  seconds time elapsed.

87254432334/1000000000 ~== 87, so we have a cost of 87 cycles per
iteration.

The annotated output shows:

 aldebaran:~> perf annotate sys_prctl | grep -A 2 cr2
    0.42 :        ffffffff81053131:       0f 22 d1        mov    %rcx,%cr2
   96.56 :        ffffffff81053134:       0f 20 d1        mov    %cr2,%rcx
    3.02 :        ffffffff81053137:       ff c0           inc    %eax
    0.00 :        ffffffff81053139:       39 d0           cmp    %edx,%eax

Samples skid to the instruction following the one that incurred the
cost, so the 96.56% shown on the cr2 read belongs to the preceding cr2
write, and the 3.02% shown on the inc belongs to the cr2 read. With
that skid taken into account, the read/write cost ratio is 3%:96.5%,
which suggests that the reading cost of cr2 is about 2-3 cycles and the
writing cost is about 85 cycles.

Which makes sense - reading cr2 is in the pagefault critical path, so
that's optimized. Writing it is allowed but not optimized at all.
(Especially in such a tight loop, where it could easily have some
additional back-to-back latency that would not be there in an NMI
handler save/restore path, which has other instructions in between.)

	Ingo