On 6/6/23 01:23, David Laight wrote:
> IIRC the x86 performance counters aren't dependent on anything so they tend to execute much earlier than you want. OTOH rdtsc is likely to be synchronising and affect what follows. ISTR using rdtsc to wait for instructions to complete and then the performance clock counter to see how long it took.
RDPMC and RDTSC have the same (lack of) synchronization guarantees; you need to fence them appropriately for your application no matter what.
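
To illustrate what "fence them appropriately" can look like, here is a minimal sketch, assuming GCC or Clang on x86-64 and assuming LFENCE is an adequate ordering barrier for the measurement at hand (on some AMD parts that depends on LFENCE being configured as dispatch-serializing). The helper names fenced_rdtsc() and fenced_rdpmc() are made up for this example; the right fencing depends on what you are measuring.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc, __rdpmc, _mm_lfence (GCC/Clang) */

/* Hypothetical helper: read the TSC with a fence on each side. */
static inline uint64_t fenced_rdtsc(void)
{
    _mm_lfence();               /* earlier instructions complete before the read */
    uint64_t t = __rdtsc();
    _mm_lfence();               /* later instructions do not start before the read */
    return t;
}

/*
 * Hypothetical helper: same fencing pattern around RDPMC.
 * Executing RDPMC in user space faults unless the kernel has enabled it
 * (CR4.PCE), and the counter index must name a programmed counter.
 */
static inline uint64_t fenced_rdpmc(int counter)
{
    _mm_lfence();
    uint64_t c = __rdpmc(counter);
    _mm_lfence();
    return c;
}
```

The two helpers are deliberately identical apart from the read instruction, since neither instruction orders itself against the surrounding code.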
-hpa