Upcoming changes will measure the performance of instructions. Since
stck is designed to return unique values even on concurrent calls, it
is unsuited for performance measurements; stckf should be used instead.
Hence, add a wrapper for stckf to the time library.

While touching that code, also add a missing cc clobber in
get_clock_us() and avoid the memory clobber by moving the clock value
to the output operands.

Signed-off-by: Nico Boehr <nrb@xxxxxxxxxxxxx>
---
 lib/s390x/asm/time.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/lib/s390x/asm/time.h b/lib/s390x/asm/time.h
index 7652a151e87a..8d2327a40541 100644
--- a/lib/s390x/asm/time.h
+++ b/lib/s390x/asm/time.h
@@ -14,11 +14,20 @@
 #define STCK_SHIFT_US	(63 - 51)
 #define STCK_MAX	((1UL << 52) - 1)
 
+static inline uint64_t get_clock_fast(void)
+{
+	uint64_t clk;
+
+	asm volatile(" stckf %0 " : "=Q"(clk) : : "cc");
+
+	return clk;
+}
+
 static inline uint64_t get_clock_us(void)
 {
 	uint64_t clk;
 
-	asm volatile(" stck %0 " : : "Q"(clk) : "memory");
+	asm volatile(" stck %0 " : "=Q"(clk) : : "cc");
 
 	return clk >> STCK_SHIFT_US;
 }
-- 
2.36.1
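
For illustration only, here is a rough sketch of how a caller might turn
two raw get_clock_fast() readings into elapsed microseconds. The helper
tod_delta_to_us() is hypothetical and not part of this patch; it only
assumes that the raw value uses the same TOD format as stck, so the
existing STCK_SHIFT_US applies (TOD bit 51 ticks once per microsecond).

```c
#include <stdint.h>

/* Same definition as in lib/s390x/asm/time.h: TOD bit 51 == 1 us. */
#define STCK_SHIFT_US	(63 - 51)

/*
 * Hypothetical helper (not part of the patch): convert the difference
 * between two raw TOD readings into whole microseconds. Sub-microsecond
 * bits are truncated by the shift.
 */
static inline uint64_t tod_delta_to_us(uint64_t start, uint64_t end)
{
	return (end - start) >> STCK_SHIFT_US;
}

/*
 * Intended use on s390x (sketch, assumes get_clock_fast() from the patch):
 *
 *	uint64_t start = get_clock_fast();
 *	... instruction(s) under test ...
 *	uint64_t elapsed_us = tod_delta_to_us(start, get_clock_fast());
 */
```

Taking the difference before shifting keeps the sub-microsecond bits of
both readings in play until the final truncation.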