RE: [PATCH 0/2] clocksource/Hyper-V: Add Hyper-V specific sched clock function

Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> writes:

> Michael Kelley <mikelley@xxxxxxxxxxxxx> writes:
>
>> I talked to KY Srinivasan for any history about TSC page on 32-bit.  He said
>> there was no technical reason not to implement it, but our focus was always
>> 64-bit Linux, so the 32-bit was much less important.  Also, on 32-bit Linux,
>> the required 64x64 multiply and shift is more complex and takes
>> more cycles (compare the 32-bit implementation of mul_u64_u64_shr vs.
>> the 64-bit implementation), so the win over a MSR read is less.  I
>> don't know of any actual measurements being made to compare vs.
>> MSR read.
>
> VMExit is 1000 CPU cycles or so, I would guess that TSC page
> calculations are better. Let me try to build 32bit kernel and do some
> quick measurements.

So I tried and the difference is HUGE.

For in-kernel clocksource reads (like sched_clock()), the testing code
was:

        before = rdtsc_ordered();
        for (i = 0; i < 1000; i++)
                (void)read_hv_sched_clock_msr();
        after = rdtsc_ordered();
        printk("MSR based clocksource: %u cycles\n", ((u32)(after - before))/1000);

        before = rdtsc_ordered();
        for (i = 0; i < 1000; i++)
                (void)read_hv_sched_clock_tsc();
        after = rdtsc_ordered();
        printk("TSC page clocksource: %u cycles\n", ((u32)(after - before))/1000);

The result (WS2016) is:
[    1.101910] MSR based clocksource: 3361 cycles
[    1.105224] TSC page clocksource: 49 cycles
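
For context on why the TSC page read is so cheap even on 32-bit: per the
Hyper-V TLFS the reference time is ((tsc * scale) >> 64) + offset, so the
only non-trivial step is the high half of a 64x64 multiply. Below is a
minimal userspace sketch of doing that with 32-bit partial products. This
is NOT the kernel's mul_u64_u64_shr(); the names mul_u64_u64_hi64() and
tsc_page_to_ref_time() are made up for illustration, and the sequence
check a real TSC page reader needs is left out.

```c
#include <stdint.h>

/*
 * Hypothetical helper: upper 64 bits of the 128-bit product a * b,
 * built only from 32x32->64 multiplies (what a 32-bit CPU has to do).
 */
static uint64_t mul_u64_u64_hi64(uint64_t a, uint64_t b)
{
	uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
	uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

	uint64_t lo_lo = (uint64_t)a_lo * b_lo;
	uint64_t hi_lo = (uint64_t)a_hi * b_lo;
	uint64_t lo_hi = (uint64_t)a_lo * b_hi;
	uint64_t hi_hi = (uint64_t)a_hi * b_hi;

	/* Carry out of bits 32..63 of the full product; cannot overflow u64. */
	uint64_t mid = (lo_lo >> 32) + (uint32_t)hi_lo + (uint32_t)lo_hi;

	return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
}

/*
 * Hypothetical helper: reference time from a raw TSC value, given the
 * scale and offset published in the TSC page.
 */
static uint64_t tsc_page_to_ref_time(uint64_t tsc, uint64_t scale,
				     uint64_t offset)
{
	return mul_u64_u64_hi64(tsc, scale) + offset;
}
```

With scale = 1ULL << 63 (i.e. multiply by 0.5), a TSC value of 1000 and
an offset of 7, tsc_page_to_ref_time() returns 507. Four multiplies and a
few adds are still tiny next to the VMExit the MSR read takes.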

For userspace reads the absolute difference is even bigger, as the TSC
page gives us a functional vDSO:

Testing code:
	before = rdtsc();
	for (i = 0; i < COUNT; i++)
		clock_gettime(CLOCK_REALTIME, &tp);
	after = rdtsc();
	printf("%llu\n", (unsigned long long)(after - before) / COUNT);

Result:

TSC page:
# ./gettime_cycles 
131

MSR:
# ./gettime_cycles 
5664

Given all that, I see no reason not to enable the TSC page on 32-bit.
Even if the number of users is negligible, this will allow us to get rid
of the ugly #ifdef CONFIG_HYPERV_TSCPAGE in the code.

I'll send a patch for discussion.

-- 
Vitaly
