Re: [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v4]

Jason Ekstrand <jason@xxxxxxxxxxxxxx> · Tue, 16 Oct 2018 14:44:44 -0500

On Tue, Oct 16, 2018 at 2:35 PM Keith Packard <keithp@xxxxxxxxxx> wrote:
Bas Nieuwenhuizen <bas@xxxxxxxxxxxxxxxxxxx> writes:

>> +       end = radv_clock_gettime(CLOCK_MONOTONIC_RAW);
>> +
>> +       uint64_t clock_period = end - begin;
>> +       uint64_t device_period = DIV_ROUND_UP(1000000, clock_crystal_freq);
>> +
>> +       *pMaxDeviation = MAX2(clock_period, device_period);
>
> Should this not be a sum? Those deviations can happen independently
> from each other, so worst case both deviations happen in the same
> direction which causes the magnitude to be combined.

This use of MAX2 comes right from one of the issues raised during work
on the extension:

 8) Can the maximum deviation reported ever be zero?

 RESOLVED: Unless the tick of each clock corresponding to the set of
 time domains coincides and all clocks can literally be sampled
 simultaneously, there isn’t really a possibility for the maximum
 deviation to be zero, so by convention the maximum deviation is always
 at least the maximum of the length of the ticks of the set of time
 domains calibrated and thus can never be zero.

I can't wrap my brain around this entirely, but I think that this says
that the deviation reported is supposed to only reflect the fact that we
aren't sampling the clocks at the same time, and so there may be a
'tick' of error for any sampled clock.

If you look at the previous issue in the spec, that actually has the
pseudo code I used in this implementation for computing maxDeviation
which doesn't include anything about the time period of the GPU.

Jason suggested using the GPU time period as the minimum value for
maxDeviation to make sure we never accidentally returned zero, as that
is forbidden by the spec. We might be able to use 1 instead, but it
won't matter in practice as the time it takes to actually sample all of
the clocks is far longer than a GPU tick.
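
As a rough illustration of the computation discussed above (not the patch itself), the sketch below derives the device tick length from a crystal frequency and uses it as a floor for the reported deviation. The helper names device_period_ns and max_deviation_floor are hypothetical, and the frequency is assumed to be in kHz so that 1000000 / freq yields nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division rounding up, in the spirit of the DIV_ROUND_UP macro
 * in the quoted patch. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/* Hypothetical helper: length of one device tick in nanoseconds.
 * Assumes the crystal frequency is given in kHz. */
static uint64_t device_period_ns(uint64_t clock_crystal_freq_khz)
{
    return div_round_up(1000000, clock_crystal_freq_khz);
}

/* Hypothetical helper: the time spent sampling (end - begin), but never
 * less than one device tick, so the result cannot be zero. */
static uint64_t max_deviation_floor(uint64_t begin_ns, uint64_t end_ns,
                                    uint64_t device_period)
{
    uint64_t clock_period = end_ns - begin_ns;
    return clock_period > device_period ? clock_period : device_period;
}
```

Under these assumptions, a 12 MHz crystal (12000 kHz) gives a tick that rounds up to 84 ns, and if begin and end happen to read the same value the result is still 84 rather than 0.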

I think what Bas is getting at is that there are two problems:

 1) We are not sampling at exactly the same time
 2) The two clocks may not tick at exactly the same time.

Even if I can simultaneously sample the CPU and GPU clocks, their oscillators are not aligned and my sample may be at the beginning of the CPU tick and the end of the GPU tick.  If I had sampled 75ns earlier, I could have gotten a lower CPU time but the same GPU time (most Intel GPUs have about an 80ns tick).

If we want to be conservative, I suspect Bas may be right that adding is the safer thing to do.
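
To make the difference between the two options concrete, here is a small sketch comparing them. The function names and the numbers in the note below are illustrative assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Option in the quoted patch: the larger of the sampling window and the
 * device tick. */
static uint64_t deviation_max(uint64_t sample_window_ns,
                              uint64_t device_tick_ns)
{
    return sample_window_ns > device_tick_ns ? sample_window_ns
                                             : device_tick_ns;
}

/* Conservative option: both error sources may point the same way, so
 * their magnitudes are added. */
static uint64_t deviation_sum(uint64_t sample_window_ns,
                              uint64_t device_tick_ns)
{
    /* Guard against wraparound; illustrative only. */
    assert(sample_window_ns <= UINT64_MAX - device_tick_ns);
    return sample_window_ns + device_tick_ns;
}
```

For a 200 ns sampling window and an 80 ns device tick, deviation_max returns 200 and deviation_sum returns 280. The sum is never smaller than the max, so it errs on the safe side.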

--Jason

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel