On Tue, Mar 12, 2024, at 13:15, Sagi Maimon wrote: > On Tue, Mar 12, 2024 at 1:19 PM Arnd Bergmann <arnd@xxxxxxxx> wrote: >> On Tue, Mar 12, 2024, at 10:50, Sagi Maimon wrote: >> > + kc_a = clockid_to_kclock(clock_a); >> > + if (!kc_a) { >> > + error = -EINVAL; >> > + return error; >> > + } >> > + >> > + kc_b = clockid_to_kclock(clock_b); >> > + if (!kc_b) { >> > + error = -EINVAL; >> > + return error; >> > + } >> >> I'm not sure if we really need to have it generic enough to >> support any combination of clocks here. It complicates the >> implementation a bit but it also generalizes the user space >> side of it. >> >> Can you think of cases where you want to compare against >> something other than CLOCK_MONOTONIC_RAW or CLOCK_REALTIME, >> or are these going to be the ones that you expect to >> be used anyway? >> > sure, one example is syncing two different PHCs (which was originally > why we needed this syscall) > I hope that I have understand your note and that answers your question. Right, that is clearly a sensible use case. I'm still trying to understand the implementation for the case where you have two different PHCs and both implement clock_get_crosstimespec(). Rather than averaging between two snapshots here, I would expect this to result in something like ktime_a1 += xtstamp_b.sys_monoraw - xtstamp_a1.sys_monoraw; in order get two device timestamps ktime_a1 and ktime_b that reflect the snapshots as if they were taken simulatenously. Am I missing some finer detail here, or is this something you should do? >> > + if (crosstime_support_a) { >> > + ktime_a1 = xtstamp_a1.device; >> > + ktime_a2 = xtstamp_a2.device; >> > + } else { >> > + ktime_a1 = timespec64_to_ktime(ts_a1); >> > + ktime_a2 = timespec64_to_ktime(ts_a2); >> > + } >> > + >> > + ktime_a = ktime_add(ktime_a1, ktime_a2); >> > + >> > + ts_offs = ktime_divns(ktime_a, 2); >> > + >> > + ts_a1 = ns_to_timespec64(ts_offs); >> >> Converting nanoseconds to timespec64 is rather expensive, >> so I wonder if this could be changed to something cheaper, >> either by returning nanoseconds in the end and consistently >> working on those, or by doing the calculation on the >> timespec64 itself. >> > I prefer returning timespec64, so this system call aligns with other > system calls like clock_gettime for example. > As far as doing the calculation on timespec64 itself, that looks more > expansive to me, but I might be wrong. In the general case, dividing a 64-bit variable by some other variable is really expensive and will take hundreds of cycles. This one is a bit cheaper because the division is done using a constant divider of NS_PER_SEC, which can get optimized fairly well on many systems by turning it into an equivalent 128-bit multiplication plus shift. For the case where you start out with a timespec64, I would expect it to be cheaper to calculate the nanosecond difference between ts_a1 and ts_a2 to add half of that to the timespec than to average two large 64-bit values and convert that back to a timespec afterwards. This should be fairly easy to try out if you can test a 32-bit kernel. We could decide that there is no need to care about anything bug 64-bit kernels here, in which case your current version should be just as good for both the crosstime_support_a and !crosstime_support_a cases. Arnd