Re: [PATCH v3] posix-timers: add multi_clock_gettime system call

"Arnd Bergmann" <arnd@xxxxxxxx> · Tue, 02 Jan 2024 12:29:59 +0100

On Sun, Dec 31, 2023, at 17:00, Sagi Maimon wrote:
> On Fri, Dec 29, 2023 at 5:27 PM Arnd Bergmann <arnd@xxxxxxxx> wrote:

>> > +struct __ptp_multi_clock_get {
>> > +     unsigned int n_clocks; /* Desired number of clocks. */
>> > +     unsigned int n_samples; /* Desired number of measurements per clock. */
>> > +     clockid_t clkid_arr[MULTI_PTP_MAX_CLOCKS]; /* list of clock IDs */
>> > +     /*
>> > +      * Array of list of n_clocks clocks time samples n_samples times.
>> > +      */
>> > +     struct  __kernel_timespec ts[MULTI_PTP_MAX_SAMPLES][MULTI_PTP_MAX_CLOCKS];
>> > +};
>>
>> The fixed size arrays here seem to be an unnecessary limitation,
>> both MULTI_PTP_MAX_SAMPLES and MULTI_PTP_MAX_CLOCKS are small
>> enough that one can come up with scenarios where you would want
>> a higher number, but at the same time the structure is already
>> 808 bytes long, which is more than you'd normally want to put
>> on the kernel stack, and which may take a significant time to
>> copy to and from userspace.
>>
>> Since n_clocks and n_samples are always inputs to the syscall,
>> you can just pass them as register arguments and use a dynamically
>> sized array instead.
>>
> Both MULTI_PTP_MAX_SAMPLES and MULTI_PTP_MAX_CLOCKS are enough of any
> usage we can think of,
> But I think you are right, it is better to use a dynamically sized
> array for future use, plus to use less stack memory.
> On patch v4 a dynamically sized array will be used .
> I leaving both MULTI_PTP_MAX_SAMPLES and MULTI_PTP_MAX_CLOCKS but
> increasing their values, since there should be some limitation.

I think having an implementation specific limit in the kernel is
fine, but it would be nice to hardcode that limit in the API.

If both clkidarr[] and ts[] are passed as pointer arguments
in registers, they can be arbitrarily long in the API and
still have a documented maximum that we can extend in the
future without changing the interface.

>> It's not clear to me what you gain from having the n_samples
>> argument over just calling the syscall repeatedly. Does
>> this offer a benefit for accuracy or is this just meant to
>> avoid syscall overhead.
> It is mainly to avoid syscall overhead which also slightly
> improve the accuracy.

This is not a big deal as far as I'm concerned, but it
would be nice to back this up with some numbers if you
think it's worthwhile, as my impression is that the effect
is barely measurable: my guess would be that the syscall
overhead is always much less than the cost for the hardware
access.

>> On the other hand, this will still give less accuracy than the
>> getcrosststamp() callback with ioctl(PTP_SYS_OFFSET_PRECISE),
>> so either the last bit of accuracy isn't all that important,
>> or you need to refine the interface to actually be an
>> improvement over the chardev.
>>
> I don't understand this comment, please explain.
> The ioctl(PTP_SYS_OFFSET_PRECISE) is one specific case that can be
> done by multi_clock_gettime syscall (which cover many more cases)
> Plus the ioctl(PTP_SYS_OFFSET_PRECISE) works only on drivers that
> support this feature.

My point here is that on drivers that do support
PTP_SYS_OFFSET_PRECISE, the extra accuracy should be maintained
by the new interface, ideally in a way that does not have any
other downsides.

I think Andy's suggestion of exposing time offsets instead
of absolute times would actually achieve that: If the
interface is changed to return the offset against
CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW or CLOCK_BOOTTIME
(not sure what is best here), then the new syscall can use
getcrosststamp() where supported for the best results or
fall back to gettimex64() or gettime64() otherwise to
provide a consistent user interface.

Returning an offset would also allow easily calculating an
average over multiple calls in the kernel, instead of
returning a two-dimensional array.

    Arnd