Re: [PATCH v7] posix-timers: add clock_compare system call

Thomas Gleixner <tglx@xxxxxxxxxxxxx> · Thu, 14 Mar 2024 12:12:02 +0100

On Thu, Mar 14 2024 at 11:05, Sagi Maimon wrote:
> Some user space applications need to read a couple of different clocks.
> Each read requires moving from user space to kernel space.
> Reading each clock separately (syscall) introduces extra
> unpredictable/unmeasurable delay. Minimizing this delay contributes to user
> space actions on these clocks (e.g. synchronization etc).

I asked for a proper description of the actual problem several times now
and you still provide some handwaving blurb. Feel free to ignore me, but
then please don't be surprised if I ignore you too.

Also why does reading two random clocks make any sense at all? Your code
allows to read CLOCK_TAI and CLOCK_THREAD_CPUTIME_ID. What for?

This is about PTP, no?

Again you completely fail to explain why the existing PTP ioctls are not
sufficient for the purpose and why you need to provide a new interface
which is completely ill defined.

> arch/x86/entry/syscalls/syscall_64.tbl |   1 +
> drivers/ptp/ptp_clock.c                |  34 ++++--
> include/linux/posix-clock.h            |   2 +
> include/linux/syscalls.h               |   4 +
> include/uapi/asm-generic/unistd.h      |   5 +-
> kernel/time/posix-clock.c              |  25 +++++
> kernel/time/posix-timers.c             | 138 +++++++++++++++++++++++++
> kernel/time/posix-timers.h             |   2 +

This needs to be split up into:

     1) Infrastructure in posix-timers.c
     2) Wire up the syscall in x86
     3) Infrastructure in posix-clock.c
     4) Usage in ptp_clock.c

and not as a big lump of everything.

> +/**
> + * clock_compare - Get couple of clocks time stamps

So the name of the syscall suggest that it compares two clocks, but the
actual functionality is to read two clocks.

This does not make any sense at all. Naming matters.

> + * @clock_a:	clock a ID
> + * @clock_b:	clock b ID
> + * @tp_a:		Pointer to a user space timespec64 for clock a storage
> + * @tp_b:		Pointer to a user space timespec64 for clock b storage
> + *
> + * clock_compare gets time sample of two clocks.
> + * Supported clocks IDs: PHC, virtual PHC and various system clocks.
> + *
> + * In case of PHC that supports crosstimespec and the other clock is Monotonic raw
> + * or system time, crosstimespec will be used to synchronously capture
> + * system/device time stamp.
> + *
> + * In other cases: Read clock_a twice (before, and after reading clock_b) and
> + * average these times – to be as close as possible to the time we read clock_b.
> + *
> + * Returns:
> + *	0		Success. @tp_a and @tp_b contains the time stamps
> + *	-EINVAL		@clock a or b ID is not a valid clock ID
> + *	-EFAULT		Copying the time stamp to @tp_a or @tp_b faulted
> + *	-EOPNOTSUPP	Dynamic POSIX clock does not support crosstimespec()
> + **/
> +SYSCALL_DEFINE5(clock_compare, const clockid_t, clock_a, const clockid_t, clock_b,
> +		struct __kernel_timespec __user *, tp_a, struct __kernel_timespec __user *,
> +		tp_b, s64 __user *, offs_err)
> +{
> +	struct timespec64 ts_a, ts_a1, ts_b, ts_a2;
> +	struct system_device_crosststamp xtstamp_a1, xtstamp_a2, xtstamp_b;
> +	const struct k_clock *kc_a, *kc_b;
> +	ktime_t ktime_a;
> +	s64 ts_offs_err = 0;
> +	int error = 0;
> +	bool crosstime_support_a = false;
> +	bool crosstime_support_b = false;

Please read and follow the documentation provided at:

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html

> +	kc_a = clockid_to_kclock(clock_a);
> +	if (!kc_a) {
> +		error = -EINVAL;
> +		return error;

What's wrong about 'return -EINVAL;' ?

> +	}
> +
> +	kc_b = clockid_to_kclock(clock_b);
> +	if (!kc_b) {
> +		error = -EINVAL;
> +		return error;
> +	}
> +
> +	// In case crosstimespec supported and b clock is Monotonic raw or system
> +	// time, synchronously capture system/device time stamp

Please don't use C++ comments.

> +	if (clock_a < 0) {

This is just wrong. posix-clocks ar not the only ones which have a
negative clock id. See clockid_to_kclock()

> +		error = kc_a->clock_get_crosstimespec(clock_a, &xtstamp_a1);

What's a crosstimespec?

This function fills in a system_device_crosststamp, so why not name it
so that the purpose of the function is obvious?

ptp_clock::info::getcrosststamp() has a reasonable name. So why do you
need to come up with something which makes the code obfuscated?

> +		if (!error) {
> +			if (clock_b == CLOCK_MONOTONIC_RAW) {
> +				ts_b = ktime_to_timespec64(xtstamp_a1.sys_monoraw);
> +				ts_a1 = ktime_to_timespec64(xtstamp_a1.device);
> +				goto out;
> +			} else if (clock_b == CLOCK_REALTIME) {
> +				ts_b = ktime_to_timespec64(xtstamp_a1.sys_realtime);
> +				ts_a1 = ktime_to_timespec64(xtstamp_a1.device);
> +				goto out;
> +			} else {
> +				crosstime_support_a = true;

Right. If clock_b is anything else than CLOCK_MONOTONIC_RAW or
CLOCK_REALTIME then this is true.

> +			}
> +		}

So in case of an error, this just keeps going with an uninitialized
xtstamp_a1 and if the clock_b part succeeds it continues and operates on
garbage.

> +	}
> +
> +	// In case crosstimespec supported and a clock is Monotonic raw or system
> +	// time, synchronously capture system/device time stamp
> +	if (clock_b < 0) {
> +		// Synchronously capture system/device time stamp
> +		error = kc_b->clock_get_crosstimespec(clock_b, &xtstamp_b);
> +		if (!error) {
> +			if (clock_a == CLOCK_MONOTONIC_RAW) {
> +				ts_a1 = ktime_to_timespec64(xtstamp_b.sys_monoraw);
> +				ts_b = ktime_to_timespec64(xtstamp_b.device);
> +				goto out;
> +			} else if (clock_a == CLOCK_REALTIME) {
> +				ts_a1 = ktime_to_timespec64(xtstamp_b.sys_realtime);
> +				ts_b = ktime_to_timespec64(xtstamp_b.device);
> +				goto out;
> +			} else {
> +				crosstime_support_b = true;
> +			}
> +		}
> +	}
> +
> +	if (crosstime_support_a)
> +		error = kc_a->clock_get_crosstimespec(clock_a, &xtstamp_a1);

What? crosstime_support_a is only true when the exactly same call
returned success. Why does it need to be called here again?

> +	else
> +		error = kc_a->clock_get_timespec(clock_a, &ts_a1);
> +
> +	if (error)
> +		return error;
> +
> +	if (crosstime_support_b)
> +		error = kc_b->clock_get_crosstimespec(clock_b, &xtstamp_b);
> +	else
> +		error = kc_b->clock_get_timespec(clock_b, &ts_b);
> +
> +	if (error)
> +		return error;
> +
> +	if (crosstime_support_a)
> +		error = kc_a->clock_get_crosstimespec(clock_a, &xtstamp_a2);
> +	else
> +		error = kc_a->clock_get_timespec(clock_a, &ts_a2);
> +
> +	if (error)
> +		return error;

The logic and the code flow here are unreadable garbage and there are
zero comments what this is supposed to do.

> +	if (crosstime_support_a) {
> +		ktime_a = ktime_sub(xtstamp_a2.device, xtstamp_a1.device);
> +		ts_offs_err = ktime_divns(ktime_a, 2);
> +		ktime_a = ktime_add_ns(xtstamp_a1.device, (u64)ts_offs_err);
> +		ts_a1 = ktime_to_timespec64(ktime_a);

This is just wrong.

     read(a1);
     read(b);
     read(a2);

You _CANNOT_ assume that (a1 + ((a2 - a1) / 2) is anywhere close to the
point in time where 'b' is read. This code is preemtible and
interruptible. I explained this to you before.

Your explanation in the comment above the function is just wishful
thinking.

> + * In other cases: Read clock_a twice (before, and after reading clock_b) and
> + * average these times – to be as close as possible to the time we read clock_b.

Can you please sit down and provide a precise technical description of
the problem you are trying to solve and explain your proposed solution
at the conceptual level instead of throwing out random implementations
every few days?

Thanks,

        tglx