> -----Original Message----- > From: Zhang, Tianfei > Sent: Wednesday, March 15, 2023 10:59 AM > To: Marco Pagani <marpagan@xxxxxxxxxx> > Cc: ilpo.jarvinen@xxxxxxxxxxxxxxx; andriy.shevchenko@xxxxxxxxxxxxxxx; Weight, > Russell H <russell.h.weight@xxxxxxxxx>; matthew.gerlach@xxxxxxxxxxxxxxx; pierre- > louis.bossart@xxxxxxxxxxxxxxx; Gomes, Vinicius <vinicius.gomes@xxxxxxxxx>; > Khadatare, RaghavendraX Anand <raghavendrax.anand.khadatare@xxxxxxxxx>; > netdev@xxxxxxxxxxxxxxx; linux-fpga@xxxxxxxxxxxxxxx; richardcochran@xxxxxxxxx > Subject: RE: [PATCH v1] ptp: add ToD device driver for Intel FPGA cards > > > > > -----Original Message----- > > From: Marco Pagani <marpagan@xxxxxxxxxx> > > Sent: Tuesday, March 14, 2023 1:50 AM > > To: Zhang, Tianfei <tianfei.zhang@xxxxxxxxx> > > Cc: ilpo.jarvinen@xxxxxxxxxxxxxxx; andriy.shevchenko@xxxxxxxxxxxxxxx; > > Weight, Russell H <russell.h.weight@xxxxxxxxx>; > > matthew.gerlach@xxxxxxxxxxxxxxx; pierre- > > louis.bossart@xxxxxxxxxxxxxxx; Gomes, Vinicius > > <vinicius.gomes@xxxxxxxxx>; Khadatare, RaghavendraX Anand > > <raghavendrax.anand.khadatare@xxxxxxxxx>; > > netdev@xxxxxxxxxxxxxxx; linux-fpga@xxxxxxxxxxxxxxx; > > richardcochran@xxxxxxxxx > > Subject: Re: [PATCH v1] ptp: add ToD device driver for Intel FPGA > > cards > > > > > > > > On 2023-03-13 04:02, Tianfei Zhang wrote: > > > Adding a DFL (Device Feature List) device driver of ToD device for > > > Intel FPGA cards. > > > > > > The Intel FPGA Time of Day(ToD) IP within the FPGA DFL bus is > > > exposed as PTP Hardware clock(PHC) device to the Linux PTP stack to > > > synchronize the system clock to its ToD information using phc2sys > > > utility of the Linux PTP stack. The DFL is a hardware List within > > > FPGA, which defines a linked list of feature headers within the > > > device MMIO space to provide an extensible way of adding subdevice features. > > > > > > Signed-off-by: Raghavendra Khadatare > > > <raghavendrax.anand.khadatare@xxxxxxxxx> > > > Signed-off-by: Tianfei Zhang <tianfei.zhang@xxxxxxxxx> > > > --- > > > MAINTAINERS | 7 + > > > drivers/ptp/Kconfig | 13 ++ > > > drivers/ptp/Makefile | 1 + > > > drivers/ptp/ptp_dfl_tod.c | 334 > > > ++++++++++++++++++++++++++++++++++++++ > > > 4 files changed, 355 insertions(+) > > > create mode 100644 drivers/ptp/ptp_dfl_tod.c > > > > > > diff --git a/MAINTAINERS b/MAINTAINERS index > > > ec57c42ed544..584979abbd92 100644 > > > --- a/MAINTAINERS > > > +++ b/MAINTAINERS > > > @@ -15623,6 +15623,13 @@ L: netdev@xxxxxxxxxxxxxxx > > > S: Maintained > > > F: drivers/ptp/ptp_ocp.c > > > > > > +INTEL PTP DFL ToD DRIVER > > > +M: Tianfei Zhang <tianfei.zhang@xxxxxxxxx> > > > +L: linux-fpga@xxxxxxxxxxxxxxx > > > +L: netdev@xxxxxxxxxxxxxxx > > > +S: Maintained > > > +F: drivers/ptp/ptp_dfl_tod.c > > > + > > > OPENCORES I2C BUS DRIVER > > > M: Peter Korsgaard <peter@xxxxxxxxxxxxx> > > > M: Andrew Lunn <andrew@xxxxxxx> > > > diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig index > > > fe4971b65c64..e0d6f136ee46 100644 > > > --- a/drivers/ptp/Kconfig > > > +++ b/drivers/ptp/Kconfig > > > @@ -186,4 +186,17 @@ config PTP_1588_CLOCK_OCP > > > > > > More information is available at http://www.timingcard.com/ > > > > > > +config PTP_DFL_TOD > > > + tristate "FPGA DFL ToD Driver" > > > + depends on FPGA_DFL > > > + help > > > + The DFL (Device Feature List) device driver for the Intel ToD > > > + (Time-of-Day) device in FPGA card. The ToD IP within the FPGA > > > + is exposed as PTP Hardware Clock (PHC) device to the Linux PTP > > > + stack to synchronize the system clock to its ToD information > > > + using phc2sys utility of the Linux PTP stack. > > > + > > > + To compile this driver as a module, choose M here: the module > > > + will be called ptp_dfl_tod. > > > + > > > endmenu > > > diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile index > > > 28a6fe342d3e..553f18bf3c83 100644 > > > --- a/drivers/ptp/Makefile > > > +++ b/drivers/ptp/Makefile > > > @@ -18,3 +18,4 @@ obj-$(CONFIG_PTP_1588_CLOCK_IDTCM) += > > ptp_clockmatrix.o > > > obj-$(CONFIG_PTP_1588_CLOCK_IDT82P33) += ptp_idt82p33.o > > > obj-$(CONFIG_PTP_1588_CLOCK_VMW) += ptp_vmw.o > > > obj-$(CONFIG_PTP_1588_CLOCK_OCP) += ptp_ocp.o > > > +obj-$(CONFIG_PTP_DFL_TOD) += ptp_dfl_tod.o > > > diff --git a/drivers/ptp/ptp_dfl_tod.c b/drivers/ptp/ptp_dfl_tod.c > > > new file mode 100644 index 000000000000..d9fbdfc53bd8 > > > --- /dev/null > > > +++ b/drivers/ptp/ptp_dfl_tod.c > > > @@ -0,0 +1,334 @@ > > > +// SPDX-License-Identifier: GPL-2.0-only > > > +/* > > > + * DFL device driver for Time-of-Day (ToD) private feature > > > + * > > > + * Copyright (C) 2023 Intel Corporation */ > > > + > > > +#include <linux/bitfield.h> > > > +#include <linux/delay.h> > > > +#include <linux/dfl.h> > > > +#include <linux/gcd.h> > > > +#include <linux/iopoll.h> > > > +#include <linux/module.h> > > > +#include <linux/ptp_clock_kernel.h> #include <linux/spinlock.h> > > > +#include <linux/units.h> > > > + > > > +#define FME_FEATURE_ID_TOD 0x22 > > > + > > > +/* ToD clock register space. */ > > > +#define TOD_CLK_FREQ 0x038 > > > + > > > +/* > > > + * The read sequence of ToD timestamp registers: TOD_NANOSEC, > > > +TOD_SECONDSL and > > > + * TOD_SECONDSH, because there is a hardware snapshot whenever the > > > +TOD_NANOSEC > > > + * register is read. > > > + * > > > + * The ToD IP requires writing registers in the reverse order to the read > sequence. > > > + * The timestamp is corrected when the TOD_NANOSEC register is > > > +written, so the > > > + * sequence of write TOD registers: TOD_SECONDSH, TOD_SECONDSL and > > TOD_NANOSEC. > > > + */ > > > +#define TOD_SECONDSH 0x100 > > > +#define TOD_SECONDSL 0x104 > > > +#define TOD_NANOSEC 0x108 > > > +#define TOD_PERIOD 0x110 > > > +#define TOD_ADJUST_PERIOD 0x114 > > > +#define TOD_ADJUST_COUNT 0x118 > > > +#define TOD_DRIFT_ADJUST 0x11c > > > +#define TOD_DRIFT_ADJUST_RATE 0x120 > > > +#define PERIOD_FRAC_OFFSET 16 > > > +#define SECONDS_MSB GENMASK_ULL(47, 32) > > > +#define SECONDS_LSB GENMASK_ULL(31, 0) > > > +#define TOD_SECONDSH_SEC_MSB GENMASK_ULL(15, 0) > > > + > > > +#define CAL_SECONDS(m, l) ((FIELD_GET(TOD_SECONDSH_SEC_MSB, > > (m)) << 32) | (l)) > > > + > > > +#define TOD_PERIOD_MASK GENMASK_ULL(19, 0) > > > +#define TOD_PERIOD_MAX FIELD_MAX(TOD_PERIOD_MASK) > > > +#define TOD_PERIOD_MIN 0 > > > +#define TOD_DRIFT_ADJUST_MASK GENMASK_ULL(15, 0) > > > +#define TOD_DRIFT_ADJUST_FNS_MAX > > FIELD_MAX(TOD_DRIFT_ADJUST_MASK) > > > +#define TOD_DRIFT_ADJUST_RATE_MAX TOD_DRIFT_ADJUST_FNS_MAX > > > +#define TOD_ADJUST_COUNT_MASK GENMASK_ULL(19, 0) > > > +#define TOD_ADJUST_COUNT_MAX > > FIELD_MAX(TOD_ADJUST_COUNT_MASK) > > > +#define TOD_ADJUST_INTERVAL_US 1000 > > > +#define TOD_ADJUST_MS \ > > > + (((TOD_PERIOD_MAX >> 16) + 1) * (TOD_ADJUST_COUNT_MAX + 1)) > > > +#define TOD_ADJUST_MS_MAX (TOD_ADJUST_MS / MICRO) > > > +#define TOD_ADJUST_MAX_US (TOD_ADJUST_MS_MAX * > > USEC_PER_MSEC) > > > +#define TOD_MAX_ADJ (500 * MEGA) > > > + > > > +struct dfl_tod { > > > + struct ptp_clock_info ptp_clock_ops; > > > + struct device *dev; > > > + struct ptp_clock *ptp_clock; > > > + > > > + /* ToD Clock address space */ > > > + void __iomem *tod_ctrl; > > > + > > > + /* ToD clock registers protection */ > > > + spinlock_t tod_lock; > > > +}; > > > + > > > +/* > > > + * A fine ToD HW clock offset adjustment. To perform the fine > > > +offset adjustment, the > > > + * adjust_period and adjust_count argument are used to update the > > > +TOD_ADJUST_PERIOD > > > + * and TOD_ADJUST_COUNT register for in hardware. The dt->tod_lock > > > +spinlock must be > > > + * held when calling this function. > > > + */ > > > > Calling this function while holding the dt->tod_lock spinlock might > > cause an error since read_poll_timeout() can sleep. If it is required > > to use a spinlock, there is the > > readl_poll_timeout_atomic() function, which can be called from atomic context. > > However, in this case, it is probably better to use a mutex instead of > > a spinlock since the delay appears to be 1 ms. > > To program the TOD registers needs to be done without interruption to ensure the > correct values are programmed, So I think using readl_poll_timeout_atomic() > function here is better. This delay should be very faster, the 10 us interval delay will > be better. > > > > > > +static int fine_adjust_tod_clock(struct dfl_tod *dt, u32 adjust_period, > > > + u32 adjust_count) > > > +{ > > > + void __iomem *base = dt->tod_ctrl; > > > + u32 val; > > > + > > > + writel(adjust_period, base + TOD_ADJUST_PERIOD); > > > + writel(adjust_count, base + TOD_ADJUST_COUNT); > > > + > > > + /* Wait for present offset adjustment update to complete */ > > > + return readl_poll_timeout(base + TOD_ADJUST_COUNT, val, !val, > > TOD_ADJUST_INTERVAL_US, > > > + TOD_ADJUST_MAX_US); > > > +} > > > + > > > +/* > > > + * A coarse ToD HW clock offset adjustment. > > > + * The coarse time adjustment performs by adding or subtracting the > > > +delta value > > > + * from the current ToD HW clock time. > > > + */ > > > +static int coarse_adjust_tod_clock(struct dfl_tod *dt, s64 delta) { > > > + u32 seconds_msb, seconds_lsb, nanosec; > > > + void __iomem *base = dt->tod_ctrl; > > > + u64 seconds, now; > > > + > > > + if (delta == 0) > > > + return 0; > > > + > > > + nanosec = readl(base + TOD_NANOSEC); > > > + seconds_lsb = readl(base + TOD_SECONDSL); > > > + seconds_msb = readl(base + TOD_SECONDSH); > > > + > > > + /* Calculate new time */ > > > + seconds = CAL_SECONDS(seconds_msb, seconds_lsb); > > > + now = seconds * NSEC_PER_SEC + nanosec + delta; > > > + > > > + seconds = div_u64_rem(now, NSEC_PER_SEC, &nanosec); > > > + seconds_msb = FIELD_GET(SECONDS_MSB, seconds); > > > + seconds_lsb = FIELD_GET(SECONDS_LSB, seconds); > > > + > > > + writel(seconds_msb, base + TOD_SECONDSH); > > > + writel(seconds_lsb, base + TOD_SECONDSL); > > > + writel(nanosec, base + TOD_NANOSEC); > > > + > > > + return 0; > > > +} > > > + > > > +static int dfl_tod_adjust_fine(struct ptp_clock_info *ptp, long > > > +scaled_ppm) { > > > + struct dfl_tod *dt = container_of(ptp, struct dfl_tod, ptp_clock_ops); > > > + u32 tod_period, tod_rem, tod_drift_adjust_fns, tod_drift_adjust_rate; > > > + void __iomem *base = dt->tod_ctrl; > > > + unsigned long flags, rate; > > > + u64 ppb; > > > + > > > + /* Get the clock rate from clock frequency register offset */ > > > + rate = readl(base + TOD_CLK_FREQ); > > > + > > > + /* add GIGA as nominal ppb */ > > > + ppb = scaled_ppm_to_ppb(scaled_ppm) + GIGA; > > > + > > > + tod_period = div_u64_rem(ppb << PERIOD_FRAC_OFFSET, rate, &tod_rem); > > > + if (tod_period > TOD_PERIOD_MAX) > > > + return -ERANGE; > > > + > > > + /* > > > + * The drift of ToD adjusted periodically by adding a drift_adjust_fns > > > + * correction value every drift_adjust_rate count of clock cycles. > > > + */ > > > + tod_drift_adjust_fns = tod_rem / gcd(tod_rem, rate); > > > + tod_drift_adjust_rate = rate / gcd(tod_rem, rate); > > > + > > > + while ((tod_drift_adjust_fns > TOD_DRIFT_ADJUST_FNS_MAX) || > > > + (tod_drift_adjust_rate > TOD_DRIFT_ADJUST_RATE_MAX)) { > > > + tod_drift_adjust_fns >>= 1; > > > + tod_drift_adjust_rate >>= 1; > > > + } > > > > Why both tod_drift_adjust_fns and tod_drift_adjust_rate are > > iteratively divided if one of the two exceeds the maximum value? > > Wouldn't it be more accurate to set each of them to the max if they exceeded their > respective maximum value? > > Thanks your point out, calculate the tod_drift_adjust_fns and tod_drift_adjust_rate > separately is better. > Sorry, my last comment is not correct, I like to add more detail information about this code. This piece of code adjusts for the non-representable part of the fractional nanosecond part of the target ToD period that corresponds to the nominal or servo-steered frequency. For example, the clock period for a 10G application (156.25MHz) is 6.4ns. The hexadecimal representation of the integer part (6ns) is 0x6, and the fractional nanosecond part is 0.4 fns (fractional nanosecond), whose 16-bit hexadecimal representation is: Fractional nanosecond field is 16 bits wide: 0.4 fns = 0.4 * 2^16 = 26214.4 in decimal. Converting to hexadecimal: 26214 + 0.4 = 0x6666 + 0x0000.4 = 0x6666.4 fns. The 16bits register can represent 0x6666, but not 0.4fns. This would cause a systematic drift in the clock equal to 953.6 ns every 1 second. To compensate for that, the hardware ToD module uses two registers, namely TOD_DRIFT_ADJUST and TOD_DRIFT_ADJUST_RATE. The non-representable 0.4 fns per clock cycle equals to 2fns per 5 clock cycles, both of which are integer and representable. Hence is the TOD_DRIFT_ADJUST and TOD_DRIFT_ADJUST_RATE registers as follow: TOD_DRIFT_ADJUST = 0x02 TOD_DRIFT_ADJUST_RATE = 0x5. The while loop divides the higher-precision result of the divisions by 2 in sequential steps, until both tod_drift_adjust_fns and tod_drift_adjust_rate are representable within their hardware register precision and dynamic range. The while loop has a deterministic upper bound. Here is the ToD specification: https://www.intel.com/content/www/us/en/docs/programmable/683044/21-4/adjusting-tod-drift.html