Hi Uwe, Thanks for your detailed review and comments. Please find my comments below. On Wed, Jan 19 2022, Uwe Kleine-König wrote: > On Tue, Dec 14, 2021 at 06:27:17PM +0200, Baruch Siach wrote: >> From: Baruch Siach <baruch.siach@xxxxxxxxx> >> >> Driver for the PWM block in Qualcomm IPQ6018 line of SoCs. Based on >> driver from downstream Codeaurora kernel tree. Removed support for older >> (V1) variants because I have no access to that hardware. >> >> Tested on IPQ6010 based hardware. >> >> Signed-off-by: Baruch Siach <baruch.siach@xxxxxxxxx> >> --- >> v10: >> >> Restore round up in pwm_div calculation; otherwise diff is always <= >> 0, so only bingo match works >> >> Don't overwrite min_diff on every loop iteration >> >> v9: >> >> Address comment from Uwe Kleine-König: >> >> Use period_ns*rate in dividers calculation for better accuracy >> >> Round down pre_div and pwm_div >> >> Add a comment explaining why pwm_div can't underflow >> >> Add a comment explaining why pre_div > pwm_div end the search loop >> >> Drop 'CFG_' from register macros >> >> Rename to_ipq_pwm_chip() to ipq_pwm_from_chip() >> >> Change bare 'unsigned' to 'unsigned int' >> >> Clarify the comment on separate REG1 write for enable/disable >> >> Round up the period value in .get_state >> >> Use direct readl/writel so no need to check for regmap errors >> >> v7: >> >> Change 'offset' to 'reg' for the tcsr offset (Rob) >> >> Drop clock name; there is only one clock (Bjorn) >> >> Simplify probe failure code path (Bjorn) >> >> v6: >> >> Address Uwe Kleine-König review comments: >> >> Drop IPQ_PWM_MAX_DEVICES >> >> Rely on assigned-clock-rates; drop IPQ_PWM_CLK_SRC_FREQ >> >> Simplify register offset calculation >> >> Calculate duty cycle more precisely >> >> Refuse to set inverted polarity >> >> Drop redundant IPQ_PWM_REG1_ENABLE bit clear >> >> Remove x1000 factor in pwm_div calculation, use rate directly, and round up >> >> Choose initial pre_div such that pwm_div < IPQ_PWM_MAX_DIV >> >> Ensure pre_div <= pwm_div >> >> Rename close_ to best_ >> >> Explain in comment why effective_div doesn't overflow >> >> Limit pwm_div to IPQ_PWM_MAX_DIV - 1 to allow 100% duty cycle >> >> Disable clock only after pwmchip_remove() >> >> const pwm_ops >> >> Other changes: >> >> Add missing linux/bitfield.h header include (kernel test robot) >> >> Adjust code for PWM device node under TCSR (Rob Herring) >> >> v5: >> >> Use &tcsr_q6 syscon to access registers (Bjorn Andersson) >> >> Address Uwe Kleine-König review comments: >> >> Implement .get_state() >> >> Add IPQ_PWM_ prefix to local macros >> >> Use GENMASK/BIT/FIELD_PREP for register fields access >> >> Make type of config_div_and_duty() parameters consistent >> >> Derive IPQ_PWM_MIN_PERIOD_NS from IPQ_PWM_CLK_SRC_FREQ >> >> Integrate enable/disable into config_div_and_duty() to save register read, >> and reduce frequency glitch on update >> >> Use min() instead of min_t() >> >> Fix comment format >> >> Use dev_err_probe() to indicate probe step failure >> >> Add missing clk_disable_unprepare() in .remove >> >> Don't set .owner >> >> v4: >> >> Use div64_u64() to fix link for 32-bit targets ((kernel test robot >> <lkp@xxxxxxxxx>, Uwe Kleine-König) >> >> v3: >> >> s/qcom,pwm-ipq6018/qcom,ipq6018-pwm/ (Rob Herring) >> >> Fix integer overflow on 32-bit targets (kernel test robot <lkp@xxxxxxxxx>) >> >> v2: >> >> Address Uwe Kleine-König review comments: >> >> Fix period calculation when out of range >> >> Don't set period larger than requested >> >> Remove PWM disable on configuration change >> >> Implement .apply instead of non-atomic .config/.enable/.disable >> >> Don't modify PWM on .request/.free >> >> Check pwm_div underflow >> >> Fix various code and comment formatting issues >> >> Other changes: >> >> Use u64 divisor safe division >> >> Remove now empty .request/.free >> --- >> drivers/pwm/Kconfig | 12 ++ >> drivers/pwm/Makefile | 1 + >> drivers/pwm/pwm-ipq.c | 275 ++++++++++++++++++++++++++++++++++++++++++ >> 3 files changed, 288 insertions(+) >> create mode 100644 drivers/pwm/pwm-ipq.c >> >> diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig >> index 21e3b05a5153..e39718137ecd 100644 >> --- a/drivers/pwm/Kconfig >> +++ b/drivers/pwm/Kconfig >> @@ -260,6 +260,18 @@ config PWM_INTEL_LGM >> To compile this driver as a module, choose M here: the module >> will be called pwm-intel-lgm. >> >> +config PWM_IPQ >> + tristate "IPQ PWM support" >> + depends on ARCH_QCOM || COMPILE_TEST >> + depends on HAVE_CLK && HAS_IOMEM >> + help >> + Generic PWM framework driver for IPQ PWM block which supports >> + 4 pwm channels. Each of the these channels can be configured >> + independent of each other. >> + >> + To compile this driver as a module, choose M here: the module >> + will be called pwm-ipq. >> + >> config PWM_IQS620A >> tristate "Azoteq IQS620A PWM support" >> depends on MFD_IQS62X || COMPILE_TEST >> diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile >> index 708840b7fba8..7402feae4b36 100644 >> --- a/drivers/pwm/Makefile >> +++ b/drivers/pwm/Makefile >> @@ -22,6 +22,7 @@ obj-$(CONFIG_PWM_IMX1) += pwm-imx1.o >> obj-$(CONFIG_PWM_IMX27) += pwm-imx27.o >> obj-$(CONFIG_PWM_IMX_TPM) += pwm-imx-tpm.o >> obj-$(CONFIG_PWM_INTEL_LGM) += pwm-intel-lgm.o >> +obj-$(CONFIG_PWM_IPQ) += pwm-ipq.o >> obj-$(CONFIG_PWM_IQS620A) += pwm-iqs620a.o >> obj-$(CONFIG_PWM_JZ4740) += pwm-jz4740.o >> obj-$(CONFIG_PWM_KEEMBAY) += pwm-keembay.o >> diff --git a/drivers/pwm/pwm-ipq.c b/drivers/pwm/pwm-ipq.c >> new file mode 100644 >> index 000000000000..3764010808f0 >> --- /dev/null >> +++ b/drivers/pwm/pwm-ipq.c >> @@ -0,0 +1,275 @@ >> +// SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0 >> +/* >> + * Copyright (c) 2016-2017, 2020 The Linux Foundation. All rights reserved. >> + */ >> + >> +#include <linux/module.h> >> +#include <linux/platform_device.h> >> +#include <linux/pwm.h> >> +#include <linux/clk.h> >> +#include <linux/io.h> >> +#include <linux/of.h> >> +#include <linux/math64.h> >> +#include <linux/of_device.h> >> +#include <linux/bitfield.h> >> + >> +/* The frequency range supported is 1 Hz to clock rate */ >> +#define IPQ_PWM_MAX_PERIOD_NS ((u64)NSEC_PER_SEC) >> + >> +/* >> + * The max value specified for each field is based on the number of bits >> + * in the pwm control register for that field >> + */ >> +#define IPQ_PWM_MAX_DIV 0xFFFF >> + >> +/* >> + * Two 32-bit registers for each PWM: REG0, and REG1. >> + * Base offset for PWM #i is at 8 * #i. >> + */ >> +#define IPQ_PWM_REG0 0 /*PWM_DIV PWM_HI*/ >> +#define IPQ_PWM_REG0_PWM_DIV GENMASK(15, 0) >> +#define IPQ_PWM_REG0_HI_DURATION GENMASK(31, 16) > > PWM_HI in the comment of IPQ_PWM_REG0 vs. HI_DURATION? Should this > match? I'd say the comment is redundant. > >> +#define IPQ_PWM_REG1 4 /*ENABLE UPDATE PWM_PRE_DIV*/ >> +#define IPQ_PWM_REG1_PRE_DIV GENMASK(15, 0) >> +/* >> + * Enable bit is set to enable output toggling in pwm device. >> + * Update bit is set to reflect the changed divider and high duration >> + * values in register. >> + */ >> +#define IPQ_PWM_REG1_UPDATE BIT(30) >> +#define IPQ_PWM_REG1_ENABLE BIT(31) >> + >> + >> +struct ipq_pwm_chip { >> + struct pwm_chip chip; >> + struct clk *clk; >> + void __iomem *mem; >> +}; >> + >> +static struct ipq_pwm_chip *ipq_pwm_from_chip(struct pwm_chip *chip) >> +{ >> + return container_of(chip, struct ipq_pwm_chip, chip); >> +} >> + >> +static unsigned int ipq_pwm_reg_read(struct pwm_device *pwm, unsigned int reg) >> +{ >> + struct ipq_pwm_chip *ipq_chip = ipq_pwm_from_chip(pwm->chip); >> + unsigned int off = 8 * pwm->hwpwm + reg; >> + >> + return readl(ipq_chip->mem + off); >> +} >> + >> +static void ipq_pwm_reg_write(struct pwm_device *pwm, unsigned int reg, >> + unsigned int val) >> +{ >> + struct ipq_pwm_chip *ipq_chip = ipq_pwm_from_chip(pwm->chip); >> + unsigned int off = 8 * pwm->hwpwm + reg; >> + >> + writel(val, ipq_chip->mem + off); >> +} >> + >> +static void config_div_and_duty(struct pwm_device *pwm, unsigned int pre_div, >> + unsigned int pwm_div, unsigned long rate, u64 duty_ns, >> + bool enable) >> +{ >> + unsigned long hi_dur; >> + unsigned long val = 0; >> + >> + /* >> + * high duration = pwm duty * (pwm div + 1) >> + * pwm duty = duty_ns / period_ns >> + */ >> + hi_dur = div64_u64(duty_ns * rate, (pre_div + 1) * NSEC_PER_SEC); >> + >> + val = FIELD_PREP(IPQ_PWM_REG0_HI_DURATION, hi_dur) | >> + FIELD_PREP(IPQ_PWM_REG0_PWM_DIV, pwm_div); >> + ipq_pwm_reg_write(pwm, IPQ_PWM_REG0, val); >> + >> + val = FIELD_PREP(IPQ_PWM_REG1_PRE_DIV, pre_div); >> + ipq_pwm_reg_write(pwm, IPQ_PWM_REG1, val); >> + >> + /* PWM enable toggle needs a separate write to REG1 */ >> + val |= IPQ_PWM_REG1_UPDATE; >> + if (enable) >> + val |= IPQ_PWM_REG1_ENABLE; >> + ipq_pwm_reg_write(pwm, IPQ_PWM_REG1, val); >> +} >> + >> +static int ipq_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm, >> + const struct pwm_state *state) >> +{ >> + struct ipq_pwm_chip *ipq_chip = ipq_pwm_from_chip(chip); >> + unsigned int pre_div, pwm_div, best_pre_div, best_pwm_div; >> + unsigned long rate = clk_get_rate(ipq_chip->clk); >> + u64 period_ns, duty_ns, period_rate; >> + u64 min_diff; >> + >> + if (state->polarity != PWM_POLARITY_NORMAL) >> + return -EINVAL; >> + >> + if (state->period < div64_u64(NSEC_PER_SEC, rate)) >> + return -ERANGE; > > NSEC_PER_SEC / rate is the smallest period you can achieve, right? > Consider rate = 33333 (Hz), then the minimal period is > 30000.30000300003 ns. So you should refuse a request to configure > state->period = 30000, but as div64_u64(1000000000, 33333) is 30000 you > don't. > >> + period_ns = min(state->period, IPQ_PWM_MAX_PERIOD_NS); >> + duty_ns = min(state->duty_cycle, period_ns); >> + >> + /* >> + * period_ns is 1G or less. As long as rate is less than 16 GHz this >> + * does not overflow. > > Well, rate cannot be bigger than 4294967295 because an unsigned long > cannot hold a bigger value. On 64-bit systems __SIZEOF_LONG__ is 8, which can hold more than 2^32. >> + */ >> + period_rate = period_ns * rate; >> + best_pre_div = IPQ_PWM_MAX_DIV; >> + best_pwm_div = IPQ_PWM_MAX_DIV; >> + /* Initial pre_div value such that pwm_div < IPQ_PWM_MAX_DIV */ Note this comment. >> + pre_div = div64_u64(period_rate, >> + (u64)NSEC_PER_SEC * (IPQ_PWM_MAX_DIV + 1)); > > Hmmm, we want > > (pre_div + 1) * (pwm_div + 1) * NSEC_PER_SEC > -------------------------------------------- <= period_ns > rate > > , right? Resolving that for pre_div this gives: > > period_ns * rate > pre_div <= ---------------------------- > NSEC_PER_SEC * (pwm_div + 1) > > The term on the right hand side is maximal for pwm_div == 0 so the > possible values for pre_div are > > 0 ... min(div64_u64(period_rate / NSEC_PER_SEC), IPQ_PWM_MAX_DIV) > > isn't it? I don't think so. pre_div == 0 will produce pwm_div larger than IPQ_PWM_MAX_DIV for a large period_rate value. The initial pre_div here is the smallest value that produces pwm_div within it limit. > If so, your algorithm is wrong as you're iterating over > > div64_u64(period_rate, NSEC_PER_SEC * (IPQ_PWM_MAX_DIV + 1)) ... IPQ_PWM_MAX_DIV The loop stops when pre_div > pwm_div. That should be before pre_div == IPQ_PWM_MAX_DIV because pwm_div <= IPQ_PWM_MAX_DIV. Should I put the pre_div > pwm_div condition directly in the for statement? >> + min_diff = period_rate; >> + >> + for (; pre_div <= IPQ_PWM_MAX_DIV; pre_div++) { >> + long long diff; >> + >> + pwm_div = DIV64_U64_ROUND_UP(period_rate, >> + (u64)NSEC_PER_SEC * (pre_div + 1)); >> + /* pwm_div is unsigned; the check below catches underflow */ >> + pwm_div--; > > What underflow? DIV64_U64_ROUND_UP returns > 0 assuming period_rate > 0. > So pwm_div - 1 doesn't underflow?! I'll update the comment. > The task here is to calculate the biggest pwm_div for a given pre_div > such that > > > (pre_div + 1) * (pwm_div + 1) * NSEC_PER_SEC > -------------------------------------------- <= period_ns > rate > > right? > > This is equivalent to: > > period_ns * rate > pre_div <= ---------------------------- - 1 > (pre_div + 1) * NSEC_PER_SEC > > As pre_div is integer, rounding down should be fine?! I can't follow. With round down (as in v8) the result is always: NSEC_PER_SEC * (pre_div + 1) * (pwm_div + 1) <= period_rate As a result, 'diff' calculation below will always produce diff <= 0. When there is no diff == 0 result (bingo) we get IPQ_PWM_MAX_DIV in both best_ values at the end of the loop. Do we actually need diff > 0 in the condition below? >> + /* >> + * pre_div and pwm_div values swap produces the same >> + * result. This loop goes over all pre_div <= pwm_div >> + * combinations. The rest are equivalent. >> + */ > > I'd write: > > /* > * Swapping values for pre_div and pwm_div produces the same > * period length. So we can skip all settings with pre_div < > * pwm_div which results in bigger constraints for selecting the > * duty_cycle than with the two values swapped. > */ I'll take your wording with inverted inequality sign. Thanks, baruch >> + if (pre_div > pwm_div) >> + break; >> + >> + /* >> + * Make sure we can do 100% duty cycle where >> + * hi_dur == pwm_div + 1 >> + */ >> + if (pwm_div > IPQ_PWM_MAX_DIV - 1) >> + continue; >> + >> + diff = ((uint64_t)NSEC_PER_SEC * (pre_div + 1) * (pwm_div + 1)) >> + - period_rate; >> + >> + if (diff < 0) /* period larger than requested */ >> + continue; > > This shouldn't happen if the above calculation is correct. > >> + if (diff == 0) { /* bingo */ >> + best_pre_div = pre_div; >> + best_pwm_div = pwm_div; >> + break; >> + } >> + if (diff < min_diff) { >> + min_diff = diff; >> + best_pre_div = pre_div; >> + best_pwm_div = pwm_div; >> + } > > This can be simplified as: > > if (diff < min_diff) { > best_pre_div = pre_div; > best_pwm_div = pwm_div; > min_diff = diff; > > if (min_diff == 0) > /* bingo! */ > break; > } > >> + } >> + >> + /* config divider values for the closest possible frequency */ >> + config_div_and_duty(pwm, best_pre_div, best_pwm_div, >> + rate, duty_ns, state->enabled); >> + >> + return 0; >> +} >> + >> +static void ipq_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm, >> + struct pwm_state *state) >> +{ >> + struct ipq_pwm_chip *ipq_chip = ipq_pwm_from_chip(chip); >> + unsigned long rate = clk_get_rate(ipq_chip->clk); >> + unsigned int pre_div, pwm_div, hi_dur; >> + u64 effective_div, hi_div; >> + u32 reg0, reg1; >> + >> + reg0 = ipq_pwm_reg_read(pwm, IPQ_PWM_REG0); >> + reg1 = ipq_pwm_reg_read(pwm, IPQ_PWM_REG1); >> + >> + state->polarity = PWM_POLARITY_NORMAL; >> + state->enabled = reg1 & IPQ_PWM_REG1_ENABLE; >> + >> + pwm_div = FIELD_GET(IPQ_PWM_REG0_PWM_DIV, reg0); >> + hi_dur = FIELD_GET(IPQ_PWM_REG0_HI_DURATION, reg0); >> + pre_div = FIELD_GET(IPQ_PWM_REG1_PRE_DIV, reg1); >> + >> + /* No overflow here, both pre_div and pwm_div <= 0xffff */ >> + effective_div = (u64)(pre_div + 1) * (pwm_div + 1); >> + state->period = DIV64_U64_ROUND_UP(effective_div * NSEC_PER_SEC, rate); >> + >> + hi_div = hi_dur * (pre_div + 1); >> + state->duty_cycle = div64_u64(hi_div * NSEC_PER_SEC, rate); > > This must be round up for the same reasons as for period. > >> +} > > Best regards > Uwe -- ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@xxxxxxxxxx - tel: +972.52.368.4656, http://www.tkos.co.il -