RE: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT

Biju Das <biju.das.jz@xxxxxxxxxxxxxx> · Tue, 20 Feb 2024 16:04:35 +0000

Hi Uwe,

> -----Original Message-----
> From: Biju Das
> Sent: Friday, February 9, 2024 1:39 PM
> To: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
> Cc: Thierry Reding <thierry.reding@xxxxxxxxx>; Philipp Zabel
> <p.zabel@xxxxxxxxxxxxxx>; Geert Uytterhoeven <geert+renesas@xxxxxxxxx>;
> Fabrizio Castro <fabrizio.castro.jz@xxxxxxxxxxx>; Magnus Damm
> <magnus.damm@xxxxxxxxx>; linux-pwm@xxxxxxxxxxxxxxx; linux-renesas-
> soc@xxxxxxxxxxxxxxx; Prabhakar Mahadev Lad <prabhakar.mahadev-
> lad.rj@xxxxxxxxxxxxxx>
> Subject: RE: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT
> 
> Hi Uwe,
> 
> Thanks for the feedback
> 
> > -----Original Message-----
> > From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
> > Sent: Wednesday, December 13, 2023 11:40 AM
> > Subject: Re: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT
> >
> > On Wed, Dec 13, 2023 at 09:06:56AM +0000, Biju Das wrote:
> > > Hi Uwe,
> > >
> > > > -----Original Message-----
> > > > From: Biju Das
> > > > Sent: Friday, December 8, 2023 2:12 PM
> > > > Subject: RE: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT
> > > >
> > > > Hi Uwe Kleine-König,
> > > >
> > > > > -----Original Message-----
> > > > > From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
> > > > > Sent: Friday, December 8, 2023 2:07 PM
> > > > > Subject: Re: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT
> > > > >
> > > > > Hello Biju,
> > > > >
> > > > > On Fri, Dec 08, 2023 at 10:34:55AM +0000, Biju Das wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
> > > > > > > Sent: Thursday, December 7, 2023 9:42 PM
> > > > > > > Subject: Re: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT
> > > > > > >
> > > > > > > Hello Biju,
> > > > > > >
> > > > > > > On Thu, Dec 07, 2023 at 06:26:44PM +0000, Biju Das wrote:
> > > > > > > > ######[  304.213944] pwm-rzg2l-gpt 10048000.pwm: .apply is
> > > > > > > > not idempotent (ena=1 pol=0 5500000000000/43980352512000)
> > > > > > > > ->
> > > > > > > > (ena=1
> > > > > > > > pol=0
> > > > > > > > 5500000000000/43980239923200)
> > > > > > > > 	 High setting##
> > > > > > > > 	[  304.230854] pwm-rzg2l-gpt 10048000.pwm: .apply is not
> > > > > > > > idempotent
> > > > > > > > (ena=1 pol=0 23980465100800/43980352512000) -> (ena=1
> > > > > > > > pol=0
> > > > > > > > 23980465100800/43980239923200)
> > > > > > >
> > > > > > > Have you tried to understand that? What is the clk rate when
> > > > > > > this
> > > > > happens?
> > > > > > > You're not suggesting that mul_u64_u64_div_u64 is wrong, are
> > you?
> > > > > >
> > > > > > mul_u64_u64_div_u64() works for certain values. But for very
> > > > > > high values we are losing precision and is giving unexpected
> > values.
> > > > >
> > > > > Can you reduce the problem to a bogus result of
> > mul_u64_u64_div_u64()?
> > > > > I'd be very surprised if the problem was mul_u64_u64_div_u64()
> > > > > and not how it's used in your pwm driver.
> > > >
> > > > When I looked last time, it drops precision here[1]. I will
> > > > recheck
> > again.
> > > > On RZ/G2L family devices, the PWM rate is 100MHz.
> > > >
> > >  [1]
> > > https://elixir.bootlin.com/linux/v6.7-rc4/source/lib/math/div64.c#L2
> > > 14
> > >
> > >
> > > Please find the bug details in mul_u64_u64_div_u64() compared to
> > > mul_u64_u32_div()
> > >
> > > Theoretical calculation:
> > >
> > > Period = 43980465100800 nsec
> > > Duty_cycle = 23980465100800 nsec
> > > PWM rate = 100MHz
> > >
> > > period_cycles(tmp) = 43980465100800 * (100 * 10 ^ 6) / (10 ^ 9) =
> > > 4398046510080 prescale = ((43980465100800 >> 32) >= 256) = 5
> > > period_cycles = min (round_up(4398046510080,( 1 << (2 * 5 )),
> > > U32_MAX) = min (4295162607, U32_MAX) = U32_MAX = 0xFFFFFFFF
> > > duty_cycles = min (2398046510080, ,( 1 << (2 * 5 )), U32_MAX) =  min
> > > (2341842295,
> > > U32_MAX) = 0x8B95AD77
> > >
> > >
> > > with mul_u64_u64_div_u64 (ARM64):
> > > [   54.551612] ##### period_cycles_norm=43980465100800
> > > [   54.305923] ##### period_cycles_tmp=4398035251080 ---> This is the
> > bug.
> >
> > It took me a while to read from your mail that
> >
> > 	mul_u64_u64_div_u64(43980465100800, 100000000, 1000000000)
> >
> > yields 4398035251080 on your machine (which isn't the exact result).
> >
> > I came to the same conclusion, damn, I thought mul_u64_u64_div_u64()
> > was exact. I wonder if it's worth to improve that. One fun fact is
> > that while mul_u64_u64_div_u64(43980465100800, 100000000, 1000000000)
> > yields
> > 4398035251080 (which is off by 11259000), swapping the parameters (and
> > thus using mul_u64_u64_div_u64(100000000, 43980465100800, 1000000000))
> > yields 4398046510080 which is the exact result.
> >
> > So this exact issue can be improved by:
> >
> > diff --git a/lib/math/div64.c b/lib/math/div64.c index
> > 55a81782e271..9523c3cd37f7 100644
> > --- a/lib/math/div64.c
> > +++ b/lib/math/div64.c
> > @@ -188,6 +188,9 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
> >  	u64 res = 0, div, rem;
> >  	int shift;
> >
> > +	if (a > b)
> > +		return mul_u64_u64_div_u64(b, a, c);
> > +
> >  	/* can a * b overflow ? */
> >  	if (ilog2(a) + ilog2(b) > 62) {
> >  		/*
> >
> > but the issue stays in principle. I'll think about that for a while.
> 
> OK, I found a way to fix this issue
> 
> static inline u64 rzg2l_gpt_mul_u64_u64_div_u64_roundup(u64 a, u64 b, u64
> c) {
> 	u64 retval;
> 
> 	if (a > b)
> 		retval = mul_u64_u64_div_u64(b, a, c / 2);
> 	else
> 		retval = mul_u64_u64_div_u64(a, b, c / 2);
> 
> 	return DIV64_U64_ROUND_UP(retval, 2);
> }
> 
> In my case divisor is multiple of 2 as it is clk frequency.
> 
> a = 43980465100800, b= 100000000, c = 1000000000, expected result after
> rounding up = 4398046510080
> 
> with using above api,
> 
> 43980465100800 * 100000000 / 500000000 = 8796093020160. roundup
> (8796093020160, 2) = 4398046510080
> 
> I am planning to send v18 with these changes.
> 
> Please let me know if you have any comments.

I found another way to avoid overflow and also we are not losing any precision.

+	 * Rate is in MHz and is always integer for peripheral clk
+	 * 2^32(val) * 2^10 (prescalar) * 10^9 > 2^64
+	 * 2^32(val) * 2^10 (prescalar) * 10^6 < 2^64
+	 * Multiply val with prescalar first, if the result is less than
+	 * 2^34, then multiply by 10^9. Otherwise divide nr and dr by 10^3
+	 * so that it will never overflow.
 	 */

Here I can useDIV64_U64_ROUND_UP() instead. I will send v18 based on this.

Cheers,
Biju