Hi Uwe, > -----Original Message----- > From: Biju Das > Sent: Friday, February 9, 2024 1:39 PM > To: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx> > Cc: Thierry Reding <thierry.reding@xxxxxxxxx>; Philipp Zabel > <p.zabel@xxxxxxxxxxxxxx>; Geert Uytterhoeven <geert+renesas@xxxxxxxxx>; > Fabrizio Castro <fabrizio.castro.jz@xxxxxxxxxxx>; Magnus Damm > <magnus.damm@xxxxxxxxx>; linux-pwm@xxxxxxxxxxxxxxx; linux-renesas- > soc@xxxxxxxxxxxxxxx; Prabhakar Mahadev Lad <prabhakar.mahadev- > lad.rj@xxxxxxxxxxxxxx> > Subject: RE: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT > > Hi Uwe, > > Thanks for the feedback > > > -----Original Message----- > > From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx> > > Sent: Wednesday, December 13, 2023 11:40 AM > > Subject: Re: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT > > > > On Wed, Dec 13, 2023 at 09:06:56AM +0000, Biju Das wrote: > > > Hi Uwe, > > > > > > > -----Original Message----- > > > > From: Biju Das > > > > Sent: Friday, December 8, 2023 2:12 PM > > > > Subject: RE: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT > > > > > > > > Hi Uwe Kleine-König, > > > > > > > > > -----Original Message----- > > > > > From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx> > > > > > Sent: Friday, December 8, 2023 2:07 PM > > > > > Subject: Re: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT > > > > > > > > > > Hello Biju, > > > > > > > > > > On Fri, Dec 08, 2023 at 10:34:55AM +0000, Biju Das wrote: > > > > > > > -----Original Message----- > > > > > > > From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx> > > > > > > > Sent: Thursday, December 7, 2023 9:42 PM > > > > > > > Subject: Re: [PATCH v17 3/4] pwm: Add support for RZ/G2L GPT > > > > > > > > > > > > > > Hello Biju, > > > > > > > > > > > > > > On Thu, Dec 07, 2023 at 06:26:44PM +0000, Biju Das wrote: > > > > > > > > ######[ 304.213944] pwm-rzg2l-gpt 10048000.pwm: .apply is > > > > > > > > not idempotent (ena=1 pol=0 5500000000000/43980352512000) > > > > > > > > -> > > > > > > > > (ena=1 > > > > > > > > pol=0 > > > > > > > > 5500000000000/43980239923200) > > > > > > > > High setting## > > > > > > > > [ 304.230854] pwm-rzg2l-gpt 10048000.pwm: .apply is not > > > > > > > > idempotent > > > > > > > > (ena=1 pol=0 23980465100800/43980352512000) -> (ena=1 > > > > > > > > pol=0 > > > > > > > > 23980465100800/43980239923200) > > > > > > > > > > > > > > Have you tried to understand that? What is the clk rate when > > > > > > > this > > > > > happens? > > > > > > > You're not suggesting that mul_u64_u64_div_u64 is wrong, are > > you? > > > > > > > > > > > > mul_u64_u64_div_u64() works for certain values. But for very > > > > > > high values we are losing precision and is giving unexpected > > values. > > > > > > > > > > Can you reduce the problem to a bogus result of > > mul_u64_u64_div_u64()? > > > > > I'd be very surprised if the problem was mul_u64_u64_div_u64() > > > > > and not how it's used in your pwm driver. > > > > > > > > When I looked last time, it drops precision here[1]. I will > > > > recheck > > again. > > > > On RZ/G2L family devices, the PWM rate is 100MHz. > > > > > > > [1] > > > https://elixir.bootlin.com/linux/v6.7-rc4/source/lib/math/div64.c#L2 > > > 14 > > > > > > > > > Please find the bug details in mul_u64_u64_div_u64() compared to > > > mul_u64_u32_div() > > > > > > Theoretical calculation: > > > > > > Period = 43980465100800 nsec > > > Duty_cycle = 23980465100800 nsec > > > PWM rate = 100MHz > > > > > > period_cycles(tmp) = 43980465100800 * (100 * 10 ^ 6) / (10 ^ 9) = > > > 4398046510080 prescale = ((43980465100800 >> 32) >= 256) = 5 > > > period_cycles = min (round_up(4398046510080,( 1 << (2 * 5 )), > > > U32_MAX) = min (4295162607, U32_MAX) = U32_MAX = 0xFFFFFFFF > > > duty_cycles = min (2398046510080, ,( 1 << (2 * 5 )), U32_MAX) = min > > > (2341842295, > > > U32_MAX) = 0x8B95AD77 > > > > > > > > > with mul_u64_u64_div_u64 (ARM64): > > > [ 54.551612] ##### period_cycles_norm=43980465100800 > > > [ 54.305923] ##### period_cycles_tmp=4398035251080 ---> This is the > > bug. > > > > It took me a while to read from your mail that > > > > mul_u64_u64_div_u64(43980465100800, 100000000, 1000000000) > > > > yields 4398035251080 on your machine (which isn't the exact result). > > > > I came to the same conclusion, damn, I thought mul_u64_u64_div_u64() > > was exact. I wonder if it's worth to improve that. One fun fact is > > that while mul_u64_u64_div_u64(43980465100800, 100000000, 1000000000) > > yields > > 4398035251080 (which is off by 11259000), swapping the parameters (and > > thus using mul_u64_u64_div_u64(100000000, 43980465100800, 1000000000)) > > yields 4398046510080 which is the exact result. > > > > So this exact issue can be improved by: > > > > diff --git a/lib/math/div64.c b/lib/math/div64.c index > > 55a81782e271..9523c3cd37f7 100644 > > --- a/lib/math/div64.c > > +++ b/lib/math/div64.c > > @@ -188,6 +188,9 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 c) > > u64 res = 0, div, rem; > > int shift; > > > > + if (a > b) > > + return mul_u64_u64_div_u64(b, a, c); > > + > > /* can a * b overflow ? */ > > if (ilog2(a) + ilog2(b) > 62) { > > /* > > > > but the issue stays in principle. I'll think about that for a while. > > OK, I found a way to fix this issue > > static inline u64 rzg2l_gpt_mul_u64_u64_div_u64_roundup(u64 a, u64 b, u64 > c) { > u64 retval; > > if (a > b) > retval = mul_u64_u64_div_u64(b, a, c / 2); > else > retval = mul_u64_u64_div_u64(a, b, c / 2); > > return DIV64_U64_ROUND_UP(retval, 2); > } > > In my case divisor is multiple of 2 as it is clk frequency. > > a = 43980465100800, b= 100000000, c = 1000000000, expected result after > rounding up = 4398046510080 > > with using above api, > > 43980465100800 * 100000000 / 500000000 = 8796093020160. roundup > (8796093020160, 2) = 4398046510080 > > I am planning to send v18 with these changes. > > Please let me know if you have any comments. I found another way to avoid overflow and also we are not losing any precision. + * Rate is in MHz and is always integer for peripheral clk + * 2^32(val) * 2^10 (prescalar) * 10^9 > 2^64 + * 2^32(val) * 2^10 (prescalar) * 10^6 < 2^64 + * Multiply val with prescalar first, if the result is less than + * 2^34, then multiply by 10^9. Otherwise divide nr and dr by 10^3 + * so that it will never overflow. */ Here I can useDIV64_U64_ROUND_UP() instead. I will send v18 based on this. Cheers, Biju