Hi, On Wed, Jun 12, 2024 at 11:56 PM Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> wrote: > > On Wed, 12 Jun 2024, Doug Anderson wrote: > > On Wed, Jun 12, 2024 at 12:38 AM Ilpo Järvinen > > <ilpo.jarvinen@xxxxxxxxxxxxxxx> wrote: > > > > > > On Mon, 10 Jun 2024, Douglas Anderson wrote: > > > > > > > The current uart_fifo_timeout() returns jiffies, which is not always > > > > the most convenient for callers. Add a variant uart_fifo_timeout_ms() > > > > that returns the timeout in milliseconds. > > > > > > > > NOTES: > > > > - msecs_to_jiffies() rounds up, unlike nsecs_to_jiffies(). This is > > > > because msecs_to_jiffies() is actually intended for device drivers > > > > to calculate timeout value. This means we don't need to take the max > > > > of the timeout and "1" since the timeout will always be > 0 ms (we > > > > add 20 ms of slop). > > > > - uart_fifo_timeout_ms() returns "unsigned int" but we leave > > > > uart_fifo_timeout() returning "unsigned long". This matches the > > > > types of msecs_to_jiffies(). > > > > > > > > Suggested-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> > > > > Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx> > > > > --- > > > > > > > > Changes in v4: > > > > - New > > > > > > > > include/linux/serial_core.h | 15 +++++++++++---- > > > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h > > > > index 8cb65f50e830..97968acfd564 100644 > > > > --- a/include/linux/serial_core.h > > > > +++ b/include/linux/serial_core.h > > > > @@ -889,14 +889,21 @@ unsigned int uart_get_divisor(struct uart_port *port, unsigned int baud); > > > > /* > > > > * Calculates FIFO drain time. > > > > */ > > > > -static inline unsigned long uart_fifo_timeout(struct uart_port *port) > > > > +static inline unsigned int uart_fifo_timeout_ms(struct uart_port *port) > > > > { > > > > u64 fifo_timeout = (u64)READ_ONCE(port->frame_time) * port->fifosize; > > > > + unsigned int fifo_timeout_ms = div_u64(fifo_timeout, NSEC_PER_MSEC); > > > > > > > > - /* Add .02 seconds of slop */ > > > > - fifo_timeout += 20 * NSEC_PER_MSEC; > > > > + /* > > > > + * Add .02 seconds of slop. This also helps account for the fact that > > > > + * when we converted from ns to ms that we didn't round up. > > > > + */ > > > > + return fifo_timeout_ms + 20; > > > > +} > > > > > > > > - return max(nsecs_to_jiffies(fifo_timeout), 1UL); > > > > +static inline unsigned long uart_fifo_timeout(struct uart_port *port) > > > > +{ > > > > + return msecs_to_jiffies(uart_fifo_timeout_ms(port)); > > > > } > > > > > > Hi, > > > > > > This is definitely towards the right direction! However, it now does > > > double conversion, first div_u64() and then msecs_to_jiffies(). Perhaps it > > > would be better to retain the nsecs version (maybe rename it to _ns for > > > consistency) and add _ms variant that does the nsec -> msec conversion. > > > > I spent a bit of time thinking about it and I don't agree. If you feel > > very strongly about it or someone else wants to jump in and break the > > tie then I can look again, but: > > > > 1. The comment before nsecs_to_jiffies() specifically states that it's > > not supposed to be used for this purpose. Specifically, it says: > > > > * Unlike {m,u}secs_to_jiffies, type of input is not unsigned int but u64. > > * And this doesn't return MAX_JIFFY_OFFSET since this function is designed > > * for scheduler, not for use in device drivers to calculate timeout value. > > > > ...so switching away from nsecs_to_jiffies() to msecs_to_jiffies() is > > arguably a "bugfix", or at least avoids using the API in a way that's > > against the documentation. > > Okay, I see. However, there's no way around using u64 here even with your > version that does not use nsecs_to_jiffies() because nsecs is the most > useful form of input when starting from frame_time, usecs is a bit > coarse-grained for higher data rates. Right. We have to start with u64 because the frame time is in ns and we can only fit ~4 seconds worth of ns in 32-bits. That seems iffy. > > 2. As mentioned in the commit message, nsecs_to_jiffies() truncates > > where msecs_to_jiffies() rounds up. Presumably this difference is > > related to the comment above where the "ns" version is intended for > > the scheduler. Using the "ms" version allows us to get rid of the > > extra call to "max()" which is a benefit. Technically since the > > timeout is at least 20 ms the minimum HZ is 100 I guess we didn't need > > the max anyway, but I guess someone thought it was cleaner and now we > > can definitely get rid of it. > > > > 3. These functions are inline anyway, so I don't think it's causing a > > huge bloat of instructions. In fact, moving from 64-bit math to 32-bit > > math sooner could make the code smaller. > > > > 4. I don't feel like it hurts the readability to convert down to ms > > and then to jiffies. In fact, IMO it helps since it makes it more > > obvious that we're working with ms. > > I'd be lying if I'd say I feel strongly about it Fair enough. If someone wants to throw in an opinion and tiebreak then they can. > but my only argument > involves doing an extra divide which is somewhat costly. It's a > plain 32-bit divide though so not as bad as the u64 one that is > unavoidable. We shouldn't be calling this in a loop anyway, so it's unlikely to matter. In any case, I'd note that with the old code we had: 1. 64-bit multiply (time * fifosize) 2. 64-bit addition (result + 20ms) 3. 64-bit => 32-bit division (to jiffies) 4. 32-bit comparison against the value 1. 5. Conditional setting of the value to 1. Now we have: 1. 64-bit multiply (time * fifosize) 2. 64-bit => 32-bit division (to ms) 3. 32-bit addition with a small immediate (20) 4. 32-bit addition (div round up) if HZ != 1000 5. 32-bit division (div round up) if HZ != 1000 I didn't try disassembling to see what the compiler did and it would be different for each compiler / ISA / optimization level / value of HZ, but I guess my point is that while we have one more divide (unless HZ == 1000) we may have one less conditional. We're also tending to do our math with small immediates which some ISAs can handle more efficiently. I think the real answer, though, is that this doesn't really matter and that we should pick the solution that's cleaner/easier to understand. I'm still in favor of the patch as it is. As I said, if folks feel really strongly then it doesn't matter and I can change it, but otherwise I'd rather keep it the way it is. -Doug