Re: [PATCH] serial: 8250: Tolerate clock variance for max baud rate

Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> · Tue, 03 Mar 2015 14:08:45 -0500

On 03/03/2015 05:11 AM, James Hogan wrote:
> Hi Peter,
> 
> On Mon, Mar 02, 2015 at 01:37:07PM -0500, Peter Hurley wrote:
>> [ +cc common clock maintainers ]
>>
>> On 02/26/2015 12:34 PM, James Hogan wrote:
>>> Hi Peter,
>>>
>>> On 26/02/15 16:54, Peter Hurley wrote:
>>>> On 02/26/2015 09:44 AM, James Hogan wrote:
>>>>> On 26/02/15 13:45, Peter Hurley wrote:
>>>>>> Hi James,
>>>>>>
>>>>>> On 02/26/2015 06:25 AM, James Hogan wrote:
>>>>>>> Hi Sebastian,
>>>>>>>
>>>>>>> On 26/02/15 08:51, Sebastian Andrzej Siewior wrote:
>>>>>>>> On 02/26/2015 01:21 AM, James Hogan wrote:
>>>>>>>>> When the UART clock is set slightly under 1.8432MHz, the 8250 driver
>>>>>>>>> core doesn't permit the 115200 baud rate since it calculates the maximum
>>>>>>>>> frequency to pass to uart_get_baud_rate by simply dividing the uart
>>>>>>>>> clock by 16 which yields a value slightly under 115200, even though the
>>>>>>>>> frequency is close enough for the UART to operate reliably.
>>>>>>>>>
>>>>>>>>> Therefore add some tolerance in the calculation of the maximum baud
>>>>>>>>> rate, specifically +1/128th (~0.8%), to allow the 115200 baud rate to
>>>>>>>>> be closen for a UART clock as low as 1.8149MHz. For an external divider
>>>>>>>>> set as close as possible to 1.8432MHz, this should cover every possible
>>>>>>>>> input frequency above 118.9MHz.
>>>>>>>>>
>>>>>>>>> Signed-off-by: James Hogan <james.hogan@xxxxxxxxxx>
>>>>>>>>> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>>>>>>>>> Cc: Jiri Slaby <jslaby@xxxxxxx>
>>>>>>>>> Cc: Anton Vorontsov <avorontsov@xxxxxxxxxxxxx>
>>>>>>>>> Cc: Peter Hurley <peter@xxxxxxxxxxxxxxxxxx>
>>>>>>>>> Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
>>>>>>>>> Cc: linux-serial@xxxxxxxxxxxxxxx
>>>>>>>>> ---
>>>>>>>>> As far as I can tell from reading the link below, this tolerance should
>>>>>>>>> be okay, and it definitely covers the range of input frequencies I
>>>>>>>>> expect for my particular hardware. I'm open to better ways of handling
>>>>>>>>> it though.
>>>>>>>>
>>>>>>>> What is your hardware?
>>>>>>>
>>>>>>> SoC is TZ1090 (metag based). UART clock is best derived from system
>>>>>>> clock which would normally be somewhere between 180MHz and 240MHz, using
>>>>>>> an integer divider.
>>>>>>>
>>>>>>>> (1814900 + 1814900 / 128) / 16
>>>>>>>>         114317
>>>>>>>> It is not 115200 but close enough to get you the 115200 baud rate. It
>>>>>>>> might make sense to increase the tolerance rather than increasing the
>>>>>>>> the clock but it is up to Peter / Alan to decide.
>>>>>>>
>>>>>>> One random example configuration I have here is:
>>>>>>>
>>>>>>> # cat /sys/kernel/debug/clk/clk_summary
>>>>>>>    clock                         enable_cnt  prepare_cnt        rate
>>>>>>> --------------------------------------------------------------------
>>>>>>>  xtal1                                    2            2    24576000
>>>>>>>     sys_sw                                1            1    24576000
>>>>>>>        sys_pll                            1            1   719965090
>>>>>>>           sys_div                         1            1   719965090
>>>>>>>              sys_x2_undeleted             1            1   719965090
>>>>>>>                 sys_undeleted             2            2   179991273
>>>>>>>                    uart_sw                1            1   179991273
>>>>>>>                       uart_en             1            1   179991273
>>>>>>>                          uart             1            1     1836646
>>>>>>>
>>>>>>> sys_undeleted  divide     uart    baud          error
>>>>>>>     179991273      97  1855580  115973   774 =  0.67%
>>>>>>>     179991273      98  1836646  114790  -410 = -0.36% (chosen)
>>>>>>>
>>>>>>> At this sys_undeleted frequency, the difference between two consecutive
>>>>>>> divider values is about 1183 Hz.
>>>>>>> For round to closest, the max error is about 591.5 Hz = 0.513%.
>>>>>>> For round up (i.e. increasing the clock), the max error is about 1183 Hz
>>>>>>> = 1.03%.
>>>>>>>
>>>>>>> So my reasoning is that rounding to closest and fixing 8250 driver to
>>>>>>> accept it and getting as accurate a baud rate as possible is better than
>>>>>>> having a less accurate baud rate and having to add a new rounding mode
>>>>>>> to common clock muxes and dividers.
>>>>>>>
>>>>>>> Note that AFAICT for faster input clocks the 8250 driver would accept a
>>>>>>> 115200 baud rate regardless of how inaccurate the resulting baud rate
>>>>>>> would be after division, so arguably the tolerance could be much higher.
>>>>>>>
>>>>>>> The serial core also calculates the divider with DIV_ROUND_CLOSEST, so
>>>>>>> there is precedent for using that rounding mode right up through the
>>>>>>> clock tree.
>>>>>>
>>>>>> Why clock the uart so low?
>>>>>
>>>>> I tried forcing uart = sys_undeleted / 1 a couple of years ago when
>>>>> switching to common clock & DT, relying instead on the divider internal
>>>>> to the UART, but it didn't work for everybody. It was presumed at the
>>>>> time that the UART input was effectively being overclocked between the
>>>>> divider and the uart so wasn't reliable at higher frequencies.
>>>>
>>>> 180 Mhz uart clock :)
>>>>
>>>>> The hardware manual lists Fmax/ideal as 25MHz, but I didn't want to set
>>>>> it to a specific value such as 25MHz as the combination of the 2
>>>>> dividers (the one in the uart block and the external one in the SoC)
>>>>> will reduce the precision that can be attained.
>>>>>
>>>>> So that really only leaves the option of aiming for the right uart
>>>>> frequency with the external divider and using an internal divide by 1.
>>>>
>>>> Dividing down the sys clock by 14 and the uart clock by 7 yields an
>>>> identical baud rate of 114790 (as does dividing the sys clock by 7
>>>> but that sets the uart clock to 25.713 Mhz which may be too high for
>>>> the IP block).
>>>>
>>>> Also, dividing sys clock by 49 and uart clock by 2 = 114790
>>>
>>> Sure, but it doesn't generalise very easily. The system/core/DDR clocks
>>> are statically configured by the bootloader depending on a whole bunch
>>> of board specifics (particularly for the DDR). I.e. it can be virtually
>>> anything (including prime number multiples of 115200*16). Linux then
>>> configures the rest of the clocks as it needs, using the common clock
>>> framework. This would require adding knowledge of max intermediate clock
>>> rates to common clock,
>>
>> The common clock framework will need to support this anyway, because a
>> 2-level clock tree can't possibly work with UART, ADC, GPU, etc. off a
>> single common clock without independent intermediate stages (maybe for
>> your limited application it does, but it doesn't generalize to other
>> designs).
>>
>>> and adding the UART internal divider into the decision making,
>>
>> Constrained dividers are already supported by many clock drivers.
>>
>>> i.e. the UART driver trying to set a bunch of rates,
>>> multiples of 115200*16 for each internal divider value, and seeing which
>>> gives the best baud rate, which would be even more complicated, and its
>>> all to appease an arguably nonsensical constraint in the driver.
>>
>> The true constraint you're trying to workaround is that the common clock
>> framework doesn't adequately support your use-case yet. But that's where
>> your requirement needs to be solved.
>>
>> ISTM that the clock api should provide helpers so the intermediate parent
>> clock can compute the best-fit clock rate given the constraints of
>> the input sys clock rate, the {min,max} intermediate clock rate range,
>> and the desired sample rate multiple of the clock consumer.
> 
> Even with those other constraints fully implemented, is it really a
> correct constraint that the uart input clock must be strictly >=
> 115200*16? A value under this may give the closest baud rate even taking
> all other constraints into account.

I spent some time last night experimenting with a simple brute-force
solver.

And while there are input clock rates for which the best-fit output clock is
< 115200 * 16 (for that target rate or its multiples), for the same input
clock rate there always seems to be faster multiples within some degree
of error, although not as low as you initially proposed.

For example, an input clock = 152785600 has a best-fit output clock =
1840790 (div=83, baud=115049, error=0.13%). That same input clock rate
has fastest multiples with div=82 and div=41 with a error=1.3%

However, the error limit you initially proposed (as you noted in the
commit log) is only suitable for input clock rates above 118Mhz; lower
clock rates would require larger error limits.

I plan to experiment further on the actual effects of a slow uart clock
and a suitable error limit, but any further information you can provide
wrt sample error might be helpful.

Regards,
Peter Hurley

>>> At the moment I just need this patch and the following added in the uart
>>> DT node, and it should basically work for all system clock rates I care
>>> about:
>>>
>>> 	assigned-clocks = <&top_clks CLK_TOP_UART>;
>>> 	assigned-clock-rates = <1843200>;
>>>
>>> Cheers
>>> James
>>>
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-serial" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html