Re: [PATCH] tty: serial: qcom-geni-serial: minor fixes to get_clk_div_rate()

Doug Anderson <dianders@xxxxxxxxxxxx> · Wed, 8 Jun 2022 15:37:38 -0700

Hi,

On Wed, Jun 8, 2022 at 11:34 AM Vijaya Krishna Nivarthi
<quic_vnivarth@xxxxxxxxxxx> wrote:
>
> Hi,
>
>
> On 6/8/2022 12:55 AM, Doug Anderson wrote:
> > Hi,
> >
> > On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi
> > <quic_vnivarth@xxxxxxxxxxx> wrote:
> >> Hi,
> >>
> >> On 6/7/2022 1:29 AM, Doug Anderson wrote:
> >>
> >> My only concern continues to be...
> >>
> >> Given ser_clk is the final frequency that this function is going to
> >> return and best_div is going to be the clk_divider, is it ok if the
> >> divider cant divide the frequency exactly?
> >>
> >> In other words, Can this function output combinations like (402,4)
> >> (501,5) ?
> >>
> >> If ok, then we can go ahead with this patch or even previous perhaps.
> > I don't see why not. You're basically just getting a resulting clock
> > that's not an integral "Hz", right?
> >
> > So if "baud" is 9600 and sampling_rate is 16 then desired_clk is (9600
> > * 16) = 153600
> >
> > Let's imagine that we do all the math and we finally decide that our
> > best bet is with the rate 922000 and a divider of 6. That means that
> > the actual clock we'll make is 153666.67 when we _wanted_ 153600.
> > There's no reason it needs to be integral, though, and 153666.67 would
> > still be better than making 160000.
> >
> Thank you for clarification.
> >>> power?)
> >> Actually power saving was the anticipation behind returning first
> >> frequency in original patch, when we cant find exact frequency.
> > Right, except that if you just pick the first clock you find it would
> > be _wildly_ off. I guess if you really want to do this the right way,
> > you need to set a maximum tolerance and pick the first rate you find
> > that meets that tolerance. Random web search for "uart baud rate
> > tolerance" makes me believe that +/- 5% deviation is OK, but to be
> > safe you probably want something lower. Maybe 2%? So if the desired
> > clock is within 2% of a clock you can make, can you just pick that
> > one?
> Ok, 2% seems good.
> >
> >>>> Please note that we go past cases when we have an divider that can
> >>>> exactly divide the frequency(105/1, 204/2, 303/3) and end up with one
> >>>> that doesn't.
> >>> Ah, good point. Luckily that's a 1-line fix, right?
> >> Apologies, I could not figure out how.
> > Ah, sorry. Not quite 1 line, but this (untested)
> >
> >
> > freq = clk_round_rate(clk, mult);
> >
> > if (freq % desired_clk == 0) {
> >   ser_clk = freq;
> >   best_div = freq / desired_clk;
> >   break;
> > }
> >
> > candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk));
> > candidate_freq = freq / candidate_div;
> > diff = abs((long)desired_clk - candidate_freq);
> > if (diff < best_diff) {
> >    best_diff = diff;
> >    ser_clk = freq;
> >    best_div = candidate_div;
> > }
>
> But then once again, we would likely need 2 loops because while we are
> ok with giving up on search for best_div on finding something within 2%
> tolerance, we may not want to give up on exact match (freq % desired_clk
> == 0 )

Ah, it took me a while to understand why two loops. It's because in
one case you're trying multiplies and in the other you're bumping up
to the next closest clock rate. I don't think you really need to do
that. Just test the (rate - 2%) and the rate. How about this (only
lightly tested):

    ser_clk = 0;
    maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT;
    div = 1;
    while (div < maxdiv) {
        mult = (unsigned long long)div * desired_clk;
        if (mult != (unsigned long)mult)
            break;

        two_percent = mult / 50;

        /*
         * Loop requesting (freq - 2%) and possibly (freq).
         *
         * We'll keep track of the lowest freq inexact match we found
         * but always try to find a perfect match. NOTE: this algorithm
         * could miss a slightly better freq if there's more than one
         * freq between (freq - 2%) and (freq) but (freq) can't be made
         * exactly, but that's OK.
         *
         * This absolutely relies on the fact that the Qualcomm clock
         * driver always rounds up.
         */
        test_freq = mult - two_percent;
        while (test_freq <= mult) {
            freq = clk_round_rate(clk, test_freq);

            /*
             * A dead-on freq is an insta-win. This implicitly
             * handles when "freq == mult"
             */
            if (!(freq % desired_clk)) {
                *clk_div = freq / desired_clk;
                return freq;
            }

            /*
             * Only time clock framework doesn't round up is if
             * we're past the max clock rate. We're done searching
             * if that's the case.
             */
            if (freq < test_freq)
                return ser_clk;

            /* Save the first (lowest freq) within 2% */
            if (!ser_clk && freq <= mult + two_percent) {
                ser_clk = freq;
                *clk_div = div;
            }

            /*
             * If we already rounded up past mult then this will
             * cause the loop to exit. If not then this will run
             * the loop a second time with exactly mult.
             */
            test_freq = max(freq + 1, mult);
        }

        /*
         * test_freq will always be bigger than mult by at least 1.
         * That means we can get the next divider with a DIV_ROUND_UP.
         * This has the advantage of skipping by a whole bunch of divs
         * If the clock framework already bypassed them.
         */
        div = DIV_ROUND_UP(test_freq, desired_clk);
        }

    return ser_clk;