On 21/02/2020 21:54, Dmitry Osipenko wrote:
> 21.02.2020 23:48, Daniel Lezcano wrote:
>> On 21/02/2020 21:21, Dmitry Osipenko wrote:
>>> 21.02.2020 23:02, Daniel Lezcano wrote:
>>
>> [ ... ]
>>
>>>>>>>>> +
>>>>>>>>> +	/*
>>>>>>>>> +	 * The primary CPU0 core shall wait for the secondaries
>>>>>>>>> +	 * shutdown in order to power-off CPU's cluster safely.
>>>>>>>>> +	 * The timeout value depends on the current CPU frequency,
>>>>>>>>> +	 * it takes about 40-150us in average and over 1000us in
>>>>>>>>> +	 * a worst case scenario.
>>>>>>>>> +	 */
>>>>>>>>> +	do {
>>>>>>>>> +		if (tegra_cpu_rail_off_ready())
>>>>>>>>> +			return 0;
>>>>>>>>> +
>>>>>>>>> +	} while (ktime_before(ktime_get(), timeout));
>>>>>>>>
>>>>>>>> So this loop will aggressively call tegra_cpu_rail_off_ready() and retry 3
>>>>>>>> times. The tegra_cpu_rail_off_ready() function can be called thousands of times
>>>>>>>> here but the function will hang 1.5s :/
>>>>>>>>
>>>>>>>> I suggest something like:
>>>>>>>>
>>>>>>>> 	while (retries-- && !tegra_cpu_rail_off_ready())
>>>>>>>> 		udelay(100);
>>>>>>>>
>>>>>>>> So <retries> calls to tegra_cpu_rail_off_ready() and 100us x <retries> maximum
>>>>>>>> impact.
>>>>>>> But udelay() also results in the CPU spinning in a busy-loop, and thus,
>>>>>>> what's the difference?
>>>>>>
>>>>>> Busy looping instead of register reads with all the hardware things involved behind.
>>>>>
>>>>> Please notice that this code runs only on an older Cortex-A9/A15, which
>>>>> doesn't support WFE for the delaying, and thus, the CPU always busy-loops
>>>>> inside udelay().
>>>>>
>>>>> What if I add cpu_relax() to the loop? Do you think it could
>>>>> have any positive effect?
>>>>
>>>> I think udelay() has a call to cpu_relax().
>>>
>>> Yes, my point is that udelay() doesn't bring much benefit for us here
>>> because:
>>>
>>> 1. we want to enter the power-gated state as quickly as possible and
>>> udelay() just adds an unnecessary delay
>>>
>>> 2. udelay() spins in a busy-loop until the delay has expired, just like
>>> we're already doing in this function
>>
>> In this case why not remove ktime_get() and increase the number of retries?
>
> Because the busy-loop performance depends on the CPU's frequency, so we
> can't rely on a bare number of retries.

Why not, if it is computed for the worst case scenario?

Anyway, I'll let you give it a try.

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
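For reference, the two polling strategies discussed in this thread can be compared side by side in a small userspace sketch. This is not the driver code: rail_off_ready_stub() is a hypothetical stand-in for tegra_cpu_rail_off_ready(), now_ns() stands in for ktime_get() and busy_delay_us() for udelay(), all built on POSIX clock_gettime(). The retry variant is also restructured slightly so that it returns a status.

```c
#define _POSIX_C_SOURCE 199309L

#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for tegra_cpu_rail_off_ready(): it reports
 * "ready" once it has been polled polls_until_ready times. */
static unsigned int polls_until_ready;
static unsigned int polls_done;

static bool rail_off_ready_stub(void)
{
	polls_done++;
	return polls_done >= polls_until_ready;
}

/* Userspace stand-in for ktime_get(): monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Userspace stand-in for udelay(): spins until the delay has elapsed. */
static void busy_delay_us(unsigned int us)
{
	uint64_t end = now_ns() + (uint64_t)us * 1000ull;

	while (now_ns() < end)
		;
}

/* Deadline-bounded polling, as in the patch: the wall-clock bound is
 * fixed, the number of polls depends on how fast the CPU spins. */
static int wait_deadline(unsigned int timeout_us)
{
	uint64_t deadline = now_ns() + (uint64_t)timeout_us * 1000ull;

	do {
		if (rail_off_ready_stub())
			return 0;
	} while (now_ns() < deadline);

	return -1;
}

/* Retry-bounded polling, as suggested in the review: at most <retries>
 * polls, with a fixed delay after each unsuccessful poll. */
static int wait_retries(unsigned int retries, unsigned int delay_us)
{
	while (retries--) {
		if (rail_off_ready_stub())
			return 0;
		busy_delay_us(delay_us);
	}

	return -1;
}
```

In this sketch wait_deadline() gives up after a fixed amount of time regardless of how many polls fit into it, while wait_retries() caps the number of polls and lets the total wait be roughly retries x delay_us. Which trade-off is preferable on the real hardware is exactly what the thread is debating, so the sketch takes no position on it.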