21.02.2020 23:48, Daniel Lezcano wrote:
> On 21/02/2020 21:21, Dmitry Osipenko wrote:
>> 21.02.2020 23:02, Daniel Lezcano wrote:
>
> [ ... ]
>
>>>>>>>> +
>>>>>>>> +	/*
>>>>>>>> +	 * The primary CPU0 core shall wait for the secondaries
>>>>>>>> +	 * shutdown in order to power-off CPU's cluster safely.
>>>>>>>> +	 * The timeout value depends on the current CPU frequency,
>>>>>>>> +	 * it takes about 40-150us on average and over 1000us in
>>>>>>>> +	 * a worst case scenario.
>>>>>>>> +	 */
>>>>>>>> +	do {
>>>>>>>> +		if (tegra_cpu_rail_off_ready())
>>>>>>>> +			return 0;
>>>>>>>> +
>>>>>>>> +	} while (ktime_before(ktime_get(), timeout));
>>>>>>>
>>>>>>> So this loop will aggressively call tegra_cpu_rail_off_ready() and retry 3
>>>>>>> times. The tegra_cpu_rail_off_ready() function can be called thousands of times
>>>>>>> here but the function will hang 1.5s :/
>>>>>>>
>>>>>>> I suggest something like:
>>>>>>>
>>>>>>> 	while (retries-- && !tegra_cpu_rail_off_ready())
>>>>>>> 		udelay(100);
>>>>>>>
>>>>>>> So <retries> calls to tegra_cpu_rail_off_ready() and 100us x <retries> maximum
>>>>>>> impact.
>>>>>> But udelay() also results in the CPU spinning in a busy-loop, and thus,
>>>>>> what's the difference?
>>>>>
>>>>> busy looping instead of register reads with all the hardware things involved behind.
>>>>
>>>> Please notice that this code runs only on an older Cortex-A9/A15, which
>>>> doesn't support WFE for the delaying, and thus, CPU always busy-loops
>>>> inside udelay().
>>>>
>>>> What if I add cpu_relax() to the loop? Do you think it could
>>>> have any positive effect?
>>>
>>> I think udelay() has a call to cpu_relax().
>>
>> Yes, my point is that udelay() doesn't bring much benefit for us here
>> because:
>>
>> 1. we want to enter the power-gated state as quickly as possible and
>> udelay() just adds an unnecessary delay
>>
>> 2. udelay() spins in a busy-loop until the delay has expired, just like we're
>> already doing in this function
>
> In this case why not remove ktime_get() and increase the number of retries?

Because busy-loop performance depends on the CPU frequency, we can't
rely on a bare number of retries.
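
To illustrate the point, here is a minimal userspace sketch, not the driver
code. It assumes a POSIX system and uses clock_gettime(CLOCK_MONOTONIC) as a
stand-in for ktime_get(); the names now_ns(), poll_until_deadline(),
always_ready() and never_ready() are hypothetical and exist only for this
example. The loop is bounded by elapsed time, so the timeout is the same
however fast the CPU spins, while the number of polls done within that window
varies.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds; userspace stand-in for ktime_get(). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: poll ready() until it returns true or until
 * timeout_us microseconds have elapsed. Returns 0 on success and -1 on
 * timeout. The number of polls performed is stored in *polls if it is
 * not NULL; that count depends on CPU speed, the timeout does not.
 */
static int poll_until_deadline(bool (*ready)(void), uint64_t timeout_us,
			       uint64_t *polls)
{
	uint64_t deadline;
	uint64_t n = 0;
	int ret = -1;

	assert(ready != NULL);
	deadline = now_ns() + timeout_us * 1000ull;

	do {
		n++;
		if (ready()) {
			ret = 0;
			break;
		}
	} while (now_ns() < deadline);

	if (polls)
		*polls = n;

	return ret;
}

/* Hypothetical stand-ins for tegra_cpu_rail_off_ready(). */
static bool always_ready(void)
{
	return true;
}

static bool never_ready(void)
{
	return false;
}
```

With never_ready() the helper returns -1 only after the full timeout has
elapsed, and the poll count it reports differs from machine to machine, which
is why a fixed retry count cannot replace the deadline.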