On Fri, May 20, 2022 at 5:53 PM Chanwoo Choi <cw00.choi@xxxxxxxxxxx> wrote: > > Hi Kevin, > > On 5/20/22 11:42 AM, Chen-Yu Tsai wrote: > > On Fri, May 20, 2022 at 9:28 AM Chanwoo Choi <cw00.choi@xxxxxxxxxxx> wrote: > >> > >> Hi Kevin, Chen-Yu, > >> > >> On 5/20/22 3:25 AM, Kevin Hilman wrote: > >>> Chen-Yu Tsai <wenst@xxxxxxxxxxxx> writes: > >>> > >>>> n Wed, May 18, 2022 at 8:03 AM Kevin Hilman <khilman@xxxxxxxxxx> wrote: > >>>>> > >>>>> Kevin Hilman <khilman@xxxxxxxxxx> writes: > >>>>> > >>>>>> Chen-Yu Tsai <wenst@xxxxxxxxxxxx> writes: > >>>>>> > >>>>>>> On Mon, May 16, 2022 at 8:43 AM Roger Lu <roger.lu@xxxxxxxxxxxx> wrote: > >>>>>>>> > >>>>>>>> The Smart Voltage Scaling(SVS) engine is a piece of hardware > >>>>>>>> which calculates suitable SVS bank voltages to OPP voltage table. > >>>>>>>> Then, DVFS driver could apply those SVS bank voltages to PMIC/Buck > >>>>>>>> when receiving OPP_EVENT_ADJUST_VOLTAGE. > >>>>>>>> > >>>>>>>> 1. SVS driver uses OPP adjust event in [1] to update OPP table voltage part. > >>>>>>>> 2. SVS driver gets thermal/GPU device by node [2][3] and CPU device by get_cpu_device(). > >>>>>>>> After retrieving subsys device, SVS driver calls device_link_add() to make sure probe/suspend callback priority. > >>>>>>>> > >>>>>>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git/commit/?h=opp/linux-next&id=25cb20a212a1f989385dfe23230817e69c62bee5 > >>>>>>>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git/commit/?h=opp/linux-next&id=b325ce39785b1408040d90365a6ab1aa36e94f87 > >>>>>>>> [3] https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux.git/commit/?h=v5.16-next/dts64&id=a8168cebf1bca1b5269e8a7eb2626fb76814d6e2 > >>>>>>>> > >>>>>>>> Change since v24: > >>>>>>>> - Rebase to Linux 5.18-rc6 > >>>>>>>> - Show specific fail log in svs_platform_probe() to help catch which step fails quickly > >>>>>>>> - Remove struct svs_bank member "pd_dev" because all subsys device's power domain has been merged into one node like above [3] > >>>>>>>> > >>>>>>>> Test in below environment: > >>>>>>>> SW: Integration Tree [4] + Thermal patch [5] + SVS v25 (this patchset) > >>>>>>>> HW: mt8183-Krane > >>>>>>>> > >>>>>>>> [4] https://protect2.fireeye.com/v1/url?k=847bae75-e5f0bb43-847a253a-000babff9b5d-0b6f42041b9dea1d&q=1&e=37a26c43-8564-4808-9701-dc76d1ebbb27&u=https%3A%2F%2Fgithub.com%2Fwens%2Flinux%2Fcommits%2Fmt8183-cpufreq-cci-svs-test > >>>>>>> > >>>>>>> I've updated my branch to include all the latest versions of the relevant > >>>>>>> patch series: > >>>>>>> > >>>>>>> - anx7625 DPI bus type series v2 (so the display works) > >>>>>>> - MT8183 thermal series v9 (this seems to have been overlooked by the > >>>>>>> maintainer) > >>>>>>> - MTK SVS driver series v25 > >>>>>>> - devfreq: cpu based scaling support to passive governor series v5 > >>>>>>> - MTK CCI devfreq series v4 > >>>>>>> - MT8183 cpufreq series v7 > >>>>>>> - Additional WIP patches for panfrost MTK devfreq > >>>>>> > >>>>>> Thanks for preparing an integration branch Chen-Yu. > >>>>>> > >>>>>> I'm testing this on mt8183-pumpkin with one patch to add the CCI > >>>>>> regulator[1], and the defconfig you posted in a previous rev of this > >>>>>> series, but the CCI driver still causes a fault on boot[2] on my > >>>>>> platform. > >>>>>> > >>>>>> I mentioned in earlier reviews that I think there's potentially a race > >>>>>> between CCI and SVS loading since they are co-dependent. My hunch is > >>>>>> that this is still not being handled properly. > >>>>> > >>>>> Ah, actually it's crashing when I try to boot the platform with > >>>>> `maxcpus=4` on the cmdline (which I have to do because mt8183-pumpkin is > >>>>> unstable upstream with the 2nd cluster enabled.) > >> > >> This warning message is printed by 'WARN_ON(cpufreq_passive_unregister_notifier(devfreq))' > >> on devfreq passive governor. > >> > >> If the cpufreq drivers are not probed before of probing cci devfreq driver > >> with passive governor, passive governor shows this warning message. > >> Because passive governor with CPUFREQ_PARENT_DEV depends on the cpufreq driver > >> in order to get 'struct cpufreq_policy'[2]. > >> > >> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/tree/drivers/devfreq/governor_passive.c?h=devfreq-testing#n339 > >> [2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/tree/drivers/devfreq/governor_passive.c?h=devfreq-testing#n282 > >> > >> But, as I knew, this message might not stop the kernel. Just show the warning > >> message and then return -EPROBE_DEFER error. It means that maybe try to > >> probe the cci devfreq driver on late time of kernel booting > >> and then will be working. But, I need the full kernel booting log > >> and the booting sequence of between cpufreq and cci devfreq driver. > > > > Maybe just use a standard dev_warn() instead? WARN_ON causes all sorts > > of panicking in developers' minds. :p > > > >> In order to fix your issue, could you share the full booting log? > >> And if possible, please explain the more detailed something about this. > > > > The shortened version is that on an 8 core system, with maxcpus=4, > > only the first four cores are booted and have cpufreq associated. > > I've not actually used this mechanism, so I don't really know what > > happens if the other cores are brought up later with hotplug. Is > > cpufreq expected to attach to them? > > > > Maybe Kevin can add some more details. > > > > > > ChenYu > > > > > >>>>> > >>>>> The CCI driver should be a bit more robust about detecting > >>>>> available/online CPUs > >>>> > >>>> This all seems to be handled in the devfreq passive governor. > >>> > >>> Well, that's the initial crash. But the SVS driver will also go through > >>> its svs_mt8183_banks[] array (including both big & little clusters) and > >>> try to init SVS, so presumably that will have some problems also if only > >>> one cluster is enabled. > >>> > >>>> And presumably we'd like to have CCI devfreq running even if just one > >>>> core was booted. > >>> > >>> Yes, I assume so also. > >>> > >>>> Added Chanwoo for more ideas. > >>> > >>> OK, thanks. > >>> > >>> Kevin > > > I tested the passive governor with my temporary test code > on odroid-xu3 which contains the big.LITTLE cluster (Octa-core). > > > [Sequence of cpufreq/devfreq driver] > 1. Turn on all cpus > 2. Probed cpufreq driver > 3. Probed devfreq driver using passive governor with CPUFREQ_PARENT_DEV > > In my test case, there are no warning message during kernel booting. > Also when scaling the cpu frequency of cpus of big.LITTLE clusters, > temporary devfreq driver receives the notfication and then > calculate the target frequency of devfreq device by iterating online cpu. > > If there are any h/w constraints on your case, please let me know. Could you run your system with maxcpus=4 added to your cmdline? This is what Kevin was running. The current result is that the latter four cores aren't booted, so no cpufreq tied to them, and the passive governor will fail to get their cpufreq_policy. As mentioned before, the code path used to have a WARN_ON(). Now it's a dev_warn(). It will still fail initialization though. We're wondering if devfreq passive governor should be made to work even if not all cpu cores are available when it probes. Regards ChenYu