On Thu, 15 Jun 2023 at 21:39, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
>
> On Thu, 15 Jun 2023 at 20:33, Krzysztof Kozlowski
> <krzysztof.kozlowski@xxxxxxxxxx> wrote:
> >
> > On 15/06/2023 15:47, Amit Pundir wrote:
> > > On Thu, 15 Jun 2023 at 00:38, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
> > >>
> > >> On Thu, 15 Jun 2023 at 00:17, Krzysztof Kozlowski
> > >> <krzysztof.kozlowski@xxxxxxxxxx> wrote:
> > >>>
> > >>> On 14/06/2023 20:18, Linux regression tracking (Thorsten Leemhuis) wrote:
> > >>>> On 02.06.23 18:12, Amit Pundir wrote:
> > >>>>> Move lvs1 and lvs2 regulator nodes up in the rpmh-regulators
> > >>>>> list to work around a boot regression uncovered by the upstream
> > >>>>> commit ad44ac082fdf ("regulator: qcom-rpmh: Revert "regulator:
> > >>>>> qcom-rpmh: Use PROBE_FORCE_SYNCHRONOUS"").
> > >>>>>
> > >>>>> Without this fix DB845c fails to boot at times because one of the
> > >>>>> lvs1 or lvs2 regulators fails to turn ON in time.
> > >>>>
> > >>>> /me waves friendly
> > >>>>
> > >>>> FWIW, as it's not obvious: this...
> > >>>>
> > >>>>> Link: https://lore.kernel.org/all/CAMi1Hd1avQDcDQf137m2auz2znov4XL8YGrLZsw5edb-NtRJRw@xxxxxxxxxxxxxx/
> > >>>>
> > >>>> ...is a report about a regression. One that we could still solve before
> > >>>> 6.4 is out. One I'll likely point Linus to, unless a fix comes into
> > >>>> sight.
> > >>>>
> > >>>> When I noticed the reluctant replies to this patch I earlier today asked
> > >>>> in the thread with the report what the plan forward was:
> > >>>> https://lore.kernel.org/all/CAD%3DFV%3DV-h4EUKHCM9UivsFHRsJPY5sAiwXV3a1hUX9DUMkkxdg@xxxxxxxxxxxxxx/
> > >>>>
> > >>>> Doug replied there:
> > >>>>
> > >>>> ```
> > >>>> Of the two proposals made (the revert vs. the reordering of the dts),
> > >>>> the reordering of the dts seems better. It only affects the one buggy
> > >>>> board (rather than preventing us from moving to async probe for
> > >>>> everyone) and it also has a chance of actually fixing something
> > >>>> (changing the order that regulators probe in rpmh-regulator might
> > >>>> legitimately work around the problem). That being said, just like the
> > >>>> revert the dts reordering is still just papering over the problem and
> > >>>> is fragile / not guaranteed to work forever.
> > >>>> ```
> > >>>>
> > >>>> Papering over obviously is not good, but has anyone a better idea to fix
> > >>>> this? Or is "not fixing" for some reason a viable option here?
> > >>>>
> > >>>
> > >>> I understand there is a regression, although the kernel is not mainline
> > >>> (hash df7443a96851 is unknown) and the only solutions were papering over
> > >>> the problem. Reverting the commit is a temporary workaround. Moving nodes
> > >>> in DTS is not acceptable because it hides the actual problem and only
> > >>> solves this one particular observed problem, while the actual issue is
> > >>> still there. It would be nice to be able to reproduce it on real mainline
> > >>> with a normal operating system (not AOSP) - with ramdisk/without/whatever.
> > >>> So far no one did it, right?
> > >>
> > >> No, I did not try a non-AOSP system yet. I'll try it tomorrow, if that
> > >> helps. With mainline hash.
> > >
> > > Hi, here is the crash report on db845c running vanilla v6.4-rc6 with a
> > > debian build https://bugs.linaro.org/attachment.cgi?id=1142
> > >
> > > And fwiw here is the db845c crash log with AOSP running vanilla
> > > v6.4-rc6 https://bugs.linaro.org/attachment.cgi?id=1141
> > >
> > > Regards,
> > > Amit Pundir
> > >
> > > PS: rootfs in this bug report doesn't matter much because I'm loading
> > > all the kernel modules from a ramdisk and in the case of a crash the
> > > UFS doesn't probe anyway.
> >
> > I just tried current next with defconfig (I could not find your config,
> > neither here, nor in your previous mail thread nor in bugzilla). Also
> > with REGULATOR_QCOM_RPMH as module.
> >
> > I tried also v6.4-rc6 - also defconfig with default and module
> > REGULATOR_QCOM_RPMH.
> >
> > All the cases work on my RB3 - no warnings reported.
> >
> > If you do not use defconfig, then in all reports please mention the
> > differences (the best) or at least attach it.
>
> Argh.. Sorry about that. Big mistake from my side. I did want to
> upload my defconfig but forgot. Defconfig plays a key role because, as
> I mentioned in one of my previous emails, it is a timing/race bug and
> if I make many changes in my defconfig (for example, enabling ftrace,
> or as little as adding a printk in the qcom_rpmh_regulator code) then I
> can't reproduce this bug. So needless to say, I can't reproduce
> this bug with the default arm64 defconfig.
>
> Please find my custom (but upstream) defconfig here
> https://bugs.linaro.org/attachment.cgi?id=1143 and prebuilt binaries
> here https://people.linaro.org/~amit.pundir/db845c-userdebug/rpmh_bug/.
> "fastboot flash boot ./boot.img-6.4-rc6 reboot" and/or a few (<5)
> reboots should be enough to trigger the crash.
>
> I have downloaded the initrd from here
> https://snapshots.linaro.org/96boards/dragonboard845c/linaro/debian/569/initrd.img-5.15.0-qcomlt-arm64
> but edited ramdisk/init to run the "load_module" function early in the
> boot and ramdisk/conf/initramfs.conf has "MODULES=list" instead of
> "MODULES=most", where all the kernel modules are listed at
> /etc/initramfs-tools/modules.

Sorry, it is ramdisk/conf/modules, not ramdisk/etc/initramfs-tools/modules.

>
> Regards,
> Amit Pundir