On Tue, Jan 09, 2024 at 01:56:30PM -0800, Elliot Berman wrote:
> 
> 
> On 1/9/2024 1:44 PM, Brian Masney wrote:
> > On Mon, Jan 08, 2024 at 03:35:55PM -0800, Elliot Berman wrote:
> >> On 1/8/2024 12:50 PM, Brian Masney wrote:
> >>> On Mon, Jan 08, 2024 at 11:44:35PM +0530, Shazad Hussain wrote:
> >>>> I can see that gcc_ufs_phy_ice_core_clk needs the gcc_ufs_phy_gdsc to be
> >>>> enabled before this particular clk is enabled. But that required
> >>>> power-domain I do not see in the ice DT node. That can cause this
> >>>> problem.
> >>> 
> >>> Thank you! I'll work on and post a patch set as I find free time over
> >>> the next week or two.
> >> I think I observe the same issue on sm8650. Symptoms seem to be same as
> >> you've described. I'll test out the following diff and see if things
> >> seem more reliable:
> >> 
> >> diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
> >> index fd4f9dac48a3..c9ea50834dc9 100644
> >> --- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
> >> +++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
> >> @@ -2526,6 +2526,7 @@ ice: crypto@1d88000 {
> >>  				     "qcom,inline-crypto-engine";
> >>  			reg = <0 0x01d88000 0 0x8000>;
> >>  
> >> +			power-domains = <&gcc UFS_PHY_GDSC>;
> >>  			clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
> >>  		};
> >>  
> >> 
> >> If yes, I can post a patch for sm8650 if no else has yet.
> > 
> > The intermittent boot issue is still present against
> > linux-next-20240109 with the following patch:
> > 
> > --- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
> > @@ -1556,6 +1556,7 @@ ice: crypto@1d88000 {
> >  			compatible = "qcom,sa8775p-inline-crypto-engine",
> >  				     "qcom,inline-crypto-engine";
> >  			reg = <0x0 0x01d88000 0x0 0x8000>;
> > +			power-domains = <&gcc UFS_PHY_GDSC>;
> >  			clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
> >  		};
> >  
> 
> Things have been a bit more reliable for me after adding the power-domains.
> Are you getting stuck at the same spot or somewhere else?
> 
> I've been looking at a similar issue to [1], so I wonder if maybe you're
> facing that instead.
> 
> [1]: https://lore.kernel.org/linux-arm-msm/20240104101735.48694-1-laura.nao@xxxxxxxxxxxxx/T/#m39f7c80b59c750ee4c0082474c5c15b6055927ef

So it could be that issue that I'm also encountering.

Previously I could configure a timeout on dracut and it would drop me to
a shell when the system failed to boot. That's how I was able to get the
dmesg for the ice error. However, dracut did not always time out, and
when that happened the system wouldn't respond over the serial console.

Now the boot still hangs for me about 50% of the time; however, I have
not been able to get dracut to time out after probably 20 reboots. I
have magic sysrq enabled in my kernel, but I haven't been able to get it
to trigger when going through Beaker. Let me ask internally about sysrq
to see if I can get an interesting stack dump.

If I boot with the standard verbose logging, then the race condition
doesn't occur and -next boots fine for me.

Brian