Re: [PATCH 12/16] clk: qcom: clk-krait: add 8064 errata workaround

On Tue, Mar 15, 2022 at 02:34:30PM -0700, Stephen Boyd wrote:
> Quoting Ansuel Smith (2022-03-14 05:43:20)
> > On Mon, Mar 14, 2022 at 11:20:21AM +0300, Dmitry Baryshkov wrote:
> > > On 13/03/2022 22:04, Ansuel Smith wrote:
> > > > Add 8064 errata workaround where the sec_src clock gating needs to be
> > > 
> > > Could you please be more specific whether the errata applies only to the
> > > ipq8064 or to the apq8064 too? 8064 is not specific enough.
> > >
> > 
> > That's a good question... Problem is that we really don't know the
> > answer. This errata comes from qsdk on an old sourcecode. I assume this
> > is specific to ipq8064 and apq8064 have different mux configuration.
> > 
> 
> I think it was some glitch that happened when the automatic clk gating
> was enabled during a switch. The automatic clk gating didn't know that
> software was running and switching the input so it killed the CPU and
> stopped the clk. That lead to hangs and super badness. I assume it was
> applicable to apq8064 as well because ipq8064 is basically apq8064 with
> the multimedia subsystem replaced by the networking subsystem. Also I
> wouldn't remember all these details because I worked on apq8064 but not
> so much on ipq8064 :)

Honest question: do you remember any other glitches present on the platform?
We are trying to bisect an instability problem and we still need to
find the cause. We really can't tell whether it's just a power
delivery problem, a scaling problem from the muxes, or something else.

The current problem is that after some time the device kernel panics
for a number of strange reasons, like invalid kernel paging and other
oddities (or the device just freezes and reboots, without even a crash
log). Many kernel panic reports place the crash near the mux switch (like
a random error right before the mux switch), so I suspect there is a
problem there. But because it is very random, we have NO exact way to
reproduce it. I sometimes manage, while playing with the code, to
reproduce a similar kernel crash, but I'm still not sure of the real cause.

I know it's OT, but do you have any idea about it, or remember
anything related?
(To scale the freq I'm using a dedicated cpufreq driver that works this
way:
- We first scale the cache to the max freq across all cores and set the
  voltage
- We scale the cpu to the correct target.
This is all done under a lock. Do you see anything wrong in this logic?
To me these random crashes look to be really related to something wrong
with the mux or with the cache set to a wrong state)
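
For reference, the sequence above can be sketched as a small user-space
C model. This is only an illustration of the ordering, not the actual
driver: the struct, the function names, the frequency/voltage tables and
the spinlock are all hypothetical stand-ins, and the real code goes
through the clk and regulator frameworks instead of plain assignments.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 2

/* Hypothetical state; none of these names come from the real driver. */
struct scaling_state {
	unsigned long cpu_khz[NR_CPUS];	/* current per-core frequency */
	unsigned long l2_khz;		/* shared cache frequency */
	unsigned long l2_uv;		/* cache supply, microvolts */
	atomic_flag lock;		/* serialises every transition */
};

/* Made-up mapping from the fastest core to a cache frequency level. */
static unsigned long l2_khz_for(unsigned long max_cpu_khz)
{
	if (max_cpu_khz <= 384000)
		return 384000;
	if (max_cpu_khz <= 1000000)
		return 1000000;
	return 1200000;
}

/* Made-up voltage table for the cache levels above. */
static unsigned long l2_uv_for(unsigned long l2_khz)
{
	return l2_khz <= 384000 ? 1100000 : 1150000;
}

static void scaling_state_init(struct scaling_state *s)
{
	int i;

	for (i = 0; i < NR_CPUS; i++)
		s->cpu_khz[i] = 384000;
	s->l2_khz = l2_khz_for(384000);
	s->l2_uv = l2_uv_for(s->l2_khz);
	atomic_flag_clear(&s->lock);
}

/* The cache level must always match the fastest core. */
static int state_is_consistent(const struct scaling_state *s)
{
	unsigned long max_khz = 0;
	int i;

	for (i = 0; i < NR_CPUS; i++)
		if (s->cpu_khz[i] > max_khz)
			max_khz = s->cpu_khz[i];
	return s->l2_khz == l2_khz_for(max_khz);
}

static void set_target(struct scaling_state *s, int cpu,
		       unsigned long target_khz)
{
	unsigned long max_khz = target_khz;
	int i;

	while (atomic_flag_test_and_set(&s->lock))
		;	/* spin: the whole transition runs under the lock */

	/* Max freq across all cores, counting the new target for this one. */
	for (i = 0; i < NR_CPUS; i++)
		if (i != cpu && s->cpu_khz[i] > max_khz)
			max_khz = s->cpu_khz[i];

	/* Step 1: scale the cache for that max, then set the voltage. */
	s->l2_khz = l2_khz_for(max_khz);
	s->l2_uv = l2_uv_for(s->l2_khz);

	/* Step 2: scale the cpu to the requested target. */
	s->cpu_khz[cpu] = target_khz;

	assert(state_is_consistent(s));
	atomic_flag_clear(&s->lock);
}
```

Calling scaling_state_init() and then set_target(&s, 0, 1400000) raises
the cache level before core 0 moves. The model keeps the same order when
scaling down, which is my reading of the steps above and may not match
what the hardware actually wants.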

Thx for any suggestions about this.
(Also, I will update this commit and mention both apq and ipq in the
comments.)

-- 
	Ansuel


