Re: [PATCH v2 1/3] scsi: ufshcd: Update the set frequency to devfreq

Jeffrey Hugo <jeffrey.l.hugo@xxxxxxxxx> · Thu, 28 May 2020 07:53:38 -0600

On Tue, May 26, 2020 at 11:17 AM Asutosh Das (asd)
<asutoshd@xxxxxxxxxxxxxx> wrote:
>
> Hi Jeffrey
> On 5/25/2020 3:19 PM, Jeffrey Hugo wrote:
> > On Wed, Mar 25, 2020 at 12:29 PM Asutosh Das <asutoshd@xxxxxxxxxxxxxx> wrote:
> >>
> >> Currently, the frequency that devfreq provides the
> >> driver to set always leads the clocks to be scaled up.
> >> Hence, round the clock-rate to the nearest frequency
> >> before deciding to scale.
> >>
> >> Also update the devfreq statistics of current frequency.
> >>
> >> Signed-off-by: Asutosh Das <asutoshd@xxxxxxxxxxxxxx>
> >
> > This change appears to cause issues for the Lenovo Miix 630, as
> > identified by git bisect.
> >
>
> Thanks for reporting this.
>
> > On 5.6-final, my boot log looks normal.  On 5.7-rc7, the Lenovo Miix
> > 630 rarely boots, usually stuck in some kind of infinite printk loop.
> >
> > If I disable some of the UFS logging, I can capture this from the
> > logs, as soon as UFS inits -
> >
> > [    4.353860] ufshcd-qcom 1da4000.ufshc: ufshcd_intr: Unhandled
> > interrupt 0x00000000
> > [    4.359605] ufshcd-qcom 1da4000.ufshc: ufshcd_intr: Unhandled
> > interrupt 0x00000000
> > [    4.365412] ufshcd-qcom 1da4000.ufshc: ufshcd_check_errors:
> > saved_err 0x4 saved_uic_err 0x2
> > [    4.371121] ufshcd-qcom 1da4000.ufshc: hba->ufs_version = 0x210,
> > hba->capabilities = 0x1587001f
> > [    4.376846] ufshcd-qcom 1da4000.ufshc: hba->outstanding_reqs =
> > 0x100000, hba->outstanding_tasks = 0x0
> > [    4.382636] ufshcd-qcom 1da4000.ufshc: last_hibern8_exit_tstamp at
> > 0 us, hibern8_exit_cnt = 0
> > [    4.388368] ufshcd-qcom 1da4000.ufshc: No record of pa_err
> > [    4.394001] ufshcd-qcom 1da4000.ufshc: dl_err[0] = 0x80000001 at 3873626 us
> > [    4.399577] ufshcd-qcom 1da4000.ufshc: No record of nl_err
> > [    4.405053] ufshcd-qcom 1da4000.ufshc: No record of tl_err
> > [    4.410464] ufshcd-qcom 1da4000.ufshc: No record of dme_err
> > [    4.415747] ufshcd-qcom 1da4000.ufshc: No record of auto_hibern8_err
> > [    4.420950] ufshcd-qcom 1da4000.ufshc: No record of fatal_err
> > [    4.426013] ufshcd-qcom 1da4000.ufshc: No record of link_startup_fail
> > [    4.430950] ufshcd-qcom 1da4000.ufshc: No record of resume_fail
> > [    4.435786] ufshcd-qcom 1da4000.ufshc: No record of suspend_fail
> > [    4.440538] ufshcd-qcom 1da4000.ufshc: dev_reset[0] = 0x0 at 3031009 us
> > [    4.445199] ufshcd-qcom 1da4000.ufshc: No record of host_reset
> > [    4.449750] ufshcd-qcom 1da4000.ufshc: No record of task_abort
> > [    4.454214] ufshcd-qcom 1da4000.ufshc: clk: core_clk, rate: 50000000
> > [    4.458590] ufshcd-qcom 1da4000.ufshc: clk: core_clk_unipro, rate: 37500000
> >
> > I don't understand how this change is breaking things, but it clearly is for me.
> >
> > What kind of additional data would be useful to get to the bottom of this?
> >

It turns out that the unipro_core clock had no parent, and thus no
ability to scale.  Fixing that in GCC (the global clock controller
driver) seems to have resolved this.  I suspect the UFS clock scaling
code attempted to scale the core clock, did not check whether the rate
actually changed, and carried on assuming the new rate was in effect,
thus putting the hardware into a bad state.
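
To illustrate the failure mode I suspect, here is a minimal userspace
sketch in plain C.  It is not the ufshcd code and does not use the
kernel clk API; struct mock_clk and all of the helper names are
hypothetical, and the parentless clock is modelled simply as one whose
set-rate call reports success without changing anything.  The point is
only the read-back check after setting the rate.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a clock; NOT the kernel's struct clk. */
struct mock_clk {
	const char *name;
	unsigned long rate;	/* current rate in Hz */
	bool has_parent;	/* false models a clock that cannot scale */
};

/*
 * Assumed behaviour for this sketch: the call reports success even
 * when the clock has no parent and the rate therefore stays put.
 */
static int mock_clk_set_rate(struct mock_clk *clk, unsigned long rate)
{
	if (clk->has_parent)
		clk->rate = rate;
	return 0;
}

static unsigned long mock_clk_get_rate(const struct mock_clk *clk)
{
	return clk->rate;
}

/*
 * Request a new rate, then read it back.  Returns 0 only if the
 * clock really runs at the requested rate afterwards.
 */
static int scale_and_verify(struct mock_clk *clk, unsigned long rate)
{
	int ret = mock_clk_set_rate(clk, rate);

	if (ret)
		return ret;

	if (mock_clk_get_rate(clk) != rate) {
		fprintf(stderr, "%s: wanted %lu Hz, still at %lu Hz\n",
			clk->name, rate, mock_clk_get_rate(clk));
		return -1;
	}

	return 0;
}
```

With a clock set up as { "core_clk_unipro", 37500000, false } and an
arbitrary example target of 75000000, scale_and_verify() returns an
error and the rate is left untouched, whereas a caller that only looks
at the set-rate return value would carry on as if the scale had worked.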