Re: [PATCH v2 2/2] mmc: mediatek: extend number of tuning steps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


Il 29/11/23 04:16, Axe Yang (杨磊) ha scritto:
On Tue, 2023-11-28 at 11:20 +0100, AngeloGioacchino Del Regno wrote:
Il 28/11/23 10:38, Axe Yang (杨磊) ha scritto:
On Tue, 2023-11-28 at 09:53 +0100, AngeloGioacchino Del Regno
Il 28/11/23 08:01, Axe Yang ha scritto:
Previously, during the MSDC calibration process, a full clock
actually not be covered, which in some cases didn't yield the
results and could cause CRC errors. This problem is
evident when MSDC is used as an SDIO host. In fact, MSDC
tuning up to a maximum of 64 steps, but by default, the step
is 32. By increase the tuning step, we are more likely to cover
parts of a clock cycle, and get better calibration result.

To illustrate, when tuning 32 steps, if the obtained window has
near the middle, like this: 0xffc07ff (hex), then the selected
will be the 6 (counting from right to left).

(32 <- 1)
1111 1111 1100 0000 0000 0111 11(1)1 1111

However, if we tune 64 steps, the window obtained may look like
0xfffffffffffc07ff. The final selected delay will be 44, which
safer as it is further away from the hole:

(64 <- 1)
1111 ... (1)111 1111 1111 1111 1111 1100 0000 0000 0111 1111

In this case, delay 6 selected through 32 steps tuning is
not optimal, and this delay is closer to the hole, using it
easily cause CRC problems.

You will need to configure property "mediatek,tuning-step" in
dts node to 64 to extend the steps.

If we can run 64 tuning steps, why should we run 32?

Why isn't it just better to *always* run 64 tuning steps, on SoCs
supporting that?


Hi Angelo,

That is a good question. The benefit of preserving 32 steps tuning
that it can save time in certain scenarios.

On some platforms, when the delay selected through 64 steps tuning
very close to that chosen through 32 steps, we can reduce the
step from 64 to 32. This can save time sending the tuning block

Thus using 32 steps tuning can save kernel boot up time.

Another case where time can be saved is when accessing the RPMB
partition of eMMC. Each time switch to RPMB partition, there is a
retune action, causing a certain drop in performance. If we are
that the results of 32 steps tuning are usable and we use it, this
in a sense also guarantee performance when accessing the RPMB

Thanks for this explanation! Though, I have some more questions...

...regarding boot up time, how much time are we talking about?

Luckily, I have a platform at hand that can be used for experiments/
Below are the results from testing on this platform:

[    2.431993][T1200180] kworker/2:21: mtk-msdc bootdevice:
[name:mtk_sd&]Start tuning
[    2.434950][T1200180] kworker/2:21: mtk-msdc bootdevice:
[name:mtk_sd&]Tuning finished
[    2.435957][T1200180] kworker/2:21: mtk-msdc bootdevice:
[name:mtk_sd&]phase: [map:00000000ffffffc0] [maxlen:26] [final:19]
[    2.462375][T1200180] kworker/2:21: [name:mmc_core&]mmc0: new HS400
MMC card at address 0001
[    2.519863][T1300069] kworker/3:1: mtk-msdc 11250000.mmc:
[name:mtk_sd&]Start tuning
[    2.526271][T1300069] kworker/3:1: mtk-msdc 11250000.mmc:
[name:mtk_sd&]Tuning finished
[    2.527288][T1300069] kworker/3:1: mtk-msdc 11250000.mmc:
[name:mtk_sd&]phase: [map:ffffffffffff003f] [maxlen:48] [final:40]
[    2.532269][T1300069] kworker/3:1: [name:mmc_core&]mmc2: new ultra
high speed SDR104 SDIO card at address 0001

As the kernel log indicates, it took 3 ms for eMMC to tune 32 steps,
while it took about 7 ms for SDIO to tune 64 steps. I have to admit,
when it comes to saving boot up time, the benefits of reducing step
form 64 to 32 are quite minimal. Just as you said, especially when
async probe is enabled.

That's great to know, and it's *truly* nice information that you can
put into the commit message, as this completes the analysis of this

Can you mention that in the commit message for v4 of this commit?

"As per measurements taken on MT(xxxx), the tuning phase will take:
 eMMC - 32 steps: ~3ms
        64 steps: xxxx
 SDIO - 32 steps: xxxx
        64 steps: ~7ms

...but while this won't prolong boot times by any meaningful amount
of time, for eMMC, it should still be preferred to use 32 steps tuning
because otherwise we lose performance for RPMB I//O, which requires
re-tuning for each access.

I'm asking because while now I see - and agree - on using 32-steps
on eMMC to guarantee performance during RPMB access, as far as I
there is no RPMB partition on SD/MicroSD cards (and, of course, SDIO

If the boot performance impact isn't big, as in, up to ~100
milliseconds is
not big at all (especially with async probe!), we can definitely
avoid the
addition of a devicetree property for 32-steps tuning, hence use a
selection strategy such that:
   - On eMMC devices, always perform 32-steps tuning (hence no boot
   - On SD cards and SDIO, always perform 64-steps tuning

eMMC could also potentially have CRC issue if only tune 32-steps,
albeit with a lower likelihood. The precondition for using 32-steps
tuning is that it could provide roughly the same valid results as using
64-steps tuning. So taking everything into account, controlling the
tuning step as needed through the use of dts property seems to be a
more flixible approach.

Yes but since the only performance concern is about eMMC RPMB access,
we could at least make this 64 steps as *default* for SD/SDIO, and
32 steps as default for eMMC.
Device tree would be an override of those default values.

Can we set 64 as default for SD/SDIO, 32 as default for eMMC, and then use
the device tree to override those defaults?


[Index of Archives]     [Linux Memonry Technology]     [Linux USB Devel]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux