Re: [PATCH] [RFC] clk: stm32mp1: Keep RNG1 clock always running

Marek Vasut <marex@xxxxxxx> · Thu, 16 May 2024 22:01:37 +0200

On 5/16/24 4:35 PM, Gatien CHEVALLIER wrote:

Hi,

What if you add a trace in a random generation function in random.c?

Do you have a function name or line number for me ?

I put a trace in _get_random_bytes() in drivers/char/random.c. I'm not
100% sure but this should be the entry point when getting a random 
number.

You're right, there is a read attempt right before the hang, and 
__clk_is_enabled() returns 0 in stm32_read_rng() . In fact, it is 
the pm_runtime_get_sync() which is returning -EACCES instead of 
zero, and this is currently not checked so the failure is not 
detected before register access takes place, to register file with 
clock disabled, which triggers a hard hang.

I'll be sending a patch shortly, thanks for this hint !

Great news, indeed the return code isn't checked. Let's use
pm_runtime_resume_and_get().

Yes please.

I will wonder why we get EACCES though, that basically means we are 
suspending already. Is it safe to return -errno from rng read function 
in that case ?

The framework expects a function that can return an error code so I
don't see why not. Else the framework would have an issue.

I still haven't figured out what is happening.

Could it be that the kernel is getting entropy with hwrng_fillfn()
like it does periodically to feed the entropy pool and it happens at the
same time as your pm test sequence?

Possibly. I use script as init which contains basically #!/bin/sh , 
mount of a few filesystems like dev, proc, sys, and then the pm_test 
sequence to avoid wasting time booting full userspace.

FYI, I have been running your script with (echo devices > 
/sys/power/pm_test) for 5 hours now and haven't been able to reproduce 
the issue.

Maybe the 'devices' test is not enough and the deeper pm_test states 
have some sort of impact ?

After this, I'll try to reproduce the issue.

If you have a minute to test it on some ST MP15 board, that would 
be real nice. Thanks !

I tried to reproduce the issue you're facing on a STM32MP157C-DK2 no
SCMI on the 6.9-rc7 kernel tag. I uses OP-TEE and TF-A in the 
bootchain
but this should not have an impact here.

How did you manage to test using "echo core > /sys/power/pm_test"?
In kernel/power/suspend.c, enter_state(). If the pm_test_level is 
core,
then an error is fired with the following trace:
"Unsupported test mode for suspend to idle, please choose 
none/freezer/devices/platform."

Could this be firmware related ?

I've tried using "echo devices > /sys/power/pm_test" so that I can 
at least test that the driver is put to sleep then wakes up. I do not
reproduce your issue.

Can you try 'processors' ?

Given this:
#ifdef CONFIG_PM_DEBUG
         if (pm_test_level != TEST_NONE && pm_test_level <= TEST_CPUS) {
             pr_warn("Unsupported test mode for suspend to idle

You're supposed to be suspending to 'mem' , not 'idle' . Could that be 
it ?

Yes you're right, I've been missing that. I do not have "deep" available
in /sys/power/mem_sleep... not upstreamed yet maybe... Have you coded a
PSCI service for this in U-Boot?

I'm either missing something or I can't reproduce your setup.

The PSCI provider in U-Boot has been in place for years, there's no need 
to code anything, just compile it and that's all:

$ make stm32mp15_basic_defconfig && make -j`nproc`

This gets you u-boot-spl.stm32 and u-boot.itb as FSBL/SSBL .