On 5/16/24 22:01, Marek Vasut wrote:
On 5/16/24 4:35 PM, Gatien CHEVALLIER wrote:
Hi,
What if you add a trace in a random generation function in
random.c?
Do you have a function name or line number for me ?
I put a trace in _get_random_bytes() in drivers/char/random.c. I'm
not
100% sure but this should be the entry point when getting a random
number.
You're right, there is a read attempt right before the hang, and
__clk_is_enabled() returns 0 in stm32_read_rng() . In fact, it is
the pm_runtime_get_sync() which is returning -EACCES instead of
zero, and this is currently not checked so the failure is not
detected before register access takes place, to register file with
clock disabled, which triggers a hard hang.
I'll be sending a patch shortly, thanks for this hint !
Great news, indeed the return code isn't checked. Let's use
pm_runtime_resume_and_get().
Yes please.
I will wonder why we get EACCES though, that basically means we are
suspending already. Is it safe to return -errno from rng read
function in that case ?
The framework expects a function that can return an error code so I
don't see why not. Else the framework would have an issue.
I still haven't figured out what is happening.
Could it be that the kernel is getting entropy with hwrng_fillfn()
like it does periodically to feed the entropy pool and it happens at the
same time as your pm test sequence?
Possibly. I use script as init which contains basically #!/bin/sh ,
mount of a few filesystems like dev, proc, sys, and then the pm_test
sequence to avoid wasting time booting full userspace.
Ok,
The strangest thing is not being to enable the clock, maybe there's
something on the clock driver side. Tracking clock enable/disable
may lead to something.
FYI, I have been running your script with (echo devices >
/sys/power/pm_test) for 5 hours now and haven't been able to reproduce
the issue.
Maybe the 'devices' test is not enough and the deeper pm_test states
have some sort of impact ?
Maybe, I don't have the knowledge to confirm or invalidate this.
Tasks should be frozen before drivers are put to sleep so my instinct
would say no but you can't take it for granted :)
After this, I'll try to reproduce the issue.
If you have a minute to test it on some ST MP15 board, that would
be real nice. Thanks !
I tried to reproduce the issue you're facing on a STM32MP157C-DK2 no
SCMI on the 6.9-rc7 kernel tag. I uses OP-TEE and TF-A in the
bootchain
but this should not have an impact here.
How did you manage to test using "echo core > /sys/power/pm_test"?
In kernel/power/suspend.c, enter_state(). If the pm_test_level is
core,
then an error is fired with the following trace:
"Unsupported test mode for suspend to idle, please choose
none/freezer/devices/platform."
Could this be firmware related ?
I've tried using "echo devices > /sys/power/pm_test" so that I can
at least test that the driver is put to sleep then wakes up. I do not
reproduce your issue.
Can you try 'processors' ?
Given this:
#ifdef CONFIG_PM_DEBUG
if (pm_test_level != TEST_NONE && pm_test_level <=
TEST_CPUS) {
pr_warn("Unsupported test mode for suspend to idle
You're supposed to be suspending to 'mem' , not 'idle' . Could that
be it ?
Yes you're right, I've been missing that. I do not have "deep" available
in /sys/power/mem_sleep... not upstreamed yet maybe... Have you coded a
PSCI service for this in U-Boot?
I'm either missing something or I can't reproduce your setup.
The PSCI provider in U-Boot has been in place for years, there's no need
to code anything, just compile it and that's all:
$ make stm32mp15_basic_defconfig && make -j`nproc`
This gets you u-boot-spl.stm32 and u-boot.itb as FSBL/SSBL .
Ok thanks.
Best regards,
Gatien