On Tue, Jan 17, 2023 at 11:26:29AM +0100, Peter Zijlstra wrote:
On Mon, Jan 16, 2023 at 04:59:04PM +0000, Mark Rutland wrote:
I'm sorry to have to bear some bad news on that front. :(
Moo, something had to give..
IIUC what's happening here is the PSCI cpuidle driver has entered idle and RCU
is no longer watching when arm64's cpu_suspend() manipulates DAIF. Our
local_daif_*() helpers poke lockdep and tracing, hence the call to
trace_hardirqs_off() and the RCU usage.
Right, strictly speaking not needed at this point, IRQs should have been
traced off a long time ago.
I think we need RCU to be watching all the way down to cpu_suspend(), and it's
cpu_suspend() that should actually enter/exit idle context. That and we need to
make cpu_suspend() and the low-level PSCI invocation noinstr.
I'm not sure whether 32-bit will have a similar issue or not.
I'm not seeing 32-bit or RISC-V have similar issues here, but who knows,
maybe I missed something.
In any case, the below ought to cure the ARM64 case and remove that last
known RCU_NONIDLE() user as a bonus.
Thanks for the fix. I tested the series and did observe the same splat
with both DT and ACPI boot (they enter idle in different code paths). Thanks
to Mark for reminding me about ACPI. With this fix, I see the splat is
gone in both DT (cpuidle-psci.c) and ACPI (acpi_processor_idle.c).
You can add:
Tested-by: Sudeep Holla <sudeep.holla@xxxxxxx>
--
Regards,
Sudeep