On Thu, 29 Nov 2018 at 23:31, Lina Iyer <ilina@xxxxxxxxxxxxxx> wrote:
>
> Hi Ulf,
>
> On Thu, Nov 29 2018 at 10:50 -0700, Ulf Hansson wrote:
> >When the hierarchical CPU topology is used and when a CPU has been put
> >offline (hotplug), that same CPU prevents its PM domain and thus also
> >potential master PM domains, from being powered off. This is because genpd
> >observes the CPU's struct device to remain being active from a runtime PM
> >point of view.
> >
> >To deal with this, let's decrease the runtime PM usage count by calling
> >pm_runtime_put_sync_suspend() of the CPU's struct device when putting it
> >offline. Consequentially, we must then increase the runtime PM usage for
> >the CPU, while putting it online again.
> >
> >Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> >---
> >
> >Changes in v10:
> > - Make it work when the hierarchical CPU topology is used, which may be
> >   used both for OSI and PC mode.
> > - Rework the code to prevent "BUG: sleeping function called from
> >   invalid context".
> >---
> > drivers/firmware/psci/psci.c | 20 ++++++++++++++++++++
> > 1 file changed, 20 insertions(+)
> >
> >diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
> >index b03bccce0a5d..f62c4963eb62 100644
> >--- a/drivers/firmware/psci/psci.c
> >+++ b/drivers/firmware/psci/psci.c
> >@@ -15,6 +15,7 @@
> >
> > #include <linux/acpi.h>
> > #include <linux/arm-smccc.h>
> >+#include <linux/cpu.h>
> > #include <linux/cpuidle.h>
> > #include <linux/errno.h>
> > #include <linux/linkage.h>
> >@@ -199,9 +200,20 @@ static int psci_cpu_suspend(u32 state, unsigned long entry_point)
> >
> > static int psci_cpu_off(u32 state)
> > {
> >+	struct device *dev;
> > 	int err;
> > 	u32 fn;
> >
> >+	/*
> >+	 * When the hierarchical CPU topology is used, decrease the runtime PM
> >+	 * usage count for the current CPU, as to allow other parts in the
> >+	 * topology to enter low power states.
> >+	 */
> >+	if (psci_dt_topology) {
> >+		dev = get_cpu_device(smp_processor_id());
> >+		pm_runtime_put_sync_suspend(dev);
> >+	}
> >+
> > 	fn = psci_function_id[PSCI_FN_CPU_OFF];
> > 	err = invoke_psci_fn(fn, state, 0, 0);
> > 	return psci_to_linux_errno(err);
> >@@ -209,6 +221,7 @@ static int psci_cpu_off(u32 state)
> >
> > static int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
> > {
> >+	struct device *dev;
> > 	int err;
> > 	u32 fn;
> >
> >@@ -216,6 +229,13 @@ static int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
> > 	err = invoke_psci_fn(fn, cpuid, entry_point, 0);
> > 	/* Clear the domain state to start fresh. */
> > 	psci_set_domain_state(0);
> >+
> >+	/* Increase runtime PM usage count if the hierarchical CPU toplogy. */
> >+	if (!err && psci_dt_topology) {
> >+		dev = get_cpu_device(cpuid);
> >+		pm_runtime_get_sync(dev);
>
> I booted with a single CPU on my SDM845 device and when I tried to
> online CPU1 and I see a crash.

Thanks for testing!

If I understand correctly, that means that you haven't registered CPU1
using register_cpu(), hence no struct device has been created for it.
It sounds like a special case, but on the other hand we shouldn't crash,
of course.

I guess a simple check like this would help.

if (dev)
	pm_runtime_get_sync(dev);

...and then we need a similar check in psci_cpu_off() to deal with
putting the CPU offline.

Could you try this and see if it helps?
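
To make the intent of the two checks concrete, here is a rough user-space
sketch of the guarded get/put pairing. It is only an illustration, not the
actual patch: struct device, get_cpu_device(), pm_runtime_get_sync() and
pm_runtime_put_sync_suspend() are replaced by simplified stubs, the helper
names cpu_online_pm_get() and cpu_offline_pm_put() are hypothetical, and it
assumes that only CPU0 has a registered struct device.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct device; only a usage counter is modelled. */
struct device {
	int usage_count;
};

/* Assumed setup: only CPU0 was registered, so only CPU0 has a device. */
static struct device cpu0_dev = { .usage_count = 1 };

/* Stub: returns NULL for a CPU that has no registered struct device. */
static struct device *get_cpu_device(unsigned int cpu)
{
	return cpu == 0 ? &cpu0_dev : NULL;
}

/* Stub: the assert stands in for the NULL dereference seen in the oops. */
static int pm_runtime_get_sync(struct device *dev)
{
	assert(dev != NULL);
	dev->usage_count++;
	return 0;
}

/* Stub: same NULL assumption as above, but drops the usage count. */
static int pm_runtime_put_sync_suspend(struct device *dev)
{
	assert(dev != NULL);
	dev->usage_count--;
	return 0;
}

/* Hypothetical helper: the guarded "online" side, as in psci_cpu_on(). */
static void cpu_online_pm_get(unsigned int cpu)
{
	struct device *dev = get_cpu_device(cpu);

	if (dev)
		pm_runtime_get_sync(dev);
}

/* Hypothetical helper: the matching guarded "offline" side. */
static void cpu_offline_pm_put(unsigned int cpu)
{
	struct device *dev = get_cpu_device(cpu);

	if (dev)
		pm_runtime_put_sync_suspend(dev);
}
```

With both sides guarded, a CPU without a struct device is simply skipped in
both directions, so the get/put calls stay balanced for the CPUs that do
have one.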
>
> # echo 1 > /sys/devices/system/cpu/cpu1/online
>
> [ 86.339204] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000188
> [ 86.340195] Detected VIPT I-cache on CPU1
> [ 86.348075] Mem abort info:
> [ 86.348092] GICv3: CPU1: found redistributor 100 region 0:0x0000000017a80000
> [ 86.352125] ESR = 0x96000006
> [ 86.352194] CPU1: Booted secondary processor 0x0000000100 [0x517f803c]
> [ 86.354956] Exception class = DABT (current EL), IL = 32 bits
> [ 86.377700] SET = 0, FnV = 0
> [ 86.380788] EA = 0, S1PTW = 0
> [ 86.383967] Data abort info:
> [ 86.386882] ISV = 0, ISS = 0x00000006
> [ 86.390760] CM = 0, WnR = 0
> [ 86.393755] user pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
> [ 86.400430] [0000000000000188] pgd=00000001f5233003, pud=00000001f5234003, pmd=0000000000000000
> [ 86.409203] Internal error: Oops: 96000006 [#1] PREEMPT SMP
> [ 86.414824] Modules linked in:
> [ 86.417915] CPU: 0 PID: 1533 Comm: sh Not tainted 4.20.0-rc3-30359-gff2e21952bd5 #782
> [ 86.425807] Hardware name: Qualcomm Technologies, Inc. SDM845 MTP (DT)
> [ 86.432387] pstate: 80400005 (Nzcv daif +PAN -UAO)
> [ 86.437233] pc : __pm_runtime_resume+0x20/0x74
> [ 86.441720] lr : psci_cpu_on+0x84/0x90
> [ 86.445498] sp : ffff00000db43a10
> [ 86.448842] x29: ffff00000db43a10 x28: ffff80017562b500
> [ 86.454200] x27: ffff000009159000 x26: 0000000000000055
> [ 86.459556] x25: 0000000000000000 x24: ffff0000092c4bc8
> [ 86.464913] x23: ffff000008fb8000 x22: ffff00000916a000
> [ 86.470269] x21: 0000000000000100 x20: ffff000009314190
> [ 86.475625] x19: 0000000000000000 x18: 0000000000000000
> [ 86.480979] x17: 0000000000000000 x16: 0000000000000000
> [ 86.486334] x15: 0000000000000000 x14: ffff000009162600
> [ 86.491690] x13: 0000000000000300 x12: 0000000000000010
> [ 86.497047] x11: ffffffffffffffff x10: ffffffffffffffff
> [ 86.502399] x9 : 0000000000000001 x8 : 0000000000000000
> [ 86.507753] x7 : 0000000000000000 x6 : 0000000000000000
> [ 86.513108] x5 : 0000000000000000 x4 : 0000000000000000
> [ 86.518463] x3 : 0000000000000188 x2 : 0000800174385000
> [ 86.523820] x1 : 0000000000000004 x0 : 0000000000000000
> [ 86.529175] Process sh (pid: 1533, stack limit = 0x(____ptrval____))
> [ 86.535585] Call trace:
> [ 86.538063] __pm_runtime_resume+0x20/0x74
> [ 86.542197] psci_cpu_on+0x84/0x90
> [ 86.545639] cpu_psci_cpu_boot+0x3c/0x6c
> [ 86.549593] __cpu_up+0x68/0x210
> [ 86.552852] bringup_cpu+0x30/0xe0
> [ 86.556293] cpuhp_invoke_callback+0x84/0x1e0
> [ 86.560689] _cpu_up+0xe0/0x1d0
> [ 86.563862] do_cpu_up+0x90/0xb0
> [ 86.567118] cpu_up+0x10/0x18
> [ 86.570113] cpu_subsys_online+0x44/0x98
> [ 86.574079] device_online+0x68/0xac
> [ 86.577685] online_store+0xa8/0xb4
> [ 86.581202] dev_attr_store+0x18/0x28
> [ 86.584908] sysfs_kf_write+0x40/0x48
> [ 86.588606] kernfs_fop_write+0xcc/0x1cc
> [ 86.592563] __vfs_write+0x40/0x16c
> [ 86.596078] vfs_write+0xa8/0x1a0
> [ 86.599424] ksys_write+0x58/0xbc
> [ 86.602768] __arm64_sys_write+0x18/0x20
> [ 86.606733] el0_svc_common+0x94/0xf0
> [ 86.610433] el0_svc_handler+0x24/0x80
> [ 86.614215] el0_svc+0x8/0x7c0
> [ 86.617300] Code: aa0003f3 361000e1 91062263 f9800071 (885f7c60)
> [ 86.623447] ---[ end trace 4573c3c0e0761290 ]---
>
> >+	}
> >+
> > 	return psci_to_linux_errno(err);
> > }
> >
> >--
> >2.17.1
> >
>
> Thanks,
> Lina