On 06.11.19 23:36, Leonard Crestez wrote:
> On 06.11.2019 13:59, Martin Kepplinger wrote:
>> On 04.11.19 11:35, Abel Vesa wrote:
>>> On 19-11-04 09:49:18, Martin Kepplinger wrote:
>>>> On 30.10.19 09:08, Abel Vesa wrote:
>>>>> On 19-10-30 07:11:37, Martin Kepplinger wrote:
>>>>>> On 23.06.19 13:47, Martin Kepplinger wrote:
>>>>>>> On 10.06.19 14:13, Abel Vesa wrote:
>>>>>>>> This is another alternative for the RFC:
>>>>>>>> https://lkml.org/lkml/2019/3/27/545
>>>>>>>>
>>>>>>>> This new workaround proposal is a little bit more hacky but more contained
>>>>>>>> since everything is done within the irq-imx-gpcv2 driver.
>>>>>>>>
>>>>>>>> Basically, it 'hijacks' the registered gic_raise_softirq __smp_cross_call
>>>>>>>> handler and registers instead a wrapper which calls in the 'hijacked'
>>>>>>>> handler, after that calling into EL3 which will take care of the actual
>>>>>>>> wake up. This time, instead of expanding the PSCI ABI, we use a new vendor SIP.
>>>>>>>>
>>>>>>>> I also have the patches ready for TF-A but I'll hold on to them until I see if
>>>>>>>> this has a chance of getting in.
>>>>>>>
>>>>>>
>>>>>> Hi Abel,
>>>>>>
>>>>>> Running this workaround doesn't seem to work anymore on 5.4-rcX. Linux
>>>>>> doesn't boot, with ATF unchanged (includes your workaround changes). I
>>>>>> can try to add more details to this...
>>>>>>
>>>>>
>>>>> This is happening because the system counter is now enabled on 8mq.
>>>>> The irq-imx-gpcv2 driver uses irq_chip_set_affinity_parent as its
>>>>> irq_set_affinity, since the current implementation of the driver
>>>>> relies on the GIC to set the right affinity. On a SoC
>>>>> that has the wake_request signals linked to the power controller this
>>>>> works fine. Since the system counter is actually the tick broadcast
>>>>> device and the set affinity relies only on GIC, the cores can't be
>>>>> woken up by the broadcast interrupt.
>>>>>
>>>>>> Have you tested this for 5.4? Could you update this workaround? Please
>>>>>> let me know if I missed any earlier update on this (having a cpu-sleep
>>>>>> idle state).
>>>>>>
>>>>>
>>>>> The solution is to implement the set affinity in the irq-imx-gpcv2 driver
>>>>> which would allow the gpc to wake up the target core when the broadcast
>>>>> irq arrives.
>>>>>
>>>>> I have a patch for this. I just need to clean it up a little bit.
>>>>> Unfortunately, it won't go upstream since everyone thinks the gic
>>>>> should be the one to control the affinity. This obviously doesn't work
>>>>> on 8mq.
>>>>>
>>>>> Currently, I'm at ELCE in Lyon. I will be back at the office tomorrow
>>>>> and send you what I have.
>>>>>
>>>>
>>>> Hi Abel,
>>>>
>>>> Do you have any news on said patch for testing? That'd be great for my
>>>> planning.
>>>>
>>>
>>> Sorry for the late answer.
>>>
>>> I'm dropping here the diff.
>>>
>>> Please keep in mind that this is _not_ an official solution.
>>>
>>> ---
>>>  drivers/irqchip/irq-imx-gpcv2.c | 42 ++++++++++++++++++++++++++++++++++++++++-
>>>  1 file changed, 41 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/irqchip/irq-imx-gpcv2.c b/drivers/irqchip/irq-imx-gpcv2.c
>>> index 01ce6f4..3150588 100644
>>> --- a/drivers/irqchip/irq-imx-gpcv2.c
>>> +++ b/drivers/irqchip/irq-imx-gpcv2.c
>>> @@ -41,6 +41,24 @@ static void __iomem *gpcv2_idx_to_reg(struct gpcv2_irqchip_data *cd, int i)
>>>  	return cd->gpc_base + cd->cpu2wakeup + i * 4;
>>>  }
>>>
>>> +static void __iomem *gpcv2_idx_to_reg_cpu(struct gpcv2_irqchip_data *cd,
>>> +					  int i, int cpu)
>>> +{
>>> +	u32 offset = GPC_IMR1_CORE0;
>>> +	switch(cpu) {
>>> +	case 1:
>>> +		offset = GPC_IMR1_CORE1;
>>> +		break;
>>> +	case 2:
>>> +		offset = GPC_IMR1_CORE2;
>>> +		break;
>>> +	case 3:
>>> +		offset = GPC_IMR1_CORE3;
>>> +		break;
>>> +	}
>>> +	return cd->gpc_base + offset + i * 4;
>>> +}
>>> +
>>>  static int gpcv2_wakeup_source_save(void)
>>>  {
>>>  	struct gpcv2_irqchip_data *cd;
>>> @@ -163,6 +181,28 @@ static void imx_gpcv2_irq_mask(struct irq_data *d)
>>>  	irq_chip_mask_parent(d);
>>>  }
>>>
>>> +static int imx_gpcv2_irq_set_affinity(struct irq_data *d,
>>> +				 const struct cpumask *dest, bool force)
>>> +{
>>> +	struct gpcv2_irqchip_data *cd = d->chip_data;
>>> +	void __iomem *reg;
>>> +	u32 val;
>>> +	int cpu;
>>> +
>>> +	for_each_possible_cpu(cpu) {
>>> +		raw_spin_lock(&cd->rlock);
>>> +		reg = gpcv2_idx_to_reg_cpu(cd, d->hwirq / 32, cpu);
>>> +		val = readl_relaxed(reg);
>>> +		val |= BIT(d->hwirq % 32);
>>> +		if (cpumask_test_cpu(cpu, dest))
>>> +			val &= ~BIT(d->hwirq % 32);
>>> +		writel_relaxed(val, reg);
>>> +		raw_spin_unlock(&cd->rlock);
>>> +	}
>>> +
>>> +	return irq_chip_set_affinity_parent(d, dest, force);
>>> +}
>>> +
>>>  static struct irq_chip gpcv2_irqchip_data_chip = {
>>>  	.name			= "GPCv2",
>>>  	.irq_eoi		= irq_chip_eoi_parent,
>>> @@ -172,7 +212,7 @@ static struct irq_chip gpcv2_irqchip_data_chip = {
>>>  	.irq_retrigger		= irq_chip_retrigger_hierarchy,
>>>  	.irq_set_type		= irq_chip_set_type_parent,
>>>  #ifdef CONFIG_SMP
>>> -	.irq_set_affinity	= irq_chip_set_affinity_parent,
>>> +	.irq_set_affinity	= imx_gpcv2_irq_set_affinity,
>>>  #endif
>>>  };
>
> This is prone to race conditions.
>
> In the NXP tree there is a different gpcv2 irqchip driver which does all GPC
> IMR register manipulation in TF-A through SMC calls. The cpuidle
> workaround also manipulates the same registers and does so safely under
> a lock.
>
> If the OS also writes to the same IMR register then set_affinity for SPIs 1-31
> can potentially race with one of those cores being woken up. This is very
> unlikely (set_affinity calls are rare) but in the worst case the system
> could still hang on a lost IPI.
>
>> I guess this diff does not apply when using this reworked change:
>> https://source.puri.sm/Librem5/linux-next/commit/e59807ae0e236512761b751abc84a9b129d7fcda
>> which has worked for me when running 5.3.
>>
>> At least on 5.4-rc5, using your change, I still get
>>
>> cat /sys/devices/system/cpu/cpuidle/current_driver
>> none
>
> This reads "psci_idle" for me in linux-next on imx8mm. Your problem
> seems to be related to probing the cpuidle driver, not related to any
> hardware workarounds.

Thanks, I see the "psci_idle" driver too, but I'm not able to boot from
flashed eMMC with `ARM_PSCI_CPUIDLE` enabled!

Both logs below show the last messages printed before startup hangs:

```
[    1.638207] imx-cpufreq-dt imx-cpufreq-dt: cpu speed grade 3 mkt segment 0 supported-hw 0x8 0x1
[    1.683487] mmc1: SDHCI controller on 30b50000.mmc [30b50000.mmc] using ADMA
[    1.695528] input: gpio-keys as /devices/platform/gpio-keys/input/input0
[    1.708037] input: bd718xx-pwrkey as /devices/platform/soc@0/soc@0:bus@30800000/30a20000.i2c/i2c-0/0-004b/gpio-keys.0.auto/input/input1
[    1.721939] snvs_rtc 30370000.snvs:snvs-rtc-lp: setting system clock to 1970-01-01T00:00:00 UTC (0)
[    1.723543] mmc1: new high speed SDIO card at address fffd
```

but the psci checker (when configured in) seems to be ok:

```
[    1.717281] imx-cpufreq-dt imx-cpufreq-dt: cpu speed grade 3 mkt segment 0 supported-hw 0x8 0x1
[    1.763172] mmc1: SDHCI controller on 30b50000.mmc [30b50000.mmc] using ADMA
[    1.775368] input: gpio-keys as /devices/platform/gpio-keys/input/input1
[    1.784397] input: bd718xx-pwrkey as /devices/platform/soc@0/soc@0:bus@30800000/30a20000.i2c/i2c-0/0-004b/gpio-keys.0.auto/input/input2
[    1.798160] snvs_rtc 30370000.snvs:snvs-rtc-lp: setting system clock to 1970-01-01T00:00:00 UTC (0)
[    1.807668] psci_checker: PSCI checker started using 4 CPUs
[    1.813500] psci_checker: Starting hotplug tests
[    1.818351] psci_checker: Trying to turn off and on again all CPUs
[    1.826388] IRQ 6: no longer affine to CPU0
[    1.826805] CPU0: shutdown
[    1.834060] psci: CPU0 killed.
[    1.840096] CPU1: shutdown
[    1.842938] psci: CPU1 killed.
[    1.848633] CPU2: shutdown
[    1.851500] psci: CPU2 killed.
[    1.856376] Detected VIPT I-cache on CPU0
[    1.856407] GICv3: CPU0: found redistributor 0 region 0:0x0000000038880000
[    1.856459] CPU0: Booted secondary processor 0x0000000000 [0x410fd034]
[    1.862897] mmc1: new high speed SDIO card at address fffd
[    1.882136] Detected VIPT I-cache on CPU1
[    1.882155] GICv3: CPU1: found redistributor 1 region 0:0x00000000388a0000
[    1.882186] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[    1.902604] Detected VIPT I-cache on CPU2
[    1.902624] GICv3: CPU2: found redistributor 2 region 0:0x00000000388c0000
[    1.902653] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[    1.921604] psci_checker: Trying to turn off and on again group 0 (CPUs 0-3)
[    1.930565] IRQ 6: no longer affine to CPU0
[    1.930691] CPU0: shutdown
[    1.937961] psci: CPU0 killed.
[    1.942402] IRQ 6: no longer affine to CPU1
[    1.942518] CPU1: shutdown
[    1.949759] psci: CPU1 killed.
[    1.954370] CPU2: shutdown
[    1.957249] psci: CPU2 killed.
[    1.961582] Detected VIPT I-cache on CPU0
[    1.961600] GICv3: CPU0: found redistributor 0 region 0:0x0000000038880000
[    1.961632] CPU0: Booted secondary processor 0x0000000000 [0x410fd034]
[    1.981892] Detected VIPT I-cache on CPU1
[    1.981910] GICv3: CPU1: found redistributor 1 region 0:0x00000000388a0000
[    1.981941] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[    2.002301] Detected VIPT I-cache on CPU2
[    2.002319] GICv3: CPU2: found redistributor 2 region 0:0x00000000388c0000
[    2.002348] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[    2.021288] psci_checker: Hotplug tests passed OK
[    2.026241] psci_checker: Starting suspend tests (10 cycles per state)
[    2.033683] psci_checker: CPU 1 entering suspend cycles, states 1 through 1
[    2.033685] psci_checker: CPU 3 entering suspend cycles, states 1 through 1
[    2.033687] psci_checker: CPU 0 entering suspend cycles, states 1 through 1
[    2.033689] psci_checker: CPU 2 entering suspend cycles, states 1 through 1
[    2.091607] psci_checker: CPU 0 suspend test results: success 10, shallow states 0, errors 0
[    2.100497] psci_checker: CPU 1 suspend test results: success 10, shallow states 0, errors 0
[    2.109361] psci_checker: CPU 2 suspend test results: success 10, shallow states 0, errors 0
[    2.118227] psci_checker: CPU 3 suspend test results: success 10, shallow states 0, errors 0
[    2.127106] psci_checker: Suspend tests passed OK
[    2.132030] psci_checker: PSCI checker completed
```

(Also, when booted via SDP, I can't wake up from S3 or reboot.)

All of the above worked with v5.3. Do you know what I could be doing wrong
on 5.4?

thanks!

                                 martin

>
>> But also when trying to rewrite your patch against irq-gic-v3.c at least
>> nothing changes for me (I might have done that wrong as well though).
>>
>> What needs to change (in order to have the cpu-sleep state / idle
>> driver) based on the above "reworked" workaround?
>>
>> Could the config have changed? CONFIG_ARM_CPUIDLE should be the only
>> needed path, or did things change there in 5.4?
>
> It seems there were some recent cleanups in the cpuidle psci core code,
> maybe you need config updates?
>
> https://patchwork.kernel.org/cover/11052723/

ARM_CPUIDLE is basically replaced with ARM_PSCI_CPUIDLE

>
>> I know all this is no real solution, but currently the only way to have
>> said sleep state on top of mainline. so be it for now.
> Can you use the gpcv2 driver from NXP tree?
>
> --
> Regards,
> Leonard
>
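
For reference, the per-core mask update done by the quoted imx_gpcv2_irq_set_affinity() can be modeled in plain userspace C. This is only a minimal sketch, not driver code: the `imr` array, `NR_CORES`, `IMR_WORDS` and `model_set_affinity()` are hypothetical names standing in for the GPC IMR registers, and the sketch assumes four cores with four 32-bit IMR words each, where a set bit masks a wakeup source and a cleared bit lets it wake that core. Locking and the call into the parent GIC are left out on purpose.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CORES  4	/* assumption: quad-core SoC such as the 8mq */
#define IMR_WORDS 4	/* assumption: four 32-bit IMR words per core */

/*
 * Hypothetical stand-in for the per-core GPC IMR registers.
 * Set bit: interrupt is masked as a wakeup source for that core.
 * Cleared bit: interrupt may wake that core.
 */
static uint32_t imr[NR_CORES][IMR_WORDS];

/*
 * Model of the loop in the quoted diff: mask hwirq on every core,
 * then unmask it only on the cores present in dest_mask
 * (bit N of dest_mask stands for CPU N).
 */
static void model_set_affinity(unsigned int hwirq, unsigned int dest_mask)
{
	unsigned int word = hwirq / 32;
	uint32_t bit = UINT32_C(1) << (hwirq % 32);
	unsigned int cpu;

	assert(word < IMR_WORDS);

	for (cpu = 0; cpu < NR_CORES; cpu++) {
		uint32_t val = imr[cpu][word];

		val |= bit;			/* default: masked */
		if (dest_mask & (1u << cpu))
			val &= ~bit;		/* target core: unmasked */
		imr[cpu][word] = val;
	}
}
```

Calling model_set_affinity(33, 0x2) on zeroed state leaves bit 1 of word 1 cleared for core 1 and set for the other three cores. Each update is a read-modify-write of a register word shared with other interrupts, which is the window Leonard's race concern applies to if TF-A touches the same word at the same time.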