On Thu, Jun 21, 2018 at 3:58 AM, Roger Quadros <rogerq@xxxxxx> wrote: > +Rafael > > On 20/06/18 18:30, Samuel Morris wrote: >> On Wed, Jun 20, 2018 at 8:58 AM, Roger Quadros <rogerq@xxxxxx> wrote: >>> Tony, >>> >>> On 20/06/18 13:29, Tony Lindgren wrote: >>>> Hi, >>>> >>>> * Naresh Kamboju <naresh.kamboju@xxxxxxxxxx> [180620 05:55]: >>>>> Linux next (4.18.0-rc1-next-20180619) boot failed on beagle board x15. >>>> >>>> Bisect points to commit aece27a2f01b ("ata: ahci_platform: allow disabling of >>>> hotplug to save power"). >>>> >>>> Reverting the patch makes things work again. Any ideas what >>>> might be going wrong here? Things clearly idle but then there >>>> seems to be some register access with clocks disabled. >>> >>> The commit is doing this in probe. >>> >>> + pm_runtime_set_active(dev); >>> pm_runtime_enable(dev); >>> - pm_runtime_get_sync(dev); >>> + pm_runtime_forbid(dev); >>> >>> On OMAP, the device is not guaranteed to be active at probe and so we can't >>> say pm_runtime_set_active() and get rid of pm_runtime_get_sync(). >> >> Okay, by calling set_active(), I'm preventing the rpm_resume from >> completing that would normally happen in pm_runtime_forbid(). I assume >> you mean that there are parent devices that need to be resumed before > > Actually, in the OMAP case, the AHCI controller device isn't active when probe is called. > For other platforms this might not be the case. So we need to be careful here. > >> this device may be assumed active. I'm going to try removing the >> set_active(), then move that clause to the end of >> ahci_platform_init_host(). The pm_runtime_forbid() is effectively the >> same as get_sync(), it just also sets the runtime_auto flag to false. >> I don't think we should be saying the device is active until the host >> is initialized, so that seems like a better, common place for the >> pm_runtime init callbacks anyway. How does that sound? > > Device active and initialized are different things. If the device is powered up > and can be accessed it is active, even if it is not yet initialized. > I don't think we should club the two. > > Why do you need to call pm_runtime_set_active() at all in the probe sequence? What I meant was, I'm not going to call pm_runtime_set_active() anymore. The device status would be changed to RPM_ACTIVE at the end of rpm_resume() in either runtime_get() or runtime_forbid(). It is expected to be active after those have been called. Calling set_active() before the runtime_forbid() before was to prevent the device from actually calling the resume() callbacks, because the host had not yet been initialized, and resuming the host at that point was causing hangs. I was hoping that actually doing a runtime_resume() after the host has been initialized would work, and make all parent devices happy, though I am turning on a host that's already on and initialized. It should be just about the same as what was there before, just runtime_forbid() replacing get_sync(), and all that moved to the ahci_platform_init_host() function. Now that I think about it though, I want all the parent devices resumed before I resume the current device, and probably before host initialization, and I suspect that's why you need to call runtime_get_sync(). Maybe I should just put the ahci_platform_init_host() in the runtime_resume callback, and only run it the first time that resume() is called. That would require modifying all drivers that call ahci_platform_init_host() though. Or, I could just not call ahci_platform_resume_host() on the first runtime_resume(). That seems like the simplest solution. How about that? > > Documentation for pm_runtime_set_active() says, > "(it is only valid to use this function if 'power.runtime_error' is set > or 'power.disable_depth' is greater than zero);" > >> >>> >>> >>>> >>>>> Full log of trace please refer this link, >>>>> https://lkft.validation.linaro.org/scheduler/job/291535#L946 >>>>> >>>>> Here is back trace, >>>>> [ 3.561640] omap_rng 48090000.rng: Random Number Generator ver. 20 >>>>> [ 3.611949] brd: module loaded >>>>> [ 3.641298] loop: module loaded >>>>> [ 3.646435] at24 0-0050: 4096 byte 24c32 EEPROM, writable, 1 bytes/write >>>>> [ 3.658481] ------------[ cut here ]------------ >>>>> [ 3.663196] WARNING: CPU: 0 PID: 1 at >>>>> /srv/oe/build/tmp-rpb-glibc/work-shared/am57xx-evm/kernel-source/drivers/bus/omap_l3_noc.c:147 >>>>> l3_interrupt_handler+0x27c/0x39c >>>>> [ 3.678052] 44000000.ocp:L3 Custom Error: MASTER MPU TARGET L4_CFG >>>>> (Read): Data Access in User mode during Functional access >>>>> [ 3.689330] Modules linked in: >>>>> [ 3.692426] CPU: 0 PID: 1 Comm: swapper/0 Not tainted >>>>> 4.18.0-rc1-next-20180619 #1 >>>>> [ 3.699952] Hardware name: Generic DRA74X (Flattened Device Tree) >>>>> [ 3.706099] [<c04156f0>] (unwind_backtrace) from [<c040eec4>] >>>>> (show_stack+0x20/0x24) >>>>> [ 3.713897] [<c040eec4>] (show_stack) from [<c116a668>] >>>>> (dump_stack+0xb0/0xdc) >>>>> [ 3.721171] [<c116a668>] (dump_stack) from [<c0450a14>] (__warn+0x118/0x130) >>>>> [ 3.728270] [<c0450a14>] (__warn) from [<c0450a84>] >>>>> (warn_slowpath_fmt+0x58/0x74) >>>>> [ 3.735808] [<c0450a84>] (warn_slowpath_fmt) from [<c08be648>] >>>>> (l3_interrupt_handler+0x27c/0x39c) >>>>> [ 3.744742] [<c08be648>] (l3_interrupt_handler) from [<c04cd270>] >>>>> (__handle_irq_event_percpu+0x90/0x45c) >>>>> [ 3.754286] [<c04cd270>] (__handle_irq_event_percpu) from >>>>> [<c04cd678>] (handle_irq_event_percpu+0x3c/0x8c) >>>>> [ 3.764001] [<c04cd678>] (handle_irq_event_percpu) from >>>>> [<c04cd710>] (handle_irq_event+0x48/0x6c) >>>>> [ 3.772933] [<c04cd710>] (handle_irq_event) from [<c04d170c>] >>>>> (handle_fasteoi_irq+0xb0/0x164) >>>>> [ 3.781513] [<c04d170c>] (handle_fasteoi_irq) from [<c04cc344>] >>>>> (generic_handle_irq+0x34/0x44) >>>>> [ 3.790183] [<c04cc344>] (generic_handle_irq) from [<c04cca48>] >>>>> (__handle_domain_irq+0x6c/0xc4) >>>>> [ 3.798941] [<c04cca48>] (__handle_domain_irq) from [<c0402528>] >>>>> (gic_handle_irq+0x68/0xac) >>>>> [ 3.807347] [<c0402528>] (gic_handle_irq) from [<c04019bc>] >>>>> (__irq_svc+0x5c/0x7c) >>>>> [ 3.814874] Exception stack(0xee179c70 to 0xee179cb8) >>>>> [ 3.819963] 9c60: fc140000 >>>>> ee3e2f10 ee3a9810 00000000 >>>>> [ 3.828189] 9c80: 00000000 ee3e2f10 fc140004 fc140000 ee3a9800 >>>>> ee3a9810 00000000 ee179cdc >>>>> [ 3.836414] 9ca0: ee179ce0 ee179cc0 c0c3b364 c0c3b2ec 60000013 ffffffff >>>>> [ 3.843079] [<c04019bc>] (__irq_svc) from [<c0c3b2ec>] >>>>> (ahci_enable_ahci+0x24/0x70) >>>>> [ 3.850789] [<c0c3b2ec>] (ahci_enable_ahci) from [<c0c3b364>] >>>>> (ahci_save_initial_config+0x2c/0x388) >>>>> [ 3.859894] [<c0c3b364>] (ahci_save_initial_config) from >>>>> [<c0c3cbdc>] (ahci_platform_init_host+0x9c/0x4f4) >>>>> [ 3.869609] [<c0c3cbdc>] (ahci_platform_init_host) from >>>>> [<c0c3c33c>] (ahci_probe+0xa0/0xc0) >>>>> [ 3.878018] [<c0c3c33c>] (ahci_probe) from [<c0ba40f0>] >>>>> (platform_drv_probe+0x58/0xac) >>>>> [ 3.885989] [<c0ba40f0>] (platform_drv_probe) from [<c0ba1aec>] >>>>> (driver_probe_device+0x25c/0x338) >>>>> [ 3.894915] [<c0ba1aec>] (driver_probe_device) from [<c0ba1cbc>] >>>>> (__driver_attach+0xf4/0xf8) >>>>> [ 3.903411] [<c0ba1cbc>] (__driver_attach) from [<c0b9f9ac>] >>>>> (bus_for_each_dev+0x78/0xc4) >>>>> [ 3.911643] [<c0b9f9ac>] (bus_for_each_dev) from [<c0ba13b8>] >>>>> (driver_attach+0x2c/0x30) >>>>> [ 3.919696] [<c0ba13b8>] (driver_attach) from [<c0ba0cfc>] >>>>> (bus_add_driver+0x114/0x220) >>>>> [ 3.927752] [<c0ba0cfc>] (bus_add_driver) from [<c0ba2e2c>] >>>>> (driver_register+0x88/0x11c) >>>>> [ 3.935896] [<c0ba2e2c>] (driver_register) from [<c0ba4044>] >>>>> (__platform_driver_register+0x50/0x58) >>>>> [ 3.945003] [<c0ba4044>] (__platform_driver_register) from >>>>> [<c1aa9cf0>] (ahci_driver_init+0x24/0x28) >>>>> [ 3.954195] [<c1aa9cf0>] (ahci_driver_init) from [<c0403a8c>] >>>>> (do_one_initcall+0x9c/0x434) >>>>> [ 3.962517] [<c0403a8c>] (do_one_initcall) from [<c1a01410>] >>>>> (kernel_init_freeable+0x3b8/0x53c) >>>>> [ 3.971276] [<c1a01410>] (kernel_init_freeable) from [<c118551c>] >>>>> (kernel_init+0x18/0x128) >>>>> [ 3.979597] [<c118551c>] (kernel_init) from [<c04010ac>] >>>>> (ret_from_fork+0x14/0x28) >>>>> [ 3.987210] Exception stack(0xee179fb0 to 0xee179ff8) >>>>> [ 3.992298] 9fa0: 00000000 >>>>> 00000000 00000000 00000000 >>>>> [ 4.000524] 9fc0: 00000000 00000000 00000000 00000000 00000000 >>>>> 00000000 00000000 00000000 >>>>> [ 4.008750] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 >>>>> [ 4.015405] irq event stamp: 663130 >>>>> [ 4.018926] hardirqs last enabled at (663129): [<c118d9d4>] >>>>> _raw_spin_unlock_irqrestore+0x40/0x6c >>>>> [ 4.027939] hardirqs last disabled at (663130): [<c04019ac>] >>>>> __irq_svc+0x4c/0x7c >>>>> [ 4.035384] softirqs last enabled at (661486): [<c0402c98>] >>>>> __do_softirq+0x2a8/0x5e8 >>>>> [ 4.043262] softirqs last disabled at (661449): [<c04594a4>] >>>>> irq_exit+0x178/0x1a4 >>>>> [ 4.050826] ---[ end trace 095abac70926448f ]--- >>>>> >>>>> Best regards >>>>> Naresh Kamboju >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-omap" in >>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >>> -- >>> cheers, >>> -roger >>> >>> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. >>> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki > > -- > cheers, > -roger > > Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. > Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki -- To unsubscribe from this list: send the line "unsubscribe linux-next" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html