Quoting Imre Deak (2018-02-19 13:23:47)
> On Mon, Feb 19, 2018 at 12:50:46PM +0000, Chris Wilson wrote:
> > As the driver is called to handle circumstances beyond its control, we
> > cannot assume that the pm_runtime core is happy to see us. For example,
> > if we are called from shrink_slab to free up our pages during suspend,
> > rpm may be disabled and pm_runtime_get_if_in_use() decides to fail with
> > -EINVAL rather than simply say no. This is expected to happen, so don't
> > warn. For example,
> >
> > [  217.429228] Suspending console(s) (use no_console_suspend to debug)
> > [  217.557179] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > [  217.559399] sd 0:0:0:0: [sda] Stopping disk
> > [  218.661567] i915 0000:00:02.0: Resetting chip after gpu hang
> > [  219.523879] ------------[ cut here ]------------
> > [  219.524474] pm_runtime_get_if_in_use() failed: -22
> > [  219.524817] WARNING: CPU: 1 PID: 14 at drivers/gpu/drm/i915/intel_runtime_pm.c:3351 intel_runtime_pm_get_if_in_use+0xe3/0x150 [i915]
> > [  219.524836] Modules linked in: vgem i915 snd_hda_codec_realtek snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec r8169 lpc_ich snd_hwdep mii snd_hda_core snd_pcm prime_numbers
> > [  219.525054] CPU: 1 PID: 14 Comm: cpuhp/1 Tainted: G U 4.16.0-rc1-g740f57c54ecf-kasan_6+ #1
> > [  219.525070] Hardware name: /D510MO, BIOS MOPNV10J.86A.0311.2010.0802.2346 08/02/2010
> > [  219.525294] RIP: 0010:intel_runtime_pm_get_if_in_use+0xe3/0x150 [i915]
> > [  219.525313] RSP: 0018:ffff880018f5edf8 EFLAGS: 00010286
> > [  219.525344] RAX: dffffc0000000008 RBX: ffff880007fc0000 RCX: 0000000000000000
> > [  219.525361] RDX: 0000000000000001 RSI: ffffffff850609c0 RDI: ffffffff872992a0
> > [  219.525377] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> > [  219.525394] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880007fc0000
> > [  219.525411] R13: ffff880018f5f0f8 R14: ffff880007fc8de8 R15: ffff880018f5f0f0
> > [  219.525429] FS:  0000000000000000(0000) GS:ffff880019c80000(0000) knlGS:0000000000000000
> > [  219.525446] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  219.525463] CR2: 0000564df7897e86 CR3: 0000000000d7c000 CR4: 00000000000006e0
> > [  219.525478] Call Trace:
> > [  219.525734]  i915_gem_shrink+0x841/0xb50 [i915]
> > [  219.525802]  ? debug_check_no_locks_freed+0x2a0/0x2a0
> > [  219.525842]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> > [  219.526083]  ? i915_gem_shrinker_count+0x2f0/0x2f0 [i915]
> > [  219.526131]  ? lock_acquire+0x138/0x3c0
> > [  219.526157]  ? lock_acquire+0x138/0x3c0
> > [  219.526391]  ? shrinker_lock+0x49/0x210 [i915]
> > [  219.526465]  ? mutex_trylock+0x15c/0x1a0
> > [  219.526694]  ? shrinker_lock+0x49/0x210 [i915]
> > [  219.526969]  ? i915_gem_shrinker_scan+0xc4/0x320 [i915]
> > [  219.527200]  i915_gem_shrinker_scan+0xc4/0x320 [i915]
> > [  219.527448]  ? i915_gem_shrinker_vmap+0x3a0/0x3a0 [i915]
> > [  219.527533]  shrink_slab.part.18+0x2d0/0x8d0
> > [  219.527613]  ? unregister_shrinker+0x1f0/0x1f0
> > [  219.527668]  ? mem_cgroup_iter+0x37d/0xc50
> > [  219.527728]  shrink_node+0x882/0xbe0
> > [  219.527847]  ? shrink_node_memcg+0x11c0/0x11c0
> > [  219.527882]  ? mark_held_locks+0xa8/0xf0
> > [  219.527931]  ? trace_hardirqs_on_caller+0x33f/0x590
> > [  219.527961]  ? ktime_get+0xad/0x140
> > [  219.528015]  do_try_to_free_pages+0x2d3/0xd70
> > [  219.528099]  ? allow_direct_reclaim.part.23+0x1d0/0x1d0
> > [  219.528132]  ? shrink_node+0xbe0/0xbe0
> > [  219.528213]  try_to_free_pages+0x1cd/0x570
> > [  219.528257]  ? do_try_to_free_pages+0xd70/0xd70
> > [  219.528355]  __alloc_pages_nodemask+0xadf/0x2110
> > [  219.528423]  ? unwind_next_frame+0x870/0x1970
> > [  219.528465]  ? deref_stack_reg+0x97/0xc0
> > [  219.528503]  ? gfp_pfmemalloc_allowed+0x150/0x150
> > [  219.528539]  ? register_sched_domain_sysctl+0x23a/0x1b90
> > [  219.528588]  ? unwind_next_frame+0x138/0x1970
> > [  219.528619]  ? kthread+0x30a/0x3d0
> > [  219.528677]  ? __read_once_size_nocheck.constprop.4+0x10/0x10
> > [  219.528698]  ? deref_stack_reg+0xc0/0xc0
> > [  219.528762]  ? __save_stack_trace+0x6e/0xd0
> > [  219.528822]  depot_save_stack+0x3bc/0x430
> > [  219.528870]  kasan_kmalloc+0x142/0x170
> > [  219.528912]  ? __kmalloc+0xf7/0x340
> > [  219.528935]  ? register_sched_domain_sysctl+0x23a/0x1b90
> > [  219.528957]  ? partition_sched_domains+0x4d4/0x840
> > [  219.528978]  ? sched_cpu_deactivate+0x11b/0x150
> > [  219.529001]  ? cpuhp_invoke_callback+0x160/0x15f0
> > [  219.529023]  ? cpuhp_thread_fun+0x35e/0x710
> > [  219.529044]  ? smpboot_thread_fn+0x50a/0x7f0
> > [  219.529065]  ? kthread+0x30a/0x3d0
> > [  219.529086]  ? ret_from_fork+0x24/0x50
> > [  219.529141]  ? register_sched_domain_sysctl+0x23a/0x1b90
> > [  219.529169]  ? register_sched_domain_sysctl+0x23a/0x1b90
> > [  219.529198]  ? set_track+0x87/0x100
> > [  219.529225]  ? init_object+0x6e/0x80
> > [  219.529275]  ? ___slab_alloc.constprop.36+0x232/0x3e0
> > [  219.529303]  ? ___slab_alloc.constprop.36+0x232/0x3e0
> > [  219.529325]  ? register_sched_domain_sysctl+0x23a/0x1b90
> > [  219.529410]  ? mark_held_locks+0xa8/0xf0
> > [  219.529453]  ? register_sched_domain_sysctl+0x23a/0x1b90
> > [  219.529479]  ? trace_hardirqs_on_caller+0x33f/0x590
> > [  219.529532]  __kmalloc+0xf7/0x340
> > [  219.529557]  ? register_sched_domain_sysctl+0x23a/0x1b90
> > [  219.529604]  register_sched_domain_sysctl+0x23a/0x1b90
> > [  219.529684]  ? sched_debug_show+0x20/0x20
> > [  219.529713]  ? debug_object_activate+0x530/0x530
> > [  219.529771]  ? rcu_lockdep_current_cpu_online+0xdc/0x130
> > [  219.529802]  ? partition_sched_domains+0x4ae/0x840
> > [  219.529825]  ? rcu_read_lock_sched_held+0x10f/0x130
> > [  219.529875]  partition_sched_domains+0x4d4/0x840
> > [  219.529955]  ? sched_init_domains+0x110/0x110
> > [  219.529981]  ? __wait_rcu_gp+0x24f/0x390
> > [  219.530054]  sched_cpu_deactivate+0x11b/0x150
> > [  219.530086]  ? sched_cpu_activate+0x1e0/0x1e0
> > [  219.530112]  ? __call_rcu.constprop.53+0x680/0x680
> > [  219.530132]  ? call_rcu_bh+0x10/0x10
> > [  219.530166]  ? debug_check_no_locks_freed+0x2a0/0x2a0
> > [  219.530201]  ? trace_raw_output_rcu_utilization+0xa0/0xa0
> > [  219.530267]  ? trace_raw_output_rcu_utilization+0xa0/0xa0
> > [  219.530337]  ? rcu_lockdep_current_cpu_online+0xdc/0x130
> > [  219.530370]  ? sched_cpu_activate+0x1e0/0x1e0
> > [  219.530397]  cpuhp_invoke_callback+0x160/0x15f0
> > [  219.530424]  ? lock_acquire+0x138/0x3c0
> > [  219.530445]  ? lock_acquire+0x138/0x3c0
> > [  219.530471]  ? cpuhp_thread_fun+0xaf/0x710
> > [  219.530507]  ? pci_mmcfg_check_reserved+0x100/0x100
> > [  219.530565]  cpuhp_thread_fun+0x35e/0x710
> > [  219.530618]  ? cpuhp_complete_idle_dead+0x10/0x10
> > [  219.530639]  smpboot_thread_fn+0x50a/0x7f0
> > [  219.530678]  ? sort_range+0x20/0x20
> > [  219.530709]  ? __kthread_parkme+0xba/0x1f0
> > [  219.530739]  ? schedule+0x84/0x1a0
> > [  219.530768]  ? __kthread_parkme+0xbf/0x1f0
> > [  219.530805]  ? sort_range+0x20/0x20
> > [  219.530831]  kthread+0x30a/0x3d0
> > [  219.530859]  ? __kthread_create_on_node+0xb0/0xb0
> > [  219.530900]  ret_from_fork+0x24/0x50
> > [  219.530999] Code: 01 00 00 00 85 c0 74 4a 89 e8 5b 5d c3 80 3d 48 37 4e 00 00 75 f2 89 c6 48 c7 c7 40 f0 61 c0 c6 05 36 37 4e 00 01 e8 ed 2a e1 c2 <0f> ff eb d9 80 3d 3f 37 4e 00 00 75 94 48 c7 c7 60 e8 61 c0 c6
> > [  219.531880] ---[ end trace 18ec0139488ea0c8 ]---
> > [  219.607967] IRQ 16: no longer affine to CPU1
> > [  219.670291] IRQ 24: no longer affine to CPU2
> > [  219.701489] IRQ 8: no longer affine to CPU3
> > [  219.701529] IRQ 9: no longer affine to CPU3
> > [  219.701582] IRQ 18: no longer affine to CPU3
> > [  219.701640] IRQ 25: no longer affine to CPU3
> > [  219.743857] cache: parent cpu1 should not be sleeping
> > [  219.784549] cache: parent cpu2 should not be sleeping
> > [  219.816041] cache: parent cpu3 should not be sleeping
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Imre Deak <imre.deak@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/intel_runtime_pm.c | 10 +++-------
> >  1 file changed, 3 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > index 16790f2576ec..73ae17065e7c 100644
> > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > @@ -3335,11 +3335,9 @@ void intel_runtime_pm_get(struct drm_i915_private *dev_priv)
> >   */
> >  bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv)
> >  {
> > -	struct pci_dev *pdev = dev_priv->drm.pdev;
> > -	struct device *kdev = &pdev->dev;
> > -
> >  	if (IS_ENABLED(CONFIG_PM)) {
> > -		int ret = pm_runtime_get_if_in_use(kdev);
> > +		struct pci_dev *pdev = dev_priv->drm.pdev;
> > +		struct device *kdev = &pdev->dev;
> >
> >  		/*
> >  		 * In cases runtime PM is disabled by the RPM core and we get
> > @@ -3347,9 +3345,7 @@ bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv)
> >  		 * function, since the power state is undefined. This applies
> >  		 * atm to the late/early system suspend/resume handlers.
> >  		 */
> > -		WARN_ONCE(ret < 0,
> > -			  "pm_runtime_get_if_in_use() failed: %d\n", ret);
> > -		if (ret <= 0)
> > +		if (pm_runtime_get_if_in_use(kdev) <= 0)
> >  			return false;
>
> Ok, this case then means the same thing for the caller: don't access the HW.
> The above comment needs to be updated too, with that:

I felt the code comment hinting at where the ret < 0 comes from worked
well on its own, so I took your suggestion and added:

@@ -3328,10 +3328,13 @@ void intel_runtime_pm_get(struct drm_i915_private *dev_priv)
  * @dev_priv: i915 device instance
  *
  * This function grabs a device-level runtime pm reference if the device is
- * already in use and ensures that it is powered up.
+ * already in use and ensures that it is powered up. It is illegal to try
+ * and access the HW should intel_runtime_pm_get_if_in_use() report failure.
  *
  * Any runtime pm reference obtained by this function must have a symmetric
  * call to intel_runtime_pm_put() to release the reference again.
+ *
+ * Returns: True if the wakeref was acquired, or False otherwise.
  */

to the kerneldoc instead.
-Chris
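
To illustrate the contract the kerneldoc above spells out, here is a
minimal sketch assuming a shrinker-style caller. The entry points
intel_runtime_pm_get_if_in_use() and intel_runtime_pm_put() are the real
i915 API as of this patch; hypothetical_shrink() and do_hw_work() are
made up for illustration.

/* Minimal sketch: hypothetical caller, real i915 entry points. */
static unsigned long hypothetical_shrink(struct drm_i915_private *i915)
{
	unsigned long freed;

	/*
	 * False covers both "device suspended" and "runtime PM disabled
	 * by the core", e.g. around system suspend where
	 * pm_runtime_get_if_in_use() returns -EINVAL. Either way the
	 * power state is undefined, so do not touch the HW.
	 */
	if (!intel_runtime_pm_get_if_in_use(i915))
		return 0;

	freed = do_hw_work(i915); /* safe, the device is awake */

	intel_runtime_pm_put(i915); /* balance the wakeref */
	return freed;
}

Folding the ret < 0 case into a plain boolean keeps every caller to this
single check, rather than teaching each one about the RPM core's error
codes.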