Re: Call trace on 4.12.0-rc1

Hi,

On Thu, May 18, 2017 at 12:08:23PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 17-05-17 11:36, Ville Syrjälä wrote:
> >On Tue, May 16, 2017 at 10:43:39PM +0200, Hans de Goede wrote:
> >>Hi,
> >>
> >>On 05/16/2017 09:55 PM, FKr wrote:
> >>>Hi,
> >>>I'm using 4.12.0-rc1 from https://github.com/jwrdegoede/linux-sunxi and got
> >>>the following weird trace yesterday. Previously I've been getting output
> >>>similar to https://www.spinics.net/lists/intel-gfx/msg127638.html; on some
> >>>boots of 4.12.0-rc1 I don't get any trace at all.
> >>
> >>This is really weird, we are getting the error while we are trying to
> >>acquire the wakelock ... ? Or do we need some other lock before we can
> >>take the wakelock ?
> >
> >You would also need the runtime pm reference. IIRC we unregister the
> >notifier in runtime suspend, so I think intel_runtime_pm_get_noresume()
> >should be OK in this case. Imre?
> 
> Thank you for the reply. You're right about not registering the
> notifier on runtime-resume; I will send a fix for that.
> 
> Now about the oops this thread is about, I believe this is triggered
> by a race condition, intel_runtime_pm_put() does:
> 
>         atomic_dec(&dev_priv->pm.wakeref_count);
> 
>         pm_runtime_mark_last_busy(kdev);
>         pm_runtime_put_autosuspend(kdev);
> 
> And pm_runtime_put_autosuspend() calls intel_runtime_suspend() from
> a workqueue, so there is ample time between the atomic_dec and
> intel_runtime_suspend() unregistering the notifier for this oops to
> happen.

wakeref_count is used to catch any HW access made without calling
intel_runtime_pm_get() first. That covers the majority of cases. For the
exceptions, such as the pmic lock, you need to synchronize the HW access
manually against the runtime suspend hook and call
disable_rpm_wakeref_asserts() around the HW access.
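
To make the mechanism concrete, here is a minimal userspace sketch of the
check, not the i915 code itself. Every model_* name is hypothetical, and it
assumes that the disable/enable helpers simply raise and drop the same
counter that the assert reads:

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace model only; all model_* names are made up for illustration. */
struct model_rpm {
	atomic_int wakeref_count; /* stands in for dev_priv->pm.wakeref_count */
	int warnings;             /* counts "wakelock ref not held" reports */
};

static inline void model_rpm_init(struct model_rpm *rpm)
{
	atomic_init(&rpm->wakeref_count, 0);
	rpm->warnings = 0;
}

static inline void model_rpm_get(struct model_rpm *rpm)
{
	atomic_fetch_add(&rpm->wakeref_count, 1);
}

static inline void model_rpm_put(struct model_rpm *rpm)
{
	/* In the driver the real suspend happens later, from a workqueue. */
	atomic_fetch_sub(&rpm->wakeref_count, 1);
}

/* Assumption: silencing the assert is modelled as a counter bump. */
static inline void model_rpm_disable_asserts(struct model_rpm *rpm)
{
	atomic_fetch_add(&rpm->wakeref_count, 1);
}

static inline void model_rpm_enable_asserts(struct model_rpm *rpm)
{
	/* Guard against an enable without a matching disable. */
	assert(atomic_load(&rpm->wakeref_count) > 0);
	atomic_fetch_sub(&rpm->wakeref_count, 1);
}

/* Any HW access checks the counter, like assert_rpm_wakelock_held(). */
static inline void model_rpm_hw_access(struct model_rpm *rpm)
{
	if (atomic_load(&rpm->wakeref_count) == 0)
		rpm->warnings++;
}

/* Access after the last put but before suspend has run: warns. */
static int model_rpm_unprotected_access(void)
{
	struct model_rpm rpm;

	model_rpm_init(&rpm);
	model_rpm_get(&rpm);
	model_rpm_put(&rpm);
	model_rpm_hw_access(&rpm);
	return rpm.warnings;
}

/* Same access, bracketed by the disable/enable pair: silent. */
static int model_rpm_bracketed_access(void)
{
	struct model_rpm rpm;

	model_rpm_init(&rpm);
	model_rpm_get(&rpm);
	model_rpm_put(&rpm);
	model_rpm_disable_asserts(&rpm);
	model_rpm_hw_access(&rpm);
	model_rpm_enable_asserts(&rpm);
	return rpm.warnings;
}
```

The bracketing only hides the warning. Whether the access is actually safe
still depends on the manual synchronization with the runtime suspend hook.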

> So this really is a false-positive.
> 
> Which I believe you already realized and is why you suggested using
> intel_runtime_pm_get_noresume().
> 
> It took me a while to realize that this is a false positive and all we
> really want to do is silence the message and that we don't really need
> to take a runtime_pm reference, because we unregister the notifier
> (yes I wrote those bits myself, but I've done a lot of other stuff
> since).
> 
> Unfortunately calling intel_runtime_pm_get_noresume() as you suggest
> will not work because it contains an assert_rpm_wakelock_held()
> call itself.
> 
> So I believe that the only way to silence the false-positive WARN_ON
> would be for the notifier to call disable_rpm_wakeref_asserts()
> and enable_rpm_wakeref_asserts() around the
> intel_uncore_forcewake_get(FORCEWAKE_ALL) call, would that be an
> acceptable fix ?

Yes, looks ok to me with a comment explaining the synchronization with
runtime suspend. The unregistration of the pmic notifier should happen
before vlv_suspend_complete() as Ville pointed out.
intel_uncore_suspend() could be moved before all the platform specific
suspend handlers as nothing afterwards needs forcewake.
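
A rough userspace sketch of the notifier shape being discussed, with the
kind of comment I have in mind. This is a model under stated assumptions,
not a patch: the model_* and MODEL_* names are hypothetical stand-ins, the
forcewake helpers are reduced to counters, and only the get side is checked:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the PMIC bus access notifier actions. */
enum model_pmic_action {
	MODEL_PMIC_BUS_ACCESS_BEGIN,
	MODEL_PMIC_BUS_ACCESS_END,
};

/* Userspace model of the driver state touched by the notifier. */
struct model_i915 {
	atomic_int wakeref_count; /* models dev_priv->pm.wakeref_count */
	int forcewake_count;      /* models the FORCEWAKE_ALL references */
	int warnings;             /* counts "wakelock ref not held" reports */
};

static inline void model_i915_init(struct model_i915 *i915)
{
	atomic_init(&i915->wakeref_count, 0);
	i915->forcewake_count = 0;
	i915->warnings = 0;
}

/* Models intel_uncore_forcewake_get(), including its wakelock check. */
static inline void model_forcewake_get(struct model_i915 *i915)
{
	if (atomic_load(&i915->wakeref_count) == 0)
		i915->warnings++;
	i915->forcewake_count++;
}

static inline void model_forcewake_put(struct model_i915 *i915)
{
	assert(i915->forcewake_count > 0);
	i915->forcewake_count--;
}

static void model_pmic_bus_access_notifier(struct model_i915 *i915,
					   enum model_pmic_action action)
{
	switch (action) {
	case MODEL_PMIC_BUS_ACCESS_BEGIN:
		/*
		 * The notifier is unregistered in runtime suspend, so the
		 * device cannot be suspended while we run here. No RPM
		 * reference is needed; only the wakeref assert has to be
		 * silenced for the duration of the forcewake get.
		 */
		atomic_fetch_add(&i915->wakeref_count, 1); /* disable asserts */
		model_forcewake_get(i915);
		atomic_fetch_sub(&i915->wakeref_count, 1); /* enable asserts */
		break;
	case MODEL_PMIC_BUS_ACCESS_END:
		model_forcewake_put(i915);
		break;
	}
}
```

The wakeref counter is back at its old value as soon as the BEGIN case
returns, so the bracketing does not change when the device is allowed to
suspend.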

--Imre

> 
> Note that we would then still have the
> assert_rpm_device_not_suspended() check implied by
> assert_rpm_wakelock_held() to ensure that we're
> not calling intel_uncore_forcewake_get during suspend.
> 
> Regards,
> 
> Hans
> 
> 
> 
> >>>[ 2383.844192] perf: interrupt took too long (2522 > 2500), lowering
> >>>kernel.perf_event_max_sample_rate to 79200
> >>>[ 2634.863978] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on
> >>>pipe A (start=157909 end=157910) time 322 us, min 1073, max 1079, scanline
> >>>start 1063, end 1084
> >>>[ 2647.881794] perf: interrupt took too long (3193 > 3152), lowering
> >>>kernel.perf_event_max_sample_rate to 62400
> >>>[ 3297.857921] perf: interrupt took too long (4020 > 3991), lowering
> >>>kernel.perf_event_max_sample_rate to 49500
> >>>[ 4670.977136] mmc0: Tuning timeout, falling back to fixed sampling clock
> >>>[ 4671.436604] mmc0: Tuning timeout, falling back to fixed sampling clock
> >>>[ 4680.756302] mmc0: Tuning timeout, falling back to fixed sampling clock
> >>>[ 4707.846872] perf: interrupt took too long (5046 > 5025), lowering
> >>>kernel.perf_event_max_sample_rate to 39600
> >>>[ 4846.672969] RPM wakelock ref not held during HW access
> >>>[ 4846.673050] ------------[ cut here ]------------
> >>>[ 4846.673084] WARNING: CPU: 0 PID: 5227 at drivers/gpu/drm/i915/intel_drv.h:
> >>>1780 intel_uncore_forcewake_get+0xa0/0xb0
> >>>[ 4846.673088] Modules linked in: snd_soc_sst_cht_bsw_nau8824 btusb btintel
> >>>bluetooth axp288_fuel_gauge ecdh_generic axp288_charger extcon_axp288
> >>>axp288_adc snd_hdmi_lpe_audio snd_intel_sst_acpi extcon_intel_int3496
> >>>snd_intel_sst_core extcon_core snd_soc_nau8824 snd_soc_sst_atom_hifi2_platform
> >>>snd_soc_core snd_compress snd_soc_sst_match snd_pcm snd_timer kxcjk_1013
> >>>industrialio_triggered_buffer intel_cht_int33fe snd soundcore intel_int0002
> >>>[ 4846.673201] CPU: 0 PID: 5227 Comm: kworker/0:1 Not tainted 4.12.0-rc1+ #3
> >>>[ 4846.673206] Hardware name: MEDION E2228T MD60250/NT16H, BIOS 5.11
> >>>02/27/2017
> >>>[ 4846.673224] Workqueue: events fuel_gauge_status_monitor [axp288_fuel_gauge]
> >>>[ 4846.673235] task: ffff8800a85be800 task.stack: ffffc90002610000
> >>>[ 4846.673248] RIP: 0010:intel_uncore_forcewake_get+0xa0/0xb0
> >>>[ 4846.673255] RSP: 0018:ffffc90002613aa0 EFLAGS: 00010286
> >>>[ 4846.673265] RAX: 000000000000002a RBX: ffff880136cc8000 RCX:
> >>>ffffffff82063e88
> >>>[ 4846.673272] RDX: 0000000000000000 RSI: 0000000000000082 RDI:
> >>>0000000000000247
> >>>[ 4846.673278] RBP: ffffc90002613ac0 R08: 000000000000002a R09:
> >>>00000000000002ac
> >>>[ 4846.673284] R10: 0000000000000001 R11: 0000000000000000 R12:
> >>>0000000000000007
> >>>[ 4846.673289] R13: 0000000000000001 R14: 0000000000000000 R15:
> >>>0000000000000000
> >>>[ 4846.673298] FS:  0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:
> >>>0000000000000000
> >>>[ 4846.673305] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>[ 4846.673311] CR2: 00000758c709c000 CR3: 0000000002009000 CR4:
> >>>00000000001006f0
> >>>[ 4846.673317] Call Trace:
> >>>[ 4846.673340]  i915_pmic_bus_access_notifier+0x37/0x40
> >>>[ 4846.673354]  notifier_call_chain+0x4a/0x70
> >>>[ 4846.673368]  __blocking_notifier_call_chain+0x47/0x60
> >>>[ 4846.673380]  blocking_notifier_call_chain+0x16/0x20
> >>>[ 4846.673393]  iosf_mbi_call_pmic_bus_access_notifier_chain+0x1b/0x20
> >>>[ 4846.673406]  baytrail_i2c_acquire+0x64/0x220
> >>>[ 4846.673420]  i2c_dw_acquire_lock+0x21/0x50
> >>>[ 4846.673431]  i2c_dw_xfer+0xa3/0x4a0
> >>>[ 4846.673444]  __i2c_transfer+0x115/0x430
> >>>[ 4846.673456]  i2c_transfer+0x5c/0xc0
> >>>[ 4846.673469]  regmap_i2c_read+0x6d/0xa0
> >>>[ 4846.673482]  _regmap_raw_read+0xda/0x210
> >>>[ 4846.673491]  ? _regmap_raw_read+0xda/0x210
> >>>[ 4846.673503]  ? __update_load_avg_se.isra.5+0x15b/0x180
> >>>[ 4846.673514]  _regmap_bus_read+0x2a/0x60
> >>>[ 4846.673524]  _regmap_read+0x6c/0x130
> >>>[ 4846.673535]  regmap_read+0x3f/0x60
> >>>[ 4846.673549]  fuel_gauge_reg_readb.isra.5+0x40/0x90 [axp288_fuel_gauge]
> >>>[ 4846.673564]  fuel_gauge_get_status+0x2d/0x100 [axp288_fuel_gauge]
> >>>[ 4846.673577]  ? __schedule+0x2e3/0x840
> >>>[ 4846.673590]  fuel_gauge_status_monitor+0x16/0x40 [axp288_fuel_gauge]
> >>>[ 4846.673603]  process_one_work+0x1e0/0x420
> >>>[ 4846.673616]  worker_thread+0x48/0x3f0
> >>>[ 4846.673629]  kthread+0x108/0x140
> >>>[ 4846.673640]  ? process_one_work+0x420/0x420
> >>>[ 4846.673650]  ? kthread_create_on_node+0x70/0x70
> >>>[ 4846.673662]  ret_from_fork+0x2c/0x40
> >>>[ 4846.673673] Code: 05 51 79 a1 00 01 e8 84 3c ae ff 0f ff eb a5 80 3d 40 79
> >>>a1 00 00 75 a6 48 c7 c7 98 fe e9 81 c6 05 30 79 a1 00 01 e8 64 3c ae ff <0f>
> >>>ff eb 8f 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
> >>>[ 4846.673896] ---[ end trace 4f0b6934a8fd8068 ]---
> >>>
> >>>Regards,
> >>>FKr
> >>>
> >>_______________________________________________
> >>Intel-gfx mailing list
> >>Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >>https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >