On Thu, Apr 19, 2012 at 06:38:13PM +0200, Takashi Iwai wrote:
> At Thu, 19 Apr 2012 18:29:47 +0200,
> Daniel Vetter wrote:
> >
> > On Thu, Apr 19, 2012 at 06:10:18PM +0200, Takashi Iwai wrote:
> > > The hotplug work can be still kicked off via irq during PM, and this
> > > may conflict with the resume procedure. For example, eDP on SNB
> > > machine shows WARNING like below during the resume:
> > >
> > > WARNING: at /usr/src/packages/BUILD/kernel-default-3.0.13/linux-3.0/drivers/gpu/drm/i915/intel_dp.c:332 intel_dp_check_edp+0x73/0xd0 [i915]()
> > > Hardware name: HP Z1 Workstation
> > > eDP powered off while attempting aux channel communication.
> > > Supported: Yes
> > > Pid: 3210, comm: kworker/u:49 Tainted: G C N 3.0.13-0.27-default #1
> > > Call Trace:
> > > [<ffffffff810048b5>] dump_trace+0x75/0x300
> > > [<ffffffff8143ea0f>] dump_stack+0x69/0x6f
> > > [<ffffffff81059e2b>] warn_slowpath_common+0x7b/0xc0
> > > [<ffffffff81059f25>] warn_slowpath_fmt+0x45/0x50
> > > [<ffffffffa01fa9f3>] intel_dp_check_edp+0x73/0xd0 [i915]
> > > [<ffffffffa01fae4b>] intel_dp_aux_native_write+0x1b/0xe0 [i915]
> > > [<ffffffffa01fb033>] intel_dp_set_link_train+0x73/0xa0 [i915]
> > > [<ffffffffa01fb58e>] intel_dp_start_link_train+0x16e/0x400 [i915]
> > > [<ffffffffa01fbc6c>] intel_dp_complete_link_train+0x1fc/0x3d0 [i915]
> > > [<ffffffffa01fcf4c>] intel_dp_check_link_status+0x12c/0x1d0 [i915]
> > > [<ffffffffa01cf22e>] i915_hotplug_work_func+0x6e/0xa0 [i915]
> > > [<ffffffff810747bc>] process_one_work+0x16c/0x350
> > > [<ffffffff8107734a>] worker_thread+0x17a/0x410
> > > [<ffffffff8107b676>] kthread+0x96/0xa0
> > > [<ffffffff8144a7c4>] kernel_thread_helper+0x4/0x10
> > > DWARF2 unwinder stuck at kernel_thread_helper+0x4/0x10
> > >
> > > This patch adds a flag to disable the hotplug during PM operation for
> > > avoiding such a race.
> > >
> > > Cc: <stable at kernel.org>
> > > Signed-off-by: Takashi Iwai <tiwai at suse.de>
> >
> > I haven't looked too closely, but isn't cancelling the hotplug work after
> > we disable the irqs in the suspend path good enough? This here feels a bit
> > like duct-taping over the problem.
>
> This doesn't look like a leftover work. Judging from the log I got,
> the hotplug event is kicked really from the irq handler in the resume
> phase.

In that case I guess we have a setup ordering issue on the resume side.
If we enable irqs before everything is set up again, we will fail because
the resume path doesn't grab any locks ...

> [ 53.424757] ehci_hcd 0000:00:1d.0: cache line size of 64 is not supported
> [ 53.452721] [drm:intel_enable_rc6], Sandybridge: RC6 disabled
> [ 53.477048] firewire_core: skipped bus generations, destroying all nodes
> [ 53.504839] [drm:intel_opregion_setup], graphic opregion physical addr: 0x73d1c018
> [ 53.504929] [drm:intel_opregion_setup], Public ACPI methods supported
> [ 53.504930] [drm:intel_opregion_setup], SWSCI supported
> [ 53.504931] [drm:intel_opregion_setup], ASLE supported
> [ 53.504962] [drm:init_status_page], render ring hws offset: 0x00000000
> [ 53.505047] [drm:init_status_page], gen6 bsd ring hws offset: 0x00022000
> [ 53.505118] [drm:init_status_page], blt ring hws offset: 0x00043000
> [ 53.505185] [drm:ironlake_init_pch_refclk], has_panel 1 has_lvds 0 has_pch_edp 1 has_cpu_edp 0 has_ck505 0
> [ 53.505188] [drm:ironlake_init_pch_refclk], Using SSC on panel
> [ 53.505611] [drm:intel_dp_mode_fixup], Display port link bw 0a lane count 4 clock 270000
> [ 53.505613] [drm:drm_crtc_helper_set_mode], [CRTC:3]
> [ 53.505615] [drm:ironlake_edp_backlight_off],
> [ 53.512832] [drm:ironlake_edp_panel_off], Turn eDP power off
> [ 53.512836] [drm:ironlake_wait_panel_off], Wait for panel power off time
> [ 53.512839] [drm:ironlake_wait_panel_status], mask b000000f value 00000000 status c0000008 control abcd0000
> [ 53.652153] [drm:pch_irq_handler], PCH HDCP audio interrupt
> [ 53.652159] [drm:i915_hotplug_work_func], running encoder hotplug functions
> [ 53.653780] [drm:intel_dp_check_link_status], TMDS-6: channel EQ not ok, retraining
> [ 53.654188] [drm:intel_dp_start_link_train], training pattern 1 signal levels 00000000
> ....
> [ 53.680513] [drm:intel_dp_start_link_train], too many full retries, give up
> ...
> [ 53.714791] [drm:intel_dp_start_link_train], training pattern 1 signal levels 06000000
> [ 53.715198] ------------[ cut here ]------------
> [ 53.715230] WARNING: at /usr/src/packages/BUILD/kernel-default-3.0.13/linux-3.0/drivers/gpu/drm/i915/intel_dp.c:332 intel_dp_check_edp+0x73/0xd0 [i915]()
> ...
>
> > I'm asking because we seem to have other problems with work queue items
> > that leak across s/r and cause havoc on resume. So extracting this
> > quiescing code from module unload and also running it at suspend time
> > sounds more like the right thing.
>
> Yes, this should be needed anyway, I think. But the new hotplug event
> can be still generated, as it seems, and this would conflict with the
> initialization being done in the resume.

Yeah, we have a few holes to plug in s/r ...
-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48
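
For anyone reading this thread in the archive, the race discussed above
can be modelled in plain user-space C. This is only a sketch under
assumptions, not the patch from this thread and not i915 code: struct
fake_dev, the hotplug_suspended flag and all fake_* functions are
hypothetical names, the "work" is run directly instead of via a
workqueue, and replaying a deferred event at the end of resume is an
assumption of the sketch rather than something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; not the real i915 structures. */
struct fake_dev {
    bool irq_enabled;       /* interrupts reach the handler */
    bool hotplug_suspended; /* hypothetical flag: PM transition in progress */
    bool hotplug_pending;   /* an event arrived while the flag was set */
    bool panel_powered;     /* eDP panel power state */
    int work_runs;          /* how often the hotplug work ran */
    int warnings;           /* how often it ran with the panel off */
};

/* Models the hotplug work: touching the panel while it is off is the bug. */
static void fake_hotplug_work(struct fake_dev *dev)
{
    dev->work_runs++;
    if (!dev->panel_powered)
        dev->warnings++;
}

/* Models the irq handler: defer the event while the flag is set. */
static void fake_irq_handler(struct fake_dev *dev)
{
    if (!dev->irq_enabled)
        return;
    if (dev->hotplug_suspended) {
        dev->hotplug_pending = true;
        return;
    }
    fake_hotplug_work(dev);
}

static void fake_suspend(struct fake_dev *dev, bool use_flag)
{
    dev->hotplug_suspended = use_flag;
    dev->irq_enabled = false;
    dev->panel_powered = false;
}

/* Resume with irqs enabled before the panel is set up again. */
static void fake_resume(struct fake_dev *dev)
{
    dev->irq_enabled = true;
    fake_irq_handler(dev);          /* hotplug irq fires inside the window */
    dev->panel_powered = true;      /* setup finishes */
    dev->hotplug_suspended = false;
    if (dev->hotplug_pending) {     /* assumption: replay the deferred event */
        dev->hotplug_pending = false;
        fake_hotplug_work(dev);
    }
}

/* Returns how many times the work ran while the panel was powered off. */
int simulate_resume_race(bool use_flag)
{
    struct fake_dev dev = { .irq_enabled = true, .panel_powered = true };

    fake_suspend(&dev, use_flag);
    fake_resume(&dev);
    assert(dev.work_runs == 1);     /* the event is handled exactly once */
    return dev.warnings;
}
```

Calling simulate_resume_race(false) models the current ordering, where the
work runs against a powered-off panel; simulate_resume_race(true) models
gating the handler with a flag. The model also shows why the ordering
concern raised above still matters: the flag only hides the window, it
does not remove it.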