all, On Sat, Jun 06, 2015 at 08:09:34PM -0700, Jeremiah Mahler wrote: > all, > > On all my machines with Intel graphics I get the following warning > in the logs when the machine is suspended. Apparently some part of > the graphics system is busy when it should be idle. This is present > on the latest linux-next 20150604. > > ... > [ 33.141747] Suspending console(s) (use no_console_suspend to debug) > [ 33.142146] wlan0: deauthenticating from 00:1a:70:5a:6e:0b by local > choice (Reason: 3=DEAUTH_LEAVING) > [ 33.147395] queueing ieee80211 work while going to suspend > [ 33.151597] cfg80211: Calling CRDA to update world regulatory domain > [ 33.190430] sd 0:0:0:0: [sda] Synchronizing SCSI cache > [ 33.190523] sd 0:0:0:0: [sda] Stopping disk > [ 33.275743] ------------[ cut here ]------------ > [ 33.275764] WARNING: CPU: 0 PID: 1617 at > drivers/gpu/drm/i915/i915_gem.c:4808 i915_gem_suspend+0xe4/0xf0 [i915]() > [ 33.275766] WARN_ON(dev_priv->mm.busy) > [ 33.275811] Modules linked in: binfmt_misc snd_hda_codec_hdmi > hid_generic isl29018(C) industrialio regmap_i2c cyapatp crc_itu_t usbhid > hid arc4 x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi > coretemp ath9k tpm_infineon kvm_intel kvm ath9k_common ath9k_hw > crct10dif_pclmul crc32_pclmul crc32c_intel chromeos_laptop ath mac80211 > ghash_clmulni_intel cryptd i915 cfg80211 pcspkr serio_raw sg ath3k btusb > btrtl lpc_ich snd_hda_codec_realtek shpchp i2c_i801 mfd_core > snd_hda_codec_generic btbcm btintel bluetooth snd_hda_intel battery > snd_hda_codec ac i2c_algo_bit drm_kms_helper tpm_tis snd_hwdep tpm > snd_hda_core drm snd_pcm video rfkill processor button snd_timer snd > soundcore i2c_designware_pci i2c_designware_core evdev uvcvideo > videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev > [ 33.275825] media i2c_core fuse autofs4 ext4 crc16 mbcache jbd2 > sd_mod fan xhci_pci sdhci_acpi sdhci xhci_hcd mmc_core thermal > thermal_sys usbcore ahci libahci usb_common libata scsi_mod > [ 33.275828] CPU: 0 PID: 1617 Comm: kworker/u4:4 Tainted: G C > 4.1.0-rc6-next-20150604+ #207 > [ 33.275829] Hardware name: Acer Peppy, BIOS 04/30/2014 > [ 33.275834] Workqueue: events_unbound async_run_entry_fn > [ 33.275838] 0000000000000000 ffffffffa05b7908 ffffffff8152ca4d > ffff880035effc58 > [ 33.275840] ffffffff8106bce1 ffff880073587f20 0000000000000000 > ffff880073580000 > [ 33.275842] ffff88003534f860 ffff880073580000 ffffffff8106bd5a > ffffffffa05c74c1 > [ 33.275843] Call Trace: > [ 33.275849] [<ffffffff8152ca4d>] ? dump_stack+0x40/0x50 > [ 33.275853] [<ffffffff8106bce1>] ? warn_slowpath_common+0x81/0xb0 > [ 33.275855] [<ffffffff8106bd5a>] ? warn_slowpath_fmt+0x4a/0x50 > [ 33.275865] [<ffffffffa0520264>] ? i915_gem_suspend+0xe4/0xf0 [i915] > [ 33.275872] [<ffffffffa04f48c1>] ? i915_drm_suspend+0x61/0x1b0 > [i915] > [ 33.275876] [<ffffffff81302381>] ? pci_pm_suspend+0x71/0x140 > [ 33.275878] [<ffffffff81302310>] ? pci_pm_freeze+0xd0/0xd0 > [ 33.275881] [<ffffffff813e1699>] ? dpm_run_callback+0x39/0xd0 > [ 33.275883] [<ffffffff813e20d4>] ? __device_suspend+0xe4/0x300 > [ 33.275884] [<ffffffff813e230e>] ? async_suspend+0x1e/0x90 > [ 33.275887] [<ffffffff8108a893>] ? async_run_entry_fn+0x43/0x150 > [ 33.275890] [<ffffffff810828f8>] ? process_one_work+0x148/0x3b0 > [ 33.275892] [<ffffffff81082e8a>] ? worker_thread+0x4a/0x440 > [ 33.275895] [<ffffffff81082e40>] ? rescuer_thread+0x2e0/0x2e0 > [ 33.275898] [<ffffffff81088121>] ? kthread+0xc1/0xe0 > [ 33.275901] [<ffffffff81088060>] ? > kthread_create_on_node+0x190/0x190 > [ 33.275904] [<ffffffff8153365f>] ? ret_from_fork+0x3f/0x70 > [ 33.275907] [<ffffffff81088060>] ? > kthread_create_on_node+0x190/0x190 > [ 33.275908] ---[ end trace e1c3eb5e163b3520 ]--- > [ 33.560558] PM: suspend of devices complete after 423.034 msecs > [ 33.577985] PM: late suspend of devices complete after 17.589 msecs > [ 33.579036] xhci_hcd 0000:00:14.0: System wakeup enabled by ACPI > [ 33.594059] PM: noirq suspend of devices complete after 16.226 msecs > [ 33.594498] ACPI: Preparing to enter system sleep state S3 > [ 33.595066] ACPI : EC: EC stopped > ... > > -- > - Jeremiah Mahler I bisected the kernel and found that the following patch introduced the bug. From b47161858ba13c9c7e03333132230d66e008dd55 Mon Sep 17 00:00:00 2001 From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Date: Mon, 27 Apr 2015 13:41:17 +0100 Subject: [PATCH] drm/i915: Implement inter-engine read-read optimisations MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently, we only track the last request globally across all engines. This prevents us from issuing concurrent read requests on e.g. the RCS and BCS engines (or more likely the render and media engines). Without semaphores, we incur costly stalls as we synchronise between rings - greatly impacting the current performance of Broadwell versus Haswell in certain workloads (like video decode). With the introduction of reference counted requests, it is much easier to track the last request per ring, as well as the last global write request so that we can optimise inter-engine read read requests (as well as better optimise certain CPU waits). v2: Fix inverted readonly condition for nonblocking waits. v3: Handle non-continguous engine array after waits v4: Rebase, tidy, rewrite ring list debugging v5: Use obj->active as a bitfield, it looks cool v6: Micro-optimise, mostly involving moving code around v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf) v8: Rebase v9: Refactor i915_gem_object_sync() to allow the compiler to better optimise it. Benchmark: igt/gem_read_read_speed hsw:gt3e (with semaphores): Before: Time to read-read 1024k: 275.794µs After: Time to read-read 1024k: 123.260µs hsw:gt3e (w/o semaphores): Before: Time to read-read 1024k: 230.433µs After: Time to read-read 1024k: 124.593µs bdw-u (w/o semaphores): Before After Time to read-read 1x1: 26.274µs 10.350µs Time to read-read 128x128: 40.097µs 21.366µs Time to read-read 256x256: 77.087µs 42.608µs Time to read-read 512x512: 281.999µs 181.155µs Time to read-read 1024x1024: 1196.141µs 1118.223µs Time to read-read 2048x2048: 5639.072µs 5225.837µs Time to read-read 4096x4096: 22401.662µs 21137.067µs Time to read-read 8192x8192: 89617.735µs 85637.681µs Testcase: igt/gem_concurrent_blit (read-read and friends) Cc: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxxxxxxxx> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> [v8] [danvet: s/\<rq\>/req/g] Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxx> --- drivers/gpu/drm/i915/i915_debugfs.c | 16 +- drivers/gpu/drm/i915/i915_drv.h | 19 +- drivers/gpu/drm/i915/i915_gem.c | 540 ++++++++++++++++++++------------ drivers/gpu/drm/i915/i915_gem_context.c | 2 - drivers/gpu/drm/i915/i915_gem_debug.c | 92 ++---- drivers/gpu/drm/i915/i915_gpu_error.c | 19 +- drivers/gpu/drm/i915/intel_display.c | 6 +- drivers/gpu/drm/i915/intel_lrc.c | 19 +- drivers/gpu/drm/i915/intel_overlay.c | 2 - drivers/gpu/drm/i915/intel_ringbuffer.c | 26 +- 10 files changed, 416 insertions(+), 325 deletions(-) ... -- - Jeremiah Mahler _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel