Guys,
 I've gotten absolutely no response to this, and the problem seems to still occur.

I just got a slightly different hang at shutdown, due to a kernel oops that seems related. It's not identical - the call trace is very different - but it's close.

In particular, it's once again the same NULL pointer dereference in "intel_unpin_fb_obj()", except this time it looked like this:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
  IP: intel_unpin_fb_obj+0x69/0xe0 [i915]
  Oops: 0000 [#1] SMP
  Modules linked in: fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6ta$
   tpm_tis industrialio tpm_tis_core acpi_pad tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt hid_logitech_hidpp hid_logitech_dj i915 crct10dif_pclmul i2c_algo_bit crc32_pc$
  CPU: 4 PID: 26173 Comm: kworker/u16:9 Tainted: G W 4.10.0-rc5-00111-g49e555a932de #1
  Hardware name: System manufacturer System Product Name/Z170-K, BIOS 1803 05/06/2016
  Workqueue: i915 intel_unpin_work_fn [i915]
  RIP: 0010:intel_unpin_fb_obj+0x69/0xe0 [i915]
  RSP: 0000:ffffb95c4937bdc0 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff96f284441340 RCX: 0000000000000000
  RDX: ffffb95c4937bdc0 RSI: ffff96f29f273908 RDI: ffff96f284441340
  RBP: ffffb95c4937be08 R08: 0000000000000000 R09: 0000000000000000
  R10: 00000000fa83b2da R11: 0000000000808111 R12: ffff96f20d878500
  R13: 0000000000000001 R14: ffff96f29f58c400 R15: ffff96f29f270068
  FS: 0000000000000000(0000) GS:ffff96f2b6d00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000078 CR3: 000000041ff4b000 CR4: 00000000003406e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   intel_unpin_work_fn+0x58/0x140 [i915]
   process_one_work+0x1f1/0x480
   worker_thread+0x48/0x4d0
   kthread+0x101/0x140
   ret_from_fork+0x29/0x40
  Code: ff ff ff 74 67 48 8d 7d b8 44 89 ea 4c 89 e6 e8 ce 2c ff ff 48 8b 43 08 48 8d 55 b8 48 89 df 48 8d b0 08 39 00 00 e8 47 1b fc ff <48> 8b 50 78 48 85 d2 74 04 83 6a 20 01 48 $
  RIP: intel_unpin_fb_obj+0x69/0xe0 [i915] RSP: ffffb95c4937bdc0
  CR2: 0000000000000078
  ---[ end trace afab57e9d299b42b ]---

so this time it was the worker thread that died and took the system down with it.

Anyway, there is something *seriously* wrong with the i915 shutdown sequence.

Now, maybe this was fixed with the recent drm pull that did have some i915 fixes in it, and I wasn't running on my desktop yet, but nothing there looks very obvious.

And once again, I'd like to note that other users of i915_gem_object_to_ggtt() do seem to check for a NULL vma, while intel_unpin_fb_obj() simply passes any potential NULL vma to i915_vma_unpin_fence().

Guys?

                Linus

On Sun, Jan 8, 2017 at 3:35 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> This has so far only happened once, so I don't know how repeatable it
> is, but here goes..
>
> My nice stable XPS13 just oopsed on shutdown. It is possibly related
> to the X server SIGSEGV'ing too, although honestly, I am not sure
> which caused which. Maybe the kernel oops caused the X problem. They
> definitely happened together, and happened as I was shutting down the
> machine.
>
> I'm including the syslog for the Xorg issue too, in case it ends up
> giving people ideas, but the kernel oops is what I actually looked at.
> The code decodes to
>
>    74 67                  je     0x69
>    48 8d 7d b8            lea    -0x48(%rbp),%rdi
>    44 89 ea               mov    %r13d,%edx
>    4c 89 e6               mov    %r12,%rsi
>    e8 3e 2d ff ff         callq  ..
>    48 8b 43 08            mov    0x8(%rbx),%rax
>    48 8d 55 b8            lea    -0x48(%rbp),%rdx
>    48 89 df               mov    %rbx,%rdi
>    48 8d b0 08 39 00 00   lea    0x3908(%rax),%rsi
>    e8 47 1a fc ff         callq  ..
>  * 48 8b 50 78            mov    0x78(%rax),%rdx   <-- trapping instruction
>    48 85 d2               test   %rdx,%rdx
>    74 04                  je     0x35
>    83 6a 20 01            subl   $0x1,0x20(%rdx)
>    48 89 c7               mov    %rax,%rdi
>    e8 c2 60 fc ff         callq  ..
>
> and just comparing it to the generated code it seems to be this:
>
>         call    i915_gem_obj_to_vma     #
>         movq    120(%rax), %rdx # MEM[(struct drm_i915_fence_reg * *)_24 + 120B], _15
>
> where %rax (the return value from i915_gem_obj_to_vma()) is NULL.
>
> So it seems to be this code:
>
>         ...
>         vma = i915_gem_object_to_ggtt(obj, &view);
>
>         i915_vma_unpin_fence(vma);
>         i915_gem_object_unpin_from_display_plane(vma);
>         ...
>
> where vma is NULL.
>
> The other user of i915_gem_object_to_ggtt() does have a test of !vma,
> although with a warning. Which implies it does happen, but shouldn't.
> Maybe consistent with the Xorg confusion?
>
>                 Linus
>
> ---
>
> gdm-x-session: (II) UnloadModule: "libinput"
> gdm-x-session: (II) systemd-logind: releasing fd for 13:72
> gdm-x-session: (II) UnloadModule: "libinput"
> gdm-x-session: (II) systemd-logind: releasing fd for 13:78
> gdm-x-session: (II) UnloadModule: "libinput"
> gdm-x-session: (II) systemd-logind: releasing fd for 13:66
> gdm-x-session: (II) UnloadModule: "libinput"
> gdm-x-session: (II) systemd-logind: releasing fd for 13:65
> gdm-x-session: (II) UnloadModule: "libinput"
> gdm-x-session: (II) systemd-logind: releasing fd for 13:69
> gdm-x-session: (II) UnloadModule: "libinput"
> gdm-x-session: (II) systemd-logind: releasing fd for 13:67
> gdm-x-session: (EE)
> gdm-x-session: (EE) Backtrace:
> gdm-x-session: (EE) 0: /usr/libexec/Xorg (OsLookupColor+0x139) [0x59f859]
> gdm-x-session: (EE) 1: /lib64/libc.so.6 (__restore_rt+0x0) [0x7fe554e5a7df]
> gdm-x-session: (EE) 2: /usr/lib64/xorg/modules/libfb.so (_fbGetWindowPixmap+0xd) [0x7fe54d16b6fd]
> gdm-x-session: (EE) 3: /usr/libexec/Xorg (present_extension_init+0x5b7) [0x51b9b7]
> gdm-x-session: (EE) 4: /usr/libexec/Xorg (present_extension_init+0x685) [0x51bb95]
> gdm-x-session: (EE) 5: /usr/libexec/Xorg (present_extension_init+0xdf2) [0x51ca62]
> gdm-x-session: (EE) 6: /usr/libexec/Xorg (AddTraps+0x9133) [0x523973]
> gdm-x-session: (EE) 7: /usr/libexec/Xorg (CompositeRegisterImplicitRedirectionException+0x4098) [0x4ccf58]
> gdm-x-session: (EE) 8: /usr/libexec/Xorg (AddTraps+0x73f4) [0x51fe84]
> gdm-x-session: (EE) 9: /usr/libexec/Xorg (remove_fs_handlers+0x581) [0x43af61]
> gdm-x-session: (EE) 10: /lib64/libc.so.6 (__libc_start_main+0xf1) [0x7fe554e46731]
> gdm-x-session: (EE) 11: /usr/libexec/Xorg (_start+0x29) [0x424d59]
> gdm-x-session: (EE) 12: ? (?+0x29) [0x29]
> gdm-x-session: (EE)
> gdm-x-session: (EE) Segmentation fault at address 0x10
> gdm-x-session: (EE)
> gdm-x-session: Fatal server error:
> gdm-x-session: (EE) Caught signal 11 (Segmentation fault). Server aborting
> gdm-x-session: (EE)
> gdm-x-session: (EE)
> gdm-x-session: Please consult the Fedora Project support
> gdm-x-session: at http://wiki.x.org
> gdm-x-session: for help.
> gdm-x-session: (EE) Please also check the log file at "/home/torvalds/.local/share/xorg/Xorg.0.log" for additional information.
> gdm-x-session: (EE)
> gdm-x-session: (WW) xf86CloseConsole: KDSETMODE failed: Input/output error
> gdm-x-session: (WW) xf86CloseConsole: VT_GETMODE failed: Input/output error
> gdm-x-session: (WW) xf86CloseConsole: VT_ACTIVATE failed: Input/output error
>
> kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
> IP: intel_unpin_fb_obj+0x69/0xe0 [i915]
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: rfcomm fuse ccm ip6t_rpfilter ip6t_REJECT
> nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat
> ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6
> nf_defrag_ipv6 nf_nat_ipv6 ip6table_security ip6table_mangle
> ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
> nf_nat nf_conntrack iptable_security iptable_mangle iptable_raw
> ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep vfat fat
> arc4 snd_hda_codec_hdmi dell_led snd_soc_skl intel_rapl iTCO_wdt
> snd_soc_skl_ipc x86_pkg_temp_thermal intel_powerclamp snd_soc_sst_ipc
> snd_hda_codec_realtek coretemp
> snd_hda_codec_generic snd_soc_sst_dsp
> snd_hda_ext_core snd_soc_sst_match snd_soc_core
> i2c_designware_platform i2c_designware_core kvm_intel iwlmvm dell_wmi
> snd_hda_intel kvm snd_hda_codec
> snd_hwdep mac80211 snd_hda_core snd_seq irqbypass snd_seq_device
> intel_cstate dell_laptop intel_rapl_perf dell_smbios snd_pcm dcdbas
> iwlwifi rtsx_pci_ms snd_timer memstick snd cfg80211 soundcore i2c_i801
> joydev shpchp btusb btrtl mei_me idma64 processor_thermal_device mei
> intel_lpss_pci intel_soc_dts_iosf intel_pch_thermal wmi hci_uart btbcm
> btqca btintel bluetooth acpi_als pinctrl_sunrisepoint kfifo_buf
> intel_lpss_acpi pinctrl_intel rfkill int3403_thermal industrialio
> intel_lpss int340x_thermal_zone acpi_pad intel_hid tpm_tis
> int3400_thermal tpm_tis_core acpi_thermal_rel sparse_keymap tpm nfsd
> auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt hid_multitouch
> rtsx_pci_sdmmc mmc_core crct10dif_pclmul i915 crc32_pclmul
> crc32c_intel ghash_clmulni_intel i2c_algo_bit serio_raw drm_kms_helper
> syscopyarea nvme sysfillrect nvme_core rtsx_pci sysimgblt
> fb_sys_fops drm i2c_hid video fjes
> CPU: 0 PID: 5083 Comm: systemd-logind Not tainted 4.10.0-rc2-00103-g4cf184638bcf #38
> Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.12 11/30/2016
> task: ffff8d8fe8af8000 task.stack: ffffb5e4c2388000
> RIP: 0010:intel_unpin_fb_obj+0x69/0xe0 [i915]
> RSP: 0018:ffffb5e4c238b7e0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffff8d8fab64e100 RCX: ffff8d8fab64e101
> RDX: ffffb5e4c238b7e0 RSI: ffff8d8fe77eb908 RDI: ffff8d8fab64e100
> RBP: ffffb5e4c238b828 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000007 R11: 00000000000000bf R12: ffff8d8fc64d5900
> R13: 0000000000000001 R14: ffff8d8fe7f6b540 R15: ffff8d8f9c6d6c00
> FS: 00007f7f18786900(0000) GS:ffff8d8ffec00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000078 CR3: 000000046a72f000 CR4: 00000000003406f0
> Call Trace:
>  intel_cleanup_plane_fb+0x5b/0xa0 [i915]
>  drm_atomic_helper_cleanup_planes+0x6f/0x90 [drm_kms_helper]
>  intel_atomic_commit_tail+0x749/0xfe0 [i915]
>  intel_atomic_commit+0x3cb/0x4f0 [i915]
>  drm_atomic_commit+0x4b/0x50 [drm]
>  restore_fbdev_mode+0x14c/0x2a0 [drm_kms_helper]
>  drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper]
>  drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper]
>  intel_fbdev_set_par+0x18/0x70 [i915]
>  fb_set_var+0x236/0x460
>  fbcon_blank+0x30f/0x350
>  do_unblank_screen+0xd2/0x1a0
>  vt_ioctl+0x507/0x12a0
>  tty_ioctl+0x355/0xc30
>  do_vfs_ioctl+0xa3/0x5e0
>  SyS_ioctl+0x79/0x90
>  entry_SYSCALL_64_fastpath+0x13/0x94
> RIP: 0033:0x7f7f17850ce7
> RSP: 002b:00007ffe696d9bf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 000000000000001a RCX: 00007f7f17850ce7
> RDX: 0000000000000000 RSI: 0000000000004b3a RDI: 0000000000000015
> RBP: 00007f7f187866c8 R08: 00000016170f1200 R09: 0000000000000009
> R10: 0000000000000075 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000001 R14: 000055f66b267790 R15: 000055f66b25e190
> Code: ff ff ff 74 67 48 8d 7d b8 44 89 ea 4c 89 e6 e8 3e 2d ff ff
> 48 8b 43 08 48 8d 55 b8 48 89 df 48 8d b0 08 39 00 00 e8 47 1a fc ff
> <48> 8b 50 78 48 85 d2 74 04 83 6a 20 01 48 89 c7 e8 c2 60 fc ff
> RIP: intel_unpin_fb_obj+0x69/0xe0 [i915] RSP: ffffb5e4c238b7e0
> CR2: 0000000000000078
> ---[ end trace daf415d61b7a5042 ]---
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
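
As a footnote to the thread, here is a minimal user-space C sketch (not i915 code) of why a NULL vma shows up as a fault at address 0x78 in both oopses, and what a guarded unpin might look like. The `fake_vma` and `fake_fence_reg` types, their padding, and both function names are hypothetical stand-ins, chosen only so that the field offsets match the quoted disassembly (0x78 for the fence pointer, 0x20 for the counter that gets decremented). The real structures, and whatever the right fix turns out to be, may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, NOT the real drm_i915_fence_reg. The padding puts
 * the counter at 0x20, matching "subl $0x1,0x20(%rdx)" in the disassembly. */
struct fake_fence_reg {
    char pad[0x20];
    int pin_count;
};

/* Hypothetical stand-in, NOT the real i915_vma. The padding puts the fence
 * pointer at 0x78 (120), matching "mov 0x78(%rax),%rdx". */
struct fake_vma {
    char pad[0x78];
    struct fake_fence_reg *fence;
};

_Static_assert(offsetof(struct fake_vma, fence) == 0x78,
               "fence pointer must sit at the offset seen in the oops");

/* Address touched by a load of vma->fence for a given base pointer.
 * With a NULL base (0) this is simply the field offset. */
static uintptr_t fence_load_address(uintptr_t vma_base)
{
    return vma_base + offsetof(struct fake_vma, fence);
}

/* Guarded variant of the unpin step: refuse to touch a NULL vma.
 * Returns 0 if the vma was handled, -1 if there was no vma at all. */
static int unpin_fence_checked(struct fake_vma *vma)
{
    if (!vma)
        return -1;              /* the case that oopsed: nothing to unpin */
    if (vma->fence)
        vma->fence->pin_count--;
    return 0;
}
```

With a NULL base, `fence_load_address(0)` is just the field offset, which lines up with RAX being zero and CR2 being 0000000000000078 in both traces. Whether returning early is the right behaviour for the real driver, or whether the NULL vma points at a deeper ordering problem at shutdown, is exactly the open question in the thread.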