On 4/22/2015 12:36 AM, Linus Torvalds wrote:
So I just go the appended NULL pointer de-reference when trying to
look at a video from my GoPro.
The code disassembles to
0: 81 fb 00 04 00 00 cmp $0x400,%ebx
6: 41 89 07 mov %eax,(%r15)
9: 74 78 je 0x83
b: 48 8d 7c 24 18 lea 0x18(%rsp),%rdi
10: e8 6e b3 1b c1 callq 0xffffffffc11bb383
15: 84 c0 test %al,%al
17: 74 4a je 0x63
19: 48 85 ed test %rbp,%rbp
1c: 75 b5 jne 0xffffffffffffffd3
1e: 48 8b 04 24 mov (%rsp),%rax
22: 49 8b 84 c4 98 01 00 mov 0x198(%r12,%rax,8),%rax
29: 00
2a:* 48 8b 28 mov (%rax),%rbp <-- trapping instruction
2d: 65 ff 05 1f e8 ef 3f incl %gs:0x3fefe81f(%rip) # 0x3fefe853
34: 48 b8 00 00 00 00 00 movabs $0x160000000000,%rax
3b: 16 00 00
which matches up with the asm code
cmpl $1024, %ebx #, act_pte
movl %eax, (%r15) # D.49217, *_26
je .L118 #,
.L110:
leaq 24(%rsp), %rdi #, tmp156
call __sg_page_iter_next #
testb %al, %al # D.49219
je .L119 #,
testq %rbp, %rbp # pt_vaddr
jne .L109 #,
movq (%rsp), %rax # %sfp, act_pt
movq 408(%r12,%rax,8), %rax # MEM[(struct i915_hw_ppgtt
*)vm_8(D)].D.36998.pd.page
movq (%rax), %rbp # _21->page, D.49221
#APP
# 72 "./arch/x86/include/asm/preempt.h" 1
incl %gs:__preempt_count(%rip) # __preempt_count
# 0 "" 2
#NO_APP
movabsq $24189255811072, %rax #, tmp150
which in turn seems to come from the C code
pt_vaddr =
kmap_atomic(ppgtt->pd.page_table[act_pt]->page);
(that "testq %rbp,%rbp; jne" just before the oopsing instruction group
is that "if (pt_vaddr == NULL)" test.
IOW, it looks like
ppgtt->pd.page_table[act_pt]
is NULL, and then trying to dereference ->page off of it is what
oopses (the preempt-count increment that comes after is the
"pagefault_disable()" in kmap_atomic, and the big constant we're
loading into %rax is part of "page_address(page)").
I have no idea why "ppgtt->pd.page_table[act_pt]" would be NULL, but
clearly it can be. Can somebody who knows this code look into it. I've
added a few people who have worked in this area recently, in addition
to the usual maintainer list..
Thanks,
Linus
---
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffc010c137>] gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: rfcomm fuse cmac ip6t_rpfilter ip6t_REJECT
nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4
nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_nat ebtable_broute
bridge stp llc ebtable_filter ebtables ip6table_mangle
ip6table_security ip6table_raw ip6table_filter ip6_tables
iptable_mangle iptable_security iptable_raw bnep arc4 vfat fat
x86_pkg_temp_thermal pn544_mei mei_phy coretemp pn544 hci nfc
kvm_intel iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek
snd_hda_codec_hdmi kvm snd_hda_codec_generic uvcvideo
videobuf2_vmalloc videobuf2_memops microcode videobuf2_core
snd_hda_intel v4l2_common hid_multitouch snd_hda_controller videodev
btusb snd_hda_codec iwlmvm media snd_hwdep mac80211 btbcm snd_seq
btintel bluetooth snd_seq_device joydev snd_pcm serio_raw
i2c_i801 iwlwifi cfg80211 snd_hda_core sony_laptop snd_timer snd
rfkill mei_me soundcore lpc_ich shpchp mei mfd_core dm_crypt
crct10dif_pclmul i915 crc32_pclmul crc32c_intel i2c_algo_bit
drm_kms_helper ghash_clmulni_intel drm i2c_core video
CPU: 1 PID: 2697 Comm: chrome Not tainted 4.0.0-09362-g1fc149933fd4 #8
Hardware name: Sony Corporation SVP11213CXB/VAIO, BIOS R0270V7 05/17/2013
task: ffff88010dc51b30 ti: ffff88003f328000 task.ti: ffff88003f328000
RIP: 0010:[<ffffffffc010c137>] [<ffffffffc010c137>]
gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
RSP: 0018:ffff88003f32b9a8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000075b1b
RDX: ffff88007d848990 RSI: 0000000000000001 RDI: ffff88003f32b9c0
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88003f6f7e58
R10: 000000000d836000 R11: 0000000000000000 R12: ffff8800d4164000
R13: 0000000000000000 R14: 0000000000000001 R15: ffff88003f7bbffc
FS: 00007f7f0ee94a00(0000) GS:ffff88011fa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000005f607000 CR4: 00000000001407e0
Stack:
0000000000000201 000002010dc51b30 0000000000000000 ffff88007d848990
0000004000075b1b ffff880100000001 0000000000000fe0 ffff880011215900
0000000000000000 ffff88006cc4c380 ffff88003f6f0000 0000000000000001
Call Trace:
ggtt_bind_vma+0x97/0x110 [i915]
i915_vma_bind+0x40/0x410 [i915]
swiotlb_map_sg_attrs+0x74/0x140
i915_gem_object_do_pin+0x864/0x9f0 [i915]
mutex_lock+0x9/0x30
i915_gem_execbuffer_reserve_vma.isra.20+0x66/0x130 [i915]
i915_gem_execbuffer_reserve+0x2ec/0x320 [i915]
i915_gem_do_execbuffer.isra.27+0x5ee/0xf80 [i915]
mutex_optimistic_spin+0x16e/0x1f0
__mutex_lock_interruptible_slowpath+0x21/0x130
shmem_fault+0x57/0x1c0
drm_gem_object_lookup+0x14/0xa0 [drm]
i915_gem_execbuffer2+0xb2/0x2a0 [i915]
drm_ioctl+0x15a/0x580 [drm]
current_fs_time+0x9/0x50
do_vfs_ioctl+0x2e8/0x4f0
file_has_perm+0x77/0x80
syscall_trace_enter_phase1+0x116/0x140
SyS_ioctl+0x79/0x90
system_call_fastpath+0x12/0x6a
Code: 00 81 fb 00 04 00 00 41 89 07 74 78 48 8d 7c 24 18 e8 6e b3 1b
c1 84 c0 74 4a 48 85 ed 75 b5 48 8b 04 24 49 8b 84 c4 98 01 00 00 <48>
8b 28 65 ff 05 1f e8 ef 3f 48 b8 00 00 00 00 00 16 00 00 48
RIP gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
RSP <ffff88003f32b9a8>
CR2: 0000000000000000
Hi,
I see a possible va re-allocation that could be the culprit, but the
change was commited just 2 days ago
(http://cgit.freedesktop.org/drm-intel/commit/?id=5c5f645773b6d147bf68c350674dc3ef4f8de83d).
-Michel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx