On Tue, Apr 11, 2017 at 02:53:27PM +0800, Christian König wrote:
> On 11.04.2017 at 04:58, Huang Rui wrote:
> > This patch fixes the case when buffer funcs is empty and bo evict is
> > executing. It must double check buffer funcs, otherwise, a NULL
> > pointer dereference kernel panic will be encountered.
> >
> > BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4
> > IP: [<ffffffffa067b6cd>] amdgpu_evict_flags+0x3d/0xf0 [amdgpu]
> > PGD 0
> >
> > Oops: 0000 [#1] SMP
> > Modules linked in: amdgpu(OE) ttm drm_kms_helper drm i2c_algo_bit
> > fb_sys_fops syscopyarea sysfillrect sysimgblt fmem(OE) physmem_drv(OE)
> > rpcsec_gss_krb5 nfsv4 nfs fscache intel_rapl x86_pkg_temp_thermal
> > intel_powerclamp snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic
> > kvm_intel snd_hda_intel snd_hda_codec kvm snd_hda_core joydev eeepc_wmi
> > asus_wmi sparse_keymap snd_hwdep snd_pcm irqbypass crct10dif_pclmul
> > snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq crc32_pclmul snd_seq_device
> > ghash_clmulni_intel aesni_intel aes_x86_64 snd_timer lrw gf128mul mei_me snd
> > glue_helper ablk_helper cryptd tpm_infineon mei lpc_ich serio_raw soundcore
> > shpchp mac_hid nfsd auth_rpcgss nfs_acl lockd grace coretemp sunrpc parport_pc
> > ppdev lp parport autofs4 hid_generic mxm_wmi r8169 usbhid ahci
> > psmouse libahci nvme mii hid nvme_core wmi video
> > CPU: 3 PID: 1627 Comm: kworker/u8:17 Tainted: G OE 4.9.0-custom #1
> > Hardware name: ASUS All Series/Z87-A, BIOS 1802 01/28/2014
> > Workqueue: events_unbound async_run_entry_fn
> > task: ffff88021e7057c0 task.stack: ffffc9000262c000
> > RIP: 0010:[<ffffffffa067b6cd>] [<ffffffffa067b6cd>] amdgpu_evict_flags+0x3d/0xf0 [amdgpu]
> > RSP: 0018:ffffc9000262fb30 EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: ffff88021e8a5858 RCX: 0000000000000000
> > RDX: 0000000000000001 RSI: ffffc9000262fb58 RDI: ffff88021e8a5800
> > RBP: ffffc9000262fb48 R08: 0000000000000000 R09: ffff88021e8a5814
> > R10: 000000001def8f01 R11: ffff88021def8c80 R12: ffffc9000262fb58
> > R13: ffff88021d2b1990 R14: 0000000000000000 R15: ffff88021e8a5858
> > FS: 0000000000000000(0000) GS:ffff88022ed80000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000000000001a4 CR3: 0000000001c07000 CR4: 00000000001406e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>
> Can we have the full stack trace please?
>
> Eviction should never occur before the buffer funcs are initialized, so
> that patch just papers over some kind of race condition on startup as
> far as I can see.
>

If we set ip_block_mask=0xff, the SDMA IP block is not enabled, so
funcs_ring is NULL at that time. It is a corner case, but we still should
not hang with a kernel panic. I hit it while debugging S3.

Thanks,
Rui
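
For readers following the thread, the kind of guard under discussion can be
sketched roughly as below. This is only an illustration of the NULL-check
pattern, not the actual patch or the real amdgpu code: the struct, enum, and
function names (demo_ring, demo_device, demo_evict_domain) are hypothetical,
and the fallback to a CPU domain is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real amdgpu structures. */
struct demo_ring {
    bool ready;                           /* set once the engine is started */
};

struct demo_device {
    struct demo_ring *buffer_funcs_ring;  /* NULL if the copy engine is disabled */
};

enum demo_domain {
    DEMO_DOMAIN_CPU,                      /* assumed fallback: system memory */
    DEMO_DOMAIN_GTT,                      /* move using the copy engine */
};

/*
 * Pick an eviction target. The pointer is tested before ->ready is read,
 * so a device without a copy engine takes the fallback path instead of
 * dereferencing NULL.
 */
static enum demo_domain demo_evict_domain(const struct demo_device *dev)
{
    if (dev->buffer_funcs_ring == NULL || !dev->buffer_funcs_ring->ready)
        return DEMO_DOMAIN_CPU;
    return DEMO_DOMAIN_GTT;
}
```

Whether such a check is the right fix, or only hides an ordering problem at
startup as Christian suggests, is a separate question that the sketch does not
answer.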