On Tue, Jun 30, 2020 at 10:14:25AM +0100, Chris Wilson wrote: [ ... ] > > > @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m, > > > struct task_struct *task; > > > char name[80]; > > > > > > - spin_lock(&file->table_lock); > > > + rcu_read_lock(); > > > idr_for_each(&file->object_idr, per_file_stats, &stats); > > > - spin_unlock(&file->table_lock); > > > + rcu_read_unlock(); > > > > > For my education - is it indeed possible and valid to replace spin_lock() > > with rcu_read_lock() to prevent list manipulation for a list used by > > idr_for_each(), even if that list is otherwise manipulated under the > > spinlock ? > > It's a pure read of a radixtree here, and is supposed to be RCU safe: > > * idr_for_each() can be called concurrently with idr_alloc() and > * idr_remove() if protected by RCU. Newly added entries may not be > * seen and deleted entries may be seen, but adding and removing entries > * will not cause other entries to be skipped, nor spurious ones to be seen. > > That is the tree structure is stable. > Ah, that makes sense. Thanks for the clarification. > > Background: we are seeing a crash with the following call trace. > > > > [ 1016.651593] BUG: kernel NULL pointer dereference, address: 0000000000000000 > > ... > > [ 1016.651693] Call Trace: > > [ 1016.651703] idr_for_each+0x8a/0xe8 > > [ 1016.651711] i915_gem_object_info+0x2a3/0x3eb > > [ 1016.651720] seq_read+0x162/0x3ca > > [ 1016.651727] full_proxy_read+0x5b/0x8d > > [ 1016.651733] __vfs_read+0x45/0x1bb > > [ 1016.651741] vfs_read+0xc9/0x15e > > [ 1016.651746] ksys_read+0x7e/0xde > > [ 1016.651752] do_syscall_64+0x54/0x68 > > [ 1016.651758] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > Actually, the crash is not in idr_for_each, but in per_file_stats: [ 1016.651637] RIP: 0010:per_file_stats+0xe/0x16e [ 1016.651646] Code: d2 41 0f b6 8e 69 8c 00 00 48 89 df 48 c7 c6 7b 74 8c be 31 c0 e8 0c 89 cf ff eb d2 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 <8b> 06 85 c0 0f 84 4d 01 00 00 49 89 d6 48 89 f3 3d ff ff ff 7f 73 [ 1016.651651] RSP: 0018:ffffad3a01337ba0 EFLAGS: 00010293 [ 1016.651656] RAX: 0000000000000018 RBX: ffff96fe040d65e0 RCX: 0000000000000002 [ 1016.651660] RDX: ffffad3a01337c50 RSI: 0000000000000000 RDI: 00000000000001e8 [ 1016.651663] RBP: ffffad3a01337bb8 R08: 0000000000000000 R09: 00000000000001c0 [ 1016.651667] R10: 0000000000000000 R11: ffffffffbdbe5fce R12: 0000000000000000 [ 1016.651671] R13: ffffffffbdbe5fce R14: ffffad3a01337c50 R15: 0000000000000001 [ 1016.651676] FS: 00007a597e2d7480(0000) GS:ffff96ff3bb00000(0000) knlGS:0000000000000000 [ 1016.651680] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1016.651683] CR2: 0000000000000000 CR3: 0000000171fc2001 CR4: 00000000003606e0 [ 1016.651687] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1016.651690] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1016.651693] Call Trace: Sorry for the confusion. From the context it appears that the second parameter of per_file_stats() may be NULL, though I am not entirely sure if I got that correctly. > Is there a reason you are using this slow debugfs in the first place? AFAICS ChromeOS is using the information to calculate graphics memory use. Guenter _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx