> Feb 26 06:01:45 ct523c-0b0b kernel: task:ip state:D stack:0 pid:28125 tgid:28125 ppid:3604 flags:0x00004002
> Feb 26 06:01:45 ct523c-0b0b kernel: Call Trace:
> Feb 26 06:01:45 ct523c-0b0b kernel: <TASK>
> Feb 26 06:01:45 ct523c-0b0b kernel: __schedule+0x42c/0xde0
> Feb 26 06:01:45 ct523c-0b0b kernel: schedule+0x3c/0x120
> Feb 26 06:01:45 ct523c-0b0b kernel: schedule_timeout+0x19c/0x1b0
> Feb 26 06:01:45 ct523c-0b0b kernel: ? mark_held_locks+0x49/0x70
> Feb 26 06:01:45 ct523c-0b0b kernel: __wait_for_common+0xba/0x1d0
> Feb 26 06:01:45 ct523c-0b0b kernel: ? usleep_range_state+0xb0/0xb0
> Feb 26 06:01:45 ct523c-0b0b kernel: remove_one+0x6b/0x100

Can you say where this remove_one+0x6b is?

I feel it's probably this:

        if (!refcount_dec_and_test(&fsd->active_users)) {
                wait_for_completion(&fsd->active_users_drained);

which ... looking at it, seems wrong?

_Completely_ untested:

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 034a617cb1a5..fb636478c54d 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -751,13 +751,19 @@ static void __debugfs_file_removed(struct dentry *dentry)
 	if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)
 		return;
 
-	/* if we hit zero, just wait for all to finish */
-	if (!refcount_dec_and_test(&fsd->active_users)) {
-		wait_for_completion(&fsd->active_users_drained);
-		return;
-	}
+	/*
+	 * Now that debugfs_file_get() no longer sees a valid entry,
+	 * decrement the refcount to remove the initial reference.
+	 */
+	refcount_dec(&fsd->active_users);
 
-	/* if we didn't hit zero, try to cancel any we can */
+	/*
+	 * As long as it's not zero, try to cancel any cancellations,
+	 * new incoming ones will wake up the completion as we might
+	 * have raced: debugfs_file_get() had already been done, but
+	 * debugfs_enter_cancellation() hadn't, by the time we got
+	 * to this point here.
+	 */
 	while (refcount_read(&fsd->active_users)) {
 		struct debugfs_cancellation *c;
 

johannes