On 2/27/24 05:47, Johannes Berg wrote:
Feb 26 06:01:45 ct523c-0b0b kernel: task:ip state:D stack:0 pid:28125 tgid:28125 ppid:3604 flags:0x00004002
Feb 26 06:01:45 ct523c-0b0b kernel: Call Trace:
Feb 26 06:01:45 ct523c-0b0b kernel: <TASK>
Feb 26 06:01:45 ct523c-0b0b kernel: __schedule+0x42c/0xde0
Feb 26 06:01:45 ct523c-0b0b kernel: schedule+0x3c/0x120
Feb 26 06:01:45 ct523c-0b0b kernel: schedule_timeout+0x19c/0x1b0
Feb 26 06:01:45 ct523c-0b0b kernel: ? mark_held_locks+0x49/0x70
Feb 26 06:01:45 ct523c-0b0b kernel: __wait_for_common+0xba/0x1d0
Feb 26 06:01:45 ct523c-0b0b kernel: ? usleep_range_state+0xb0/0xb0
Feb 26 06:01:45 ct523c-0b0b kernel: remove_one+0x6b/0x100
Can you say where this remove_one+0x6b is?
I feel it's probably this:
	if (!refcount_dec_and_test(&fsd->active_users)) {
		wait_for_completion(&fsd->active_users_drained);
which ... looking at it, seems wrong?
(gdb) l *(remove_one+0x6b)
0xffffffff815c257b is in remove_one (/home/greearb/git/linux-6.7.dev.y/fs/debugfs/inode.c:757).
752			return;
753
754		/* if we hit zero, just wait for all to finish */
755		if (!refcount_dec_and_test(&fsd->active_users)) {
756			wait_for_completion(&fsd->active_users_drained);
757			return;
758		}
759
760		/* if we didn't hit zero, try to cancel any we can */
761		while (refcount_read(&fsd->active_users)) {
(gdb)
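
To make the control flow concrete, here is a minimal userspace sketch (not kernel code) of which branch the removal path takes, depending on whether anyone besides the initial reference still holds active_users. The names model_refcount, model_dec_and_test, old_remove_path and patched_remove_path are made up for illustration. The second function models the untested change proposed further down in this mail, and the sketch ignores all locking and concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's refcount_t. */
struct model_refcount {
	int count;
};

/* Like refcount_dec_and_test(): true only if the count reaches zero. */
bool model_dec_and_test(struct model_refcount *r)
{
	assert(r->count > 0);
	return --r->count == 0;
}

enum remove_path {
	REMOVE_DONE,		/* no users left, nothing more to do */
	REMOVE_WAIT_ONLY,	/* blocks in the wait, cancellation never tried */
	REMOVE_CANCEL_LOOP,	/* enters the cancellation loop */
};

/*
 * Flow of the code listed at inode.c:755-761.
 * active_users includes the initial reference, so 1 means "no users".
 */
enum remove_path old_remove_path(int active_users)
{
	struct model_refcount r = { .count = active_users };

	if (!model_dec_and_test(&r))
		return REMOVE_WAIT_ONLY;	/* users remain: wait, then return */

	/* count is zero here, so the while loop body can never run */
	return r.count ? REMOVE_CANCEL_LOOP : REMOVE_DONE;
}

/* Flow with the proposed change: drop the initial ref, then loop. */
enum remove_path patched_remove_path(int active_users)
{
	struct model_refcount r = { .count = active_users };

	assert(r.count > 0);
	r.count--;			/* models refcount_dec() */

	return r.count ? REMOVE_CANCEL_LOOP : REMOVE_DONE;
}
```

With only the initial reference (count 1) both variants finish immediately. With one extra holder (count 2) the old flow goes straight to the wait and never reaches the cancellation loop, which is consistent with the task sitting in wait_for_completion() in the trace above.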
_Completely_ untested:
We can test it.
Thanks,
Ben
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 034a617cb1a5..fb636478c54d 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -751,13 +751,19 @@ static void __debugfs_file_removed(struct dentry *dentry)
 	if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)
 		return;
 
-	/* if we hit zero, just wait for all to finish */
-	if (!refcount_dec_and_test(&fsd->active_users)) {
-		wait_for_completion(&fsd->active_users_drained);
-		return;
-	}
+	/*
+	 * Now that debugfs_file_get() no longer sees a valid entry,
+	 * decrement the refcount to remove the initial reference.
+	 */
+	refcount_dec(&fsd->active_users);
 
-	/* if we didn't hit zero, try to cancel any we can */
+	/*
+	 * As long as it's not zero, try to cancel any cancellations,
+	 * new incoming ones will wake up the completion as we might
+	 * have raced: debugfs_file_get() had already been done, but
+	 * debugfs_enter_cancellation() hadn't, by the time we got
+	 * to this point here.
+	 */
 	while (refcount_read(&fsd->active_users)) {
 		struct debugfs_cancellation *c;
johannes
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com