
Re: Kernel deadlock in 6.7.5 + hacks, maybe debugfs related.


 



On 2/27/24 05:47, Johannes Berg wrote:

Feb 26 06:01:45 ct523c-0b0b kernel: task:ip              state:D stack:0     pid:28125 tgid:28125 ppid:3604   flags:0x00004002
Feb 26 06:01:45 ct523c-0b0b kernel: Call Trace:
Feb 26 06:01:45 ct523c-0b0b kernel:  <TASK>
Feb 26 06:01:45 ct523c-0b0b kernel:  __schedule+0x42c/0xde0
Feb 26 06:01:45 ct523c-0b0b kernel:  schedule+0x3c/0x120
Feb 26 06:01:45 ct523c-0b0b kernel:  schedule_timeout+0x19c/0x1b0
Feb 26 06:01:45 ct523c-0b0b kernel:  ? mark_held_locks+0x49/0x70
Feb 26 06:01:45 ct523c-0b0b kernel:  __wait_for_common+0xba/0x1d0
Feb 26 06:01:45 ct523c-0b0b kernel:  ? usleep_range_state+0xb0/0xb0
Feb 26 06:01:45 ct523c-0b0b kernel:  remove_one+0x6b/0x100

Can you say where this remove_one+0x6b is?

I feel it's probably this:

        if (!refcount_dec_and_test(&fsd->active_users)) {
                wait_for_completion(&fsd->active_users_drained);

which ... looking at it, seems wrong?


(gdb) l *(remove_one+0x6b)
0xffffffff815c257b is in remove_one (/home/greearb/git/linux-6.7.dev.y/fs/debugfs/inode.c:757).
752			return;
753	
754		/* if we hit zero, just wait for all to finish */
755		if (!refcount_dec_and_test(&fsd->active_users)) {
756			wait_for_completion(&fsd->active_users_drained);
757			return;
758		}
759	
760		/* if we didn't hit zero, try to cancel any we can */
761		while (refcount_read(&fsd->active_users)) {
(gdb)
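
For anyone following along, here is a minimal user-space sketch of why that branch looks suspicious. It is only a model, not kernel code: a plain int stands in for the refcount, model_dec_and_test() is a hypothetical stand-in for refcount_dec_and_test(), and the enum names are made up for illustration. It assumes the count starts at one for the initial reference plus one per active user, and it does not model what happens inside the cancellation loop, since that part of the code is not shown in this mail.

```c
#include <assert.h>
#include <stdbool.h>

/* Which branch a removal takes in this model. */
enum remove_path {
	PATH_DONE,		/* no users left, nothing to wait for */
	PATH_WAIT_ONLY,		/* blocks on the completion, cancels nothing */
	PATH_CANCEL_LOOP,	/* enters the loop that triggers cancellations */
};

/* Hypothetical stand-in for refcount_dec_and_test(): true if we hit zero. */
static bool model_dec_and_test(int *count)
{
	return --(*count) == 0;
}

/* Flow of the listing above, lines 755-761. */
static enum remove_path old_flow(int active_users)
{
	if (!model_dec_and_test(&active_users))
		return PATH_WAIT_ONLY;	/* users remain, but we only wait */

	/* Models the while () condition: the count is 0 here. */
	if (active_users)
		return PATH_CANCEL_LOOP;
	return PATH_DONE;
}

/* Flow suggested by the untested patch: plain decrement, then loop. */
static enum remove_path patched_flow(int active_users)
{
	active_users--;		/* drop the initial reference */
	if (active_users)
		return PATH_CANCEL_LOOP;
	return PATH_DONE;
}

/* Checks both flows with one and with two references held. */
static void model_self_check(void)
{
	assert(old_flow(1) == PATH_DONE);
	assert(old_flow(2) == PATH_WAIT_ONLY);
	assert(patched_flow(1) == PATH_DONE);
	assert(patched_flow(2) == PATH_CANCEL_LOOP);
}
```

In this model old_flow() can never return PATH_CANCEL_LOOP: either the count hits zero and the loop condition is false, or it does not and the function waits without cancelling anything. That would be consistent with the task sitting in wait_for_completion() in the trace, assuming some user holds a reference that only a cancellation would release.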


_Completely_ untested:

We can test it.

Thanks,
Ben


diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 034a617cb1a5..fb636478c54d 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -751,13 +751,19 @@ static void __debugfs_file_removed(struct dentry *dentry)
  	if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)
  		return;
-	/* if we hit zero, just wait for all to finish */
-	if (!refcount_dec_and_test(&fsd->active_users)) {
-		wait_for_completion(&fsd->active_users_drained);
-		return;
-	}
+	/*
+	 * Now that debugfs_file_get() no longer sees a valid entry,
+	 * decrement the refcount to remove the initial reference.
+	 */
+	refcount_dec(&fsd->active_users);
-	/* if we didn't hit zero, try to cancel any we can */
+	/*
+	 * As long as it's not zero, try to cancel any cancellations,
+	 * new incoming ones will wake up the completion as we might
+	 * have raced: debugfs_file_get() had already been done, but
+	 * debugfs_enter_cancellation() hadn't, by the time we got
+	 * to this point here.
+	 */
  	while (refcount_read(&fsd->active_users)) {
  		struct debugfs_cancellation *c;


johannes


--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc  http://www.candelatech.com




