Re: [PATCH 1/2] fcntl: fix potential deadlocks for &fown_struct.lock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/7/21 7:44 pm, Jeff Layton wrote:
On Fri, 2021-07-02 at 17:18 +0800, Desmond Cheong Zhi Xi wrote:
Syzbot reports a potential deadlock in do_fcntl:

========================================================
WARNING: possible irq lock inversion dependency detected
5.12.0-syzkaller #0 Not tainted
--------------------------------------------------------
syz-executor132/8391 just changed the state of lock:
ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: f_getown_ex fs/fcntl.c:211 [inline]
ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: do_fcntl+0x8b4/0x1200 fs/fcntl.c:395
but this lock was taken by another, HARDIRQ-safe lock in the past:
  (&dev->event_lock){-...}-{2:2}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
Chain exists of:
   &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock

  Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&f->f_owner.lock);
                                local_irq_disable();
                                lock(&dev->event_lock);
                                lock(&new->fa_lock);
   <Interrupt>
     lock(&dev->event_lock);

  *** DEADLOCK ***

This happens because there is a lock hierarchy of
&dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
from the following call chain:

   input_inject_event():
     spin_lock_irqsave(&dev->event_lock,...);
     input_handle_event():
       input_pass_values():
         input_to_handler():
           evdev_events():
             evdev_pass_values():
               spin_lock(&client->buffer_lock);
               __pass_event():
                 kill_fasync():
                   kill_fasync_rcu():
                     read_lock(&fa->fa_lock);
                     send_sigio():
                       read_lock_irqsave(&fown->lock,...);

However, since &dev->event_lock is HARDIRQ-safe, interrupts have to be
disabled while grabbing &f->f_owner.lock, otherwise we invert the lock
hierarchy.

Hence, we replace calls to read_lock/read_unlock on &f->f_owner.lock,
with read_lock_irq/read_unlock_irq.


Patches look reasonable overall, but why does this one use read_lock_irq
and the other one use read_lock_irqsave? Don't we need to *_irqsasve in
both patches?



My thinking was that the functions f_getown_ex and f_getowner_uids are only called from do_fcntl, and f_getown is only called from do_fnctl and sock_ioctl. do_fnctl itself is only called from syscalls.

For sock_ioctl, the chain is
  compat_sock_ioctl():
    compat_sock_ioctl_trans():
      sock_ioctl()

For both paths, it doesn't seem that interrupts are disabled, so I used the *irq variants.

But of course, I might be very mistaken on this, and I'd be happy to make the change to *_irqsave.

Also, on further inspection, if these calls should be changed to *_irqsave, then I believe the call to write_lock_irq in f_modown (called from do_fcntl() --> f_setown() --> __f_setown() --> f_modown()) should also be changed to *_irqsave.

There's also a call to write_lock_irq(&fa->fa_lock) in fasync_remove_entry and fasync_insert_entry. Whether these should be changed as well isn't as clear to me, but since it's safe to do, perhaps it makes sense to use *_irqsave for them too. Thoughts?

Reported-and-tested-by: syzbot+e6d5398a02c516ce5e70@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@xxxxxxxxx>
---
  fs/fcntl.c | 13 +++++++------
  1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index dfc72f15be7f..cf9e81dfa615 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -150,7 +150,8 @@ void f_delown(struct file *filp)
  pid_t f_getown(struct file *filp)
  {
  	pid_t pid = 0;
-	read_lock(&filp->f_owner.lock);
+
+	read_lock_irq(&filp->f_owner.lock);
  	rcu_read_lock();
  	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type)) {
  		pid = pid_vnr(filp->f_owner.pid);
@@ -158,7 +159,7 @@ pid_t f_getown(struct file *filp)
  			pid = -pid;
  	}
  	rcu_read_unlock();
-	read_unlock(&filp->f_owner.lock);
+	read_unlock_irq(&filp->f_owner.lock);
  	return pid;
  }
@@ -208,7 +209,7 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
  	struct f_owner_ex owner = {};
  	int ret = 0;
- read_lock(&filp->f_owner.lock);
+	read_lock_irq(&filp->f_owner.lock);
  	rcu_read_lock();
  	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type))
  		owner.pid = pid_vnr(filp->f_owner.pid);
@@ -231,7 +232,7 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
  		ret = -EINVAL;
  		break;
  	}
-	read_unlock(&filp->f_owner.lock);
+	read_unlock_irq(&filp->f_owner.lock);
if (!ret) {
  		ret = copy_to_user(owner_p, &owner, sizeof(owner));
@@ -249,10 +250,10 @@ static int f_getowner_uids(struct file *filp, unsigned long arg)
  	uid_t src[2];
  	int err;
- read_lock(&filp->f_owner.lock);
+	read_lock_irq(&filp->f_owner.lock);
  	src[0] = from_kuid(user_ns, filp->f_owner.uid);
  	src[1] = from_kuid(user_ns, filp->f_owner.euid);
-	read_unlock(&filp->f_owner.lock);
+	read_unlock_irq(&filp->f_owner.lock);
err = put_user(src[0], &dst[0]);
  	err |= put_user(src[1], &dst[1]);





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux