+ epoll-do-not-take-the-nested-ep-mtx-on-epoll_ctl_del.patch added to -mm tree

Subject: + epoll-do-not-take-the-nested-ep-mtx-on-epoll_ctl_del.patch added to -mm tree
To: jbaron@xxxxxxxxxx,davidel@xxxxxxxxxxxxxxx,nelhage@xxxxxxxxxxx,normalperson@xxxxxxxx,nzimmer@xxxxxxx,paulmck@xxxxxxxxxx,viro@xxxxxxxxxxxxxxxxxx
From: akpm@xxxxxxxxxxxxxxxxxxxx
Date: Thu, 26 Dec 2013 14:19:21 -0800


The patch titled
     Subject: epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL
has been added to the -mm tree.  Its filename is
     epoll-do-not-take-the-nested-ep-mtx-on-epoll_ctl_del.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/epoll-do-not-take-the-nested-ep-mtx-on-epoll_ctl_del.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/epoll-do-not-take-the-nested-ep-mtx-on-epoll_ctl_del.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Jason Baron <jbaron@xxxxxxxxxx>
Subject: epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL

The EPOLL_CTL_DEL path of epoll contains a classic AB-BA deadlock.  That
is, epoll_ctl(a, EPOLL_CTL_DEL, b, x) will deadlock with epoll_ctl(b,
EPOLL_CTL_DEL, a, x).  The deadlock was introduced with commit
67347fe4e6326 ("epoll: do not take global 'epmutex' for simple
topologies").
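
For readers who want to see the two calls side by side, here is a minimal
user-space sketch of the racing pair. The helper names (try_del, del_loop,
race_nested_del) and the iteration count are made up for illustration and
do not come from the patch. It assumes, based on where the removed hunk
sits in the diff, that the nested lock was taken before the RB tree lookup,
so neither descriptor has to be registered in the other and every call is
expected to fail with ENOENT. On a kernel that still carries the nested
lock, the two threads may hang instead of returning.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: one EPOLL_CTL_DEL, returns 0 or the errno value. */
static int try_del(int epfd, int target)
{
	struct epoll_event ev = { 0 };	/* ignored for DEL, non-NULL for old kernels */

	if (epoll_ctl(epfd, EPOLL_CTL_DEL, target, &ev) == 0)
		return 0;
	return errno;
}

struct del_args {
	int epfd;
	int target;
	int iterations;
	int unexpected;	/* results other than ENOENT */
};

static void *del_loop(void *p)
{
	struct del_args *args = p;
	int i;

	for (i = 0; i < args->iterations; i++)
		if (try_del(args->epfd, args->target) != ENOENT)
			args->unexpected++;
	return NULL;
}

/*
 * Race epoll_ctl(a, EPOLL_CTL_DEL, b) against epoll_ctl(b, EPOLL_CTL_DEL, a).
 * Returns the number of unexpected results, or -1 on setup failure.
 */
static int race_nested_del(int iterations)
{
	int a = epoll_create1(0);
	int b = epoll_create1(0);
	struct del_args ab = { a, b, iterations, 0 };
	struct del_args ba = { b, a, iterations, 0 };
	pthread_t t1, t2;
	int ret = -1;

	if (a < 0 || b < 0)
		goto out;
	if (pthread_create(&t1, NULL, del_loop, &ab) != 0)
		goto out;
	if (pthread_create(&t2, NULL, del_loop, &ba) != 0) {
		pthread_join(t1, NULL);
		goto out;
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	ret = ab.unexpected + ba.unexpected;
out:
	if (a >= 0)
		close(a);
	if (b >= 0)
		close(b);
	return ret;
}
```

Calling race_nested_del() with a few thousand iterations should return 0 on
a kernel with this patch applied. A hang is the symptom described above.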

The acquisition of the ep->mtx for the destination 'ep' was added so that
a concurrent EPOLL_CTL_ADD operation would see the correct state of the
ep.  (Specifically, the check for '!list_empty(&f.file->f_ep_links)'.)
By simply not acquiring the lock, we no longer serialize behind the
ep->mtx taken in the add path, and thus may perform a full path check
that would not have been necessary had we waited a little longer.  However,
this is a transient state, and performing the full loop checking in this
case is not harmful.

The important point is that we wouldn't miss doing the full loop checking
when required, since EPOLL_CTL_ADD always locks any 'ep's that it is
operating upon.  The reason we don't need to do lock ordering in the add
path is that we are already holding the global 'epmutex' whenever we
do the double lock.  Further, the original posting of this patch, which
was tested for the intended performance gains, did not perform this
additional locking.
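
The locking argument can be modelled in user space. The sketch below is
not kernel code: model_ep, model_epmutex and the model_* functions are
hypothetical pthread stand-ins for ep->mtx, the global 'epmutex', the
nested add path and the patched delete path. It only illustrates why
taking two per-instance locks in either order is safe as long as every
double-locker already holds one global lock, while single-lock paths never
wait for a second lock.

```c
#include <pthread.h>

/* Hypothetical stand-in for an epoll instance and its ep->mtx. */
struct model_ep {
	pthread_mutex_t mtx;
	long updates;
};

/* Hypothetical stand-in for the global 'epmutex'. */
static pthread_mutex_t model_epmutex = PTHREAD_MUTEX_INITIALIZER;

/* Models the nested add: both instance locks, taken under the global lock. */
static void model_add_nested(struct model_ep *ep, struct model_ep *tep)
{
	pthread_mutex_lock(&model_epmutex);
	pthread_mutex_lock(&ep->mtx);
	pthread_mutex_lock(&tep->mtx);
	ep->updates++;
	tep->updates++;
	pthread_mutex_unlock(&tep->mtx);
	pthread_mutex_unlock(&ep->mtx);
	pthread_mutex_unlock(&model_epmutex);
}

/* Models the patched delete: only the instance being modified is locked. */
static void model_del(struct model_ep *ep)
{
	pthread_mutex_lock(&ep->mtx);
	ep->updates++;
	pthread_mutex_unlock(&ep->mtx);
}

struct model_args {
	struct model_ep *self;
	struct model_ep *other;
	int iterations;
};

static void *model_worker(void *p)
{
	struct model_args *args = p;
	int i;

	for (i = 0; i < args->iterations; i++) {
		model_add_nested(args->self, args->other);
		model_del(args->self);
	}
	return NULL;
}

/* Two workers lock in opposite orders; returns total updates or -1. */
static long model_run(int iterations)
{
	struct model_ep a, b;
	struct model_args ab = { &a, &b, iterations };
	struct model_args ba = { &b, &a, iterations };
	pthread_t t1, t2;

	a.updates = 0;
	b.updates = 0;
	pthread_mutex_init(&a.mtx, NULL);
	pthread_mutex_init(&b.mtx, NULL);
	if (pthread_create(&t1, NULL, model_worker, &ab) != 0)
		return -1;
	if (pthread_create(&t2, NULL, model_worker, &ba) != 0) {
		pthread_join(t1, NULL);
		return -1;
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	pthread_mutex_destroy(&a.mtx);
	pthread_mutex_destroy(&b.mtx);
	return a.updates + b.updates;
}
```

Each worker iteration performs three updates, so model_run(n) is expected
to finish and return 6 * n. Dropping model_epmutex from model_add_nested()
would reintroduce the opposite-order double lock and could hang the model.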

Signed-off-by: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Nathan Zimmer <nzimmer@xxxxxxx>
Cc: Eric Wong <normalperson@xxxxxxxx>
Cc: Nelson Elhage <nelhage@xxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Davide Libenzi <davidel@xxxxxxxxxxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/eventpoll.c |    4 ----
 1 file changed, 4 deletions(-)

diff -puN fs/eventpoll.c~epoll-do-not-take-the-nested-ep-mtx-on-epoll_ctl_del fs/eventpoll.c
--- a/fs/eventpoll.c~epoll-do-not-take-the-nested-ep-mtx-on-epoll_ctl_del
+++ a/fs/eventpoll.c
@@ -1907,10 +1907,6 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, in
 			}
 		}
 	}
-	if (op == EPOLL_CTL_DEL && is_file_epoll(tf.file)) {
-		tep = tf.file->private_data;
-		mutex_lock_nested(&tep->mtx, 1);
-	}
 
 	/*
 	 * Try to lookup the file inside our RB tree, Since we grabbed "mtx"
_

Patches currently in -mm which might be from jbaron@xxxxxxxxxx are

epoll-do-not-take-the-nested-ep-mtx-on-epoll_ctl_del.patch
lib-parserc-add-match_wildcard-function.patch
dynamic_debug-add-wildcard-support-to-filter-files-functions-modules.patch
dynamic-debug-howtotxt-update-since-new-wildcard-support.patch
linux-next.patch




