Hi, On Mon, Oct 15, 2018 at 3:39 PM Matthias Kaehlcke <mka@xxxxxxxxxxxx> wrote: > > With commit e16337622016 ("Bluetooth: Handle bt_accept_enqueue() socket > atomically") lock_sock[_nested]() is used to acquire the socket lock > before manipulating the socket. lock_sock[_nested]() may block, which > is problematic since bt_accept_enqueue() can be called in bottom half > context (e.g. from rfcomm_connect_ind()). > > The socket API provides bh_lock_sock[_nested]() to acquire the socket > lock in bottom half context. Check the context in bt_accept_enqueue() > and use the appropriate locking mechanism for the context. I wonder if it would help to put the stack crawl in the commit message too? I think this is what the BUG was reporting (though the stack seems a bet shorter than I'd expect due to compiler inlining) __might_sleep+0x4c/0x80 lock_sock_nested+0x24/0x58 bt_accept_enqueue+0x48/0xd4 [bluetooth] rfcomm_connect_ind+0x190/0x218 [rfcomm] rfcomm_run+0xe3c/0x163c [rfcomm] > Fixes: e16337622016 ("Bluetooth: Handle bt_accept_enqueue() socket atomically") > Signed-off-by: Matthias Kaehlcke <mka@xxxxxxxxxxxx> > --- > Not sure if this is the correct solution, it's certainly not elegant and > checkpatch.pl complains that in_atomic() shouldn't be used outside of > core kernel code. I'm open to other suggestions :) I'm a total noob when it comes to Bluetooth, so I guess I'll try to understand the callchains here. As far as I can tell there are only 3 calls to bt_accept_enqueue, right? I guess we can look at each of them: -- 1. net/bluetooth/l2cap_sock.c: Parent is locked with lock_sock(). -- 2. net/bluetooth/rfcomm/sock.c: Parent is locked with bh_lock_sock(). This is the case you were seeing. -- 3. net/bluetooth/sco.c sco_conn_ready() => __sco_chan_add => bt_accept_enqueue() ...parent is locked with bh_lock_sock(). sco_connect() => sco_chan_add() => __sco_chan_add ...parent is NULL so we actually never call bt_accept_enqueue() in this callchain. So the net result is that the parent is locked with bh_lock_sock(). -- >From looking at the above I guess it's pretty simple--if our parent was locked with bh_lock_sock() then the child should be locked with bh_lock_sock_nested(). If the parent was locked with lock_sock() then the child should be locked with lock_sock_nested(). I wonder if a less controversial solution here (and one that wouldn't upset checkpatch) is to just add a parameter to bt_accept_enqueue() like "bh". Then you'd pass true for "bh" in "net/bluetooth/rfcomm/sock.c" and "net/bluetooth/sco.c" and false from "net/bluetooth/l2cap_sock.c" -- In any case the problem seems serious enough that I'd propose a Revert of commit e16337622016 ("Bluetooth: Handle bt_accept_enqueue() socket atomically") unless we can come up with an acceptable solution. -Doug