On Tue, Sep 20, 2016 at 11:26 PM, Iago Abal <iari@xxxxxx> wrote: > Hi, > > I may have found a double mutex_lock in net/ceph/auth.c and, if confirmed, I > could help with a patch. The trace leading to the error is as follows: > > 1. Starting in function ceph_build_auth, the first lock is at 261: > mutex_lock(& ac->mutex); > (see > https://github.com/torvalds/linux/blob/master/net/ceph/auth.c#L261) > 2. Assume that condition (!ac->protocol) evaluates to true at 262. > 3. Call function ceph_auth_build_hello at 263, passing the same `ac'. > 4. The second lock is at 108: mutex_lock(& ac->mutex); > (see > https://github.com/torvalds/linux/blob/master/net/ceph/auth.c#L108) > > This seems to have been introduced by commit e9966076cdd9 ("libceph: wrap > auth methods in a mutex"). > > Function ceph_auth_build_hello is extern, so it makes sense that it takes > the ac->mutex lock itself. On the other hand, ceph_build_auth_request is > static, so it's fine if the caller is responsible for acquiring the > ac->mutex. > > Assuming that this is a bug I can think of two simple solutions: > > Solution 1: Add an extra argument to ceph_auth_build_hello(..., int lock) so > that the caller can decide whether to acquire the lock or not. > > Solution 2: Rewrite lines 261-266 so that ceph_auth_build_hello is called > without the lock held. For instance, > > if (!ac->protocol) { > + mutex_unlock(ac->mutex); > ret = ceph_auth_build_hello(ac, msg_buf, msg_len); > } > > PS: I have reported this bug back in June, but I didn't receive any > feedback. I believe this can be an actual bug, so I'm insisting. :-) Hi Iago, Sorry about that. I think it doesn't come up in practice because ceph_build_auth() is only called when validating the existing auth, which means ac->protocol != 0. I'll take a look. Thanks, Ilya -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html