Patch "libceph: must hold mutex for reset_changed_osds()" has been added to the 3.4-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    libceph: must hold mutex for reset_changed_osds()

to the 3.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     libceph-must-hold-mutex-for-reset_changed_osds.patch
and it can be found in the queue-3.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From 14d2f38df67fadee34625fcbd282ee22514c4846 Mon Sep 17 00:00:00 2001
From: Alex Elder <elder@xxxxxxxxxxx>
Date: Wed, 15 May 2013 16:28:33 -0500
Subject: libceph: must hold mutex for reset_changed_osds()

From: Alex Elder <elder@xxxxxxxxxxx>

commit 14d2f38df67fadee34625fcbd282ee22514c4846 upstream.

An osd client has a red-black tree describing its osds, and
occasionally we would get crashes due to one of these trees tree
becoming corrupt somehow.

The problem turned out to be that reset_changed_osds() was being
called without protection of the osd client request mutex.  That
function would call __reset_osd() for any osd that had changed, and
__reset_osd() would call __remove_osd() for any osd with no
outstanding requests, and finally __remove_osd() would remove the
corresponding entry from the red-black tree.  Thus, the tree was
getting modified without having any lock protection, and was
vulnerable to problems due to concurrent updates.

This appears to be the only osd tree updating path that has this
problem.  It can be fairly easily fixed by moving the call up
a few lines, to just before the request mutex gets dropped
in kick_requests().

This resolves:
    http://tracker.ceph.com/issues/5043

Signed-off-by: Alex Elder <elder@xxxxxxxxxxx>
Reviewed-by: Sage Weil <sage@xxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 net/ceph/osd_client.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1337,13 +1337,13 @@ static void kick_requests(struct ceph_os
 		__register_request(osdc, req);
 		__unregister_linger_request(osdc, req);
 	}
+	reset_changed_osds(osdc);
 	mutex_unlock(&osdc->request_mutex);
 
 	if (needmap) {
 		dout("%d requests for down osds, need new map\n", needmap);
 		ceph_monc_request_next_osdmap(&osdc->client->monc);
 	}
-	reset_changed_osds(osdc);
 }
 
 


Patches currently in stable-queue which might be from elder@xxxxxxxxxxx are

queue-3.4/libceph-must-hold-mutex-for-reset_changed_osds.patch
queue-3.4/ceph-add-cpu_to_le32-calls-when-encoding-a-reconnect-capability.patch
queue-3.4/ceph-ceph_pagelist_append-might-sleep-while-atomic.patch
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]