avoid crashes from faulty crushmap

I messed up a crush map the other day, mixing components of different
types in a single rule.  The crushmap compiler didn't complain, but mons
and osds would crash when applying those rules.  I had to use this patch
to recover the cluster.  Only the second hunk was relevant, but I
figured a BUG_ON that stops you from fixing the problem is best avoided
;-)

--- Begin Message ---
It's very hard to recover from an invalid crushmap if mons fail
assertions while processing the map, and osds crash while advancing
past an already-fixed map.  Skip such broken rules instead of
aborting.

Signed-off-by: Alexandre Oliva <oliva@xxxxxxxxxxxxxxxxx>
---
 src/crush/mapper.c |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/crush/mapper.c b/src/crush/mapper.c
index 1e475b40..6ce4c97 100644
--- a/src/crush/mapper.c
+++ b/src/crush/mapper.c
@@ -354,7 +354,11 @@ static int crush_choose(const struct crush_map *map,
 					item = bucket_perm_choose(in, x, r);
 				else
 					item = crush_bucket_choose(in, x, r);
-				BUG_ON(item >= map->max_devices);
+				if (item >= map->max_devices) {
+					dprintk("  bad item %d\n", item);
+					skip_rep = 1;
+					break;
+				}
 
 				/* desired type? */
 				if (item < 0)
@@ -365,8 +369,12 @@ static int crush_choose(const struct crush_map *map,
 
 				/* keep going? */
 				if (itemtype != type) {
-					BUG_ON(item >= 0 ||
-					       (-1-item) >= map->max_buckets);
+					if (item >= 0 ||
+					    (-1-item) >= map->max_buckets) {
+						dprintk("  bad item type %d\n", type);
+						skip_rep = 1;
+						break;
+					}
 					in = map->buckets[-1-item];
 					retry_bucket = 1;
 					continue;
-- 
1.7.7.6


--- End Message ---

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer
