Hello,

I am running a tiny cluster of 2 nodes.

ceph -v
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)

One OSD died and I added a new OSD (not replacing the old one).
After that I wanted to remove the failed OSD completely from the cluster.
Here is what I did:

ceph osd reweight osd.4 0.0
ceph osd crush reweight osd.4 0.0
ceph osd out osd.4
ceph osd crush remove osd.4
ceph auth del osd.4
ceph osd rm osd.4

But after the rebalancing I ended up with 155 PGs in stale+active+clean state.

@storage1:/tmp# ceph -s
    cluster 7a9120b9-df42-4308-b7b1-e1f3d0f1e7b3
     health HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are blocked > 32 sec; nodeep-scrub flag(s) set
     monmap e1: 1 mons at {storage1=192.168.10.3:6789/0}, election epoch 1, quorum 0 storage1
     osdmap e1064: 6 osds: 6 up, 6 in
            flags nodeep-scrub
      pgmap v26760322: 712 pgs, 8 pools, 532 GB data, 155 kobjects
            1209 GB used, 14210 GB / 15419 GB avail
                 155 stale+active+clean
                 557 active+clean
  client io 91925 B/s wr, 5 op/s

I know about the single-monitor problem. I just want to get the cluster back to a healthy state first; then I will add the third storage node and go up to 3 monitors.
The problem is as follows:

@storage1:/tmp# ceph pg map 2.3a
osdmap e1064 pg 2.3a (2.3a) -> up [6] acting [6]

@storage1:/tmp# ceph pg 2.3a query
Error ENOENT: i don't have pgid 2.3a

@storage1:/tmp# ceph health detail
HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are blocked > 32 sec; 1 osds have slow requests; nodeep-scrub flag(s) set
pg 7.2a is stuck stale for 8887559.656879, current state stale+active+clean, last acting [4]
pg 5.28 is stuck stale for 8887559.656886, current state stale+active+clean, last acting [4]
pg 7.2b is stuck stale for 8887559.656889, current state stale+active+clean, last acting [4]
pg 7.2c is stuck stale for 8887559.656892, current state stale+active+clean, last acting [4]
pg 0.2b is stuck stale for 8887559.656893, current state stale+active+clean, last acting [4]
pg 6.2c is stuck stale for 8887559.656894, current state stale+active+clean, last acting [4]
pg 6.2f is stuck stale for 8887559.656893, current state stale+active+clean, last acting [4]
pg 2.2b is stuck stale for 8887559.656896, current state stale+active+clean, last acting [4]
pg 2.25 is stuck stale for 8887559.656896, current state stale+active+clean, last acting [4]
pg 6.20 is stuck stale for 8887559.656898, current state stale+active+clean, last acting [4]
pg 5.21 is stuck stale for 8887559.656898, current state stale+active+clean, last acting [4]
pg 0.24 is stuck stale for 8887559.656904, current state stale+active+clean, last acting [4]
pg 2.21 is stuck stale for 8887559.656904, current state stale+active+clean, last acting [4]
pg 5.27 is stuck stale for 8887559.656906, current state stale+active+clean, last acting [4]
pg 2.23 is stuck stale for 8887559.656908, current state stale+active+clean, last acting [4]
pg 6.26 is stuck stale for 8887559.656909, current state stale+active+clean, last acting [4]
pg 7.27 is stuck stale for 8887559.656913, current state stale+active+clean, last acting [4]
pg 7.18 is stuck stale for 8887559.656914, current state stale+active+clean, last acting [4]
pg 0.1e is stuck stale for 8887559.656914, current state stale+active+clean, last acting [4]
pg 6.18 is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
pg 2.1f is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
pg 7.1b is stuck stale for 8887559.656922, current state stale+active+clean, last acting [4]
pg 0.1b is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
pg 6.1d is stuck stale for 8887559.656925, current state stale+active+clean, last acting [4]
pg 2.18 is stuck stale for 8887559.656920, current state stale+active+clean, last acting [4]
pg 7.1d is stuck stale for 8887559.656926, current state stale+active+clean, last acting [4]
pg 5.1c is stuck stale for 8887559.656921, current state stale+active+clean, last acting [4]
pg 5.1d is stuck stale for 8887559.656920, current state stale+active+clean, last acting [4]
pg 6.11 is stuck stale for 8887559.656922, current state stale+active+clean, last acting [4]
pg 5.13 is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
pg 0.16 is stuck stale for 8887559.656924, current state stale+active+clean, last acting [4]
pg 6.10 is stuck stale for 8887559.656928, current state stale+active+clean, last acting [4]
pg 2.17 is stuck stale for 8887559.656927, current state stale+active+clean, last acting [4]
pg 7.12 is stuck stale for 8887559.656932, current state stale+active+clean, last acting [4]
pg 0.12 is stuck stale for 8887559.656929, current state stale+active+clean, last acting [4]
pg 6.14 is stuck stale for 8887559.656935, current state stale+active+clean, last acting [4]
pg 0.11 is stuck stale for 8887559.656932, current state stale+active+clean, last acting [4]
pg 7.16 is stuck stale for 8887559.656936, current state stale+active+clean, last acting [4]
pg 0.10 is stuck stale for 8887559.656936, current state stale+active+clean, last acting [4]
pg 2.d is stuck stale for 8887559.656933, current state stale+active+clean, last acting [4]
pg 6.9 is stuck stale for 8887559.656939, current state stale+active+clean, last acting [4]
pg 7.9 is stuck stale for 8887559.656939, current state stale+active+clean, last acting [4]
pg 0.d is stuck stale for 8887559.656940, current state stale+active+clean, last acting [4]
pg 7.a is stuck stale for 8887559.656944, current state stale+active+clean, last acting [4]
pg 0.c is stuck stale for 8887559.656941, current state stale+active+clean, last acting [4]
pg 2.e is stuck stale for 8887559.656947, current state stale+active+clean, last acting [4]
pg 6.a is stuck stale for 8887559.656953, current state stale+active+clean, last acting [4]
pg 0.b is stuck stale for 8887559.656949, current state stale+active+clean, last acting [4]
pg 2.9 is stuck stale for 8887559.656954, current state stale+active+clean, last acting [4]
pg 5.f is stuck stale for 8887559.656953, current state stale+active+clean, last acting [4]
pg 7.d is stuck stale for 8887559.656958, current state stale+active+clean, last acting [4]
pg 6.f is stuck stale for 8887559.656957, current state stale+active+clean, last acting [4]
pg 3.4 is stuck stale for 8887559.656957, current state stale+active+clean, last acting [4]
pg 5.3 is stuck stale for 8887559.656956, current state stale+active+clean, last acting [4]
pg 2.4 is stuck stale for 8887559.656961, current state stale+active+clean, last acting [4]
pg 6.0 is stuck stale for 8887559.656966, current state stale+active+clean, last acting [4]
pg 3.6 is stuck stale for 8887559.656965, current state stale+active+clean, last acting [4]
pg 3.7 is stuck stale for 8887559.656964, current state stale+active+clean, last acting [4]
pg 2.6 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
pg 0.3 is stuck stale for 8887559.656965, current state stale+active+clean, last acting [4]
pg 5.6 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
pg 7.4 is stuck stale for 8887559.656975, current state stale+active+clean, last acting [4]
pg 3.1 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
pg 6.4 is stuck stale for 8887559.656975, current state stale+active+clean, last acting [4]
pg 5.4 is stuck stale for 8887559.656972, current state stale+active+clean, last acting [4]
pg 2.3 is stuck stale for 8887559.656977, current state stale+active+clean, last acting [4]
pg 5.5 is stuck stale for 8887559.656977, current state stale+active+clean, last acting [4]
pg 3.3 is stuck stale for 8887559.656982, current state stale+active+clean, last acting [4]
pg 5.7a is stuck stale for 8887559.657309, current state stale+active+clean, last acting [4]
pg 6.78 is stuck stale for 8887559.657308, current state stale+active+clean, last acting [4]
pg 5.78 is stuck stale for 8887559.657311, current state stale+active+clean, last acting [4]
pg 5.79 is stuck stale for 8887559.657311, current state stale+active+clean, last acting [4]
pg 6.7c is stuck stale for 8887559.657313, current state stale+active+clean, last acting [4]
pg 7.7e is stuck stale for 8887559.657312, current state stale+active+clean, last acting [4]
pg 6.7e is stuck stale for 8887559.657315, current state stale+active+clean, last acting [4]
pg 7.70 is stuck stale for 8887559.657316, current state stale+active+clean, last acting [4]
pg 6.73 is stuck stale for 8887559.657316, current state stale+active+clean, last acting [4]
pg 5.77 is stuck stale for 8887559.657317, current state stale+active+clean, last acting [4]
pg 5.74 is stuck stale for 8887559.657319, current state stale+active+clean, last acting [4]
pg 5.75 is stuck stale for 8887559.657321, current state stale+active+clean, last acting [4]
pg 7.68 is stuck stale for 8887559.657322, current state stale+active+clean, last acting [4]
pg 6.68 is stuck stale for 8887559.657324, current state stale+active+clean, last acting [4]
pg 7.6b is stuck stale for 8887559.657326, current state stale+active+clean, last acting [4]
pg 6.6d is stuck stale for 8887559.657328, current state stale+active+clean, last acting [4]
pg 5.6e is stuck stale for 8887559.657330, current state stale+active+clean, last acting [4]
pg 6.6c is stuck stale for 8887559.657330, current state stale+active+clean, last acting [4]
pg 7.6f is stuck stale for 8887559.657331, current state stale+active+clean, last acting [4]
pg 7.60 is stuck stale for 8887559.657333, current state stale+active+clean, last acting [4]
pg 6.60 is stuck stale for 8887559.657333, current state stale+active+clean, last acting [4]
pg 7.62 is stuck stale for 8887559.657334, current state stale+active+clean, last acting [4]
pg 6.65 is stuck stale for 8887559.657334, current state stale+active+clean, last acting [4]
pg 7.64 is stuck stale for 8887559.657339, current state stale+active+clean, last acting [4]
pg 5.67 is stuck stale for 8887559.657338, current state stale+active+clean, last acting [4]
pg 7.66 is stuck stale for 8887559.657340, current state stale+active+clean, last acting [4]
pg 6.66 is stuck stale for 8887559.657340, current state stale+active+clean, last acting [4]
pg 7.67 is stuck stale for 8887559.657345, current state stale+active+clean, last acting [4]
pg 6.59 is stuck stale for 8887559.657344, current state stale+active+clean, last acting [4]
pg 7.58 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
pg 6.58 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
pg 7.59 is stuck stale for 8887559.657352, current state stale+active+clean, last acting [4]
pg 6.5b is stuck stale for 8887559.657353, current state stale+active+clean, last acting [4]
pg 5.59 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
pg 6.5a is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
pg 5.5e is stuck stale for 8887559.657352, current state stale+active+clean, last acting [4]
pg 6.5d is stuck stale for 8887559.657358, current state stale+active+clean, last acting [4]
pg 6.5f is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
pg 7.51 is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
pg 7.52 is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
pg 7.53 is stuck stale for 8887559.657358, current state stale+active+clean, last acting [4]
pg 6.55 is stuck stale for 8887559.657359, current state stale+active+clean, last acting [4]
pg 7.54 is stuck stale for 8887559.657364, current state stale+active+clean, last acting [4]
pg 6.54 is stuck stale for 8887559.657364, current state stale+active+clean, last acting [4]
pg 6.57 is stuck stale for 8887559.657365, current state stale+active+clean, last acting [4]
pg 7.56 is stuck stale for 8887559.657369, current state stale+active+clean, last acting [4]
pg 5.55 is stuck stale for 8887559.657371, current state stale+active+clean, last acting [4]
pg 7.48 is stuck stale for 8887559.657372, current state stale+active+clean, last acting [4]
pg 6.49 is stuck stale for 8887559.657375, current state stale+active+clean, last acting [4]
pg 5.4a is stuck stale for 8887559.657376, current state stale+active+clean, last acting [4]
pg 6.48 is stuck stale for 8887559.657379, current state stale+active+clean, last acting [4]
pg 7.4a is stuck stale for 8887559.657380, current state stale+active+clean, last acting [4]
pg 6.4a is stuck stale for 8887559.657383, current state stale+active+clean, last acting [4]
pg 6.4d is stuck stale for 8887559.657385, current state stale+active+clean, last acting [4]
pg 7.4d is stuck stale for 8887559.657387, current state stale+active+clean, last acting [4]
pg 6.4c is stuck stale for 8887559.657389, current state stale+active+clean, last acting [4]
pg 6.4e is stuck stale for 8887559.657391, current state stale+active+clean, last acting [4]
pg 5.42 is stuck stale for 8887559.657391, current state stale+active+clean, last acting [4]
pg 6.43 is stuck stale for 8887559.657393, current state stale+active+clean, last acting [4]
pg 5.41 is stuck stale for 8887559.657393, current state stale+active+clean, last acting [4]
pg 5.47 is stuck stale for 8887559.657394, current state stale+active+clean, last acting [4]
pg 7.46 is stuck stale for 8887559.657396, current state stale+active+clean, last acting [4]
pg 6.39 is stuck stale for 8887559.657398, current state stale+active+clean, last acting [4]
pg 5.3a is stuck stale for 8887559.657399, current state stale+active+clean, last acting [4]
pg 2.3e is stuck stale for 8887559.657399, current state stale+active+clean, last acting [4]
pg 0.3c is stuck stale for 8887559.657402, current state stale+active+clean, last acting [4]
pg 7.3c is stuck stale for 8887559.657404, current state stale+active+clean, last acting [4]
pg 7.3d is stuck stale for 8887559.657405, current state stale+active+clean, last acting [4]
pg 0.39 is stuck stale for 8887559.657402, current state stale+active+clean, last acting [4]
pg 5.3c is stuck stale for 8887559.657405, current state stale+active+clean, last acting [4]
pg 2.3a is stuck stale for 8887559.657406, current state stale+active+clean, last acting [4]
pg 0.38 is stuck stale for 8887559.657409, current state stale+active+clean, last acting [4]
pg 2.35 is stuck stale for 8887559.657411, current state stale+active+clean, last acting [4]
pg 0.37 is stuck stale for 8887559.657412, current state stale+active+clean, last acting [4]
pg 5.32 is stuck stale for 8887559.657413, current state stale+active+clean, last acting [4]
pg 2.34 is stuck stale for 8887559.657416, current state stale+active+clean, last acting [4]
pg 0.36 is stuck stale for 8887559.657416, current state stale+active+clean, last acting [4]
pg 7.32 is stuck stale for 8887559.657419, current state stale+active+clean, last acting [4]
pg 6.33 is stuck stale for 8887559.657420, current state stale+active+clean, last acting [4]
pg 0.35 is stuck stale for 8887559.657423, current state stale+active+clean, last acting [4]
pg 6.35 is stuck stale for 8887559.657423, current state stale+active+clean, last acting [4]
pg 5.36 is stuck stale for 8887559.657424, current state stale+active+clean, last acting [4]
pg 2.30 is stuck stale for 8887559.657427, current state stale+active+clean, last acting [4]
pg 5.37 is stuck stale for 8887559.657429, current state stale+active+clean, last acting [4]
pg 7.36 is stuck stale for 8887559.657430, current state stale+active+clean, last acting [4]
pg 6.37 is stuck stale for 8887559.657432, current state stale+active+clean, last acting [4]
pg 6.28 is stuck stale for 8887559.657427, current state stale+active+clean, last acting [4]

It stays that way, and I think the reason is what I found when I downloaded and decompiled the crush map:

@storage1:/tmp# crushtool -d /tmp/crushmap
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 device4
device 5 osd.5
device 6 osd.6

Is there a way to remove this device 4 (aka osd.4) from the crush map, so that ceph can make another copy from the other location shown in "ceph pg map 2.3a"?

Regards.

Dimitar Boichev
SysAdmin Team Lead
AXSMarine Sofia
Phone: +359 889 22 55 42
Skype: dimitar.boichev.axsmarine
E-mail: dimitar.boichev@xxxxxxxxxxxxx
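
For anyone who wants to confirm that every stuck stale PG in the health detail output above really has only osd.4 as its last acting set, the list can be summarized with a few lines of shell. This is only a minimal sketch: the function name summarize_stale_pgs is made up for illustration (it is not a ceph command), and it assumes the line format that "ceph health detail" prints on 0.80.7, one "pg ... is stuck stale ... last acting [N]" entry per line.

```shell
#!/bin/sh
# Hypothetical helper, not part of ceph: reads `ceph health detail` text on
# stdin and prints each distinct last-acting set followed by the number of
# stuck stale PGs that report it.
summarize_stale_pgs() {
    # Only per-PG lines start with "pg"; the last field is the acting set,
    # e.g. "[4]". The HEALTH_WARN summary line is skipped by the $1 check.
    awk '$1 == "pg" && /is stuck stale/ { count[$NF]++ }
         END { for (s in count) print s, count[s] }' | sort
}
```

It would be used as "ceph health detail | summarize_stale_pgs". If all stale PGs were last served by osd.4 alone, the result is a single line beginning with [4]; any additional line would point at PGs that had a copy on some other OSD.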
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com