I still have the pgs stuck peering. I ran ceph pg n.nn query on a few of the stuck pgs. The ones that are just peering have a few entries in recovery_state -> past_intervals (example at the end of this message), and the ones that say remapped+peering have a long list there. I don't know how to interpret the pg query output, but I have a feeling that writes have gone to different nodes and that this has messed up a few objects. I am seeing a lot of network traffic between the nodes, a few hundred Mbps, which would fit with the osds trying to work out their state (9 disks with a random IO pattern would fit with the level of bandwidth I'm seeing).
This is the full output of ceph health detail:

sudo ceph health detail
HEALTH_WARN 82 pgs peering; 82 pgs stuck inactive; 82 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests; pool images pg_num 256 > pgp_num 128
pg 3.21 is stuck inactive for 115937.161742, current state peering, last acting [7,5]
pg 3.80 is stuck inactive for 115913.708453, current state peering, last acting [8,6]
pg 3.23 is stuck inactive for 156640.618069, current state peering, last acting [8,3]
pg 3.82 is stuck inactive for 115931.967078, current state peering, last acting [1,5]
pg 3.e1 is stuck inactive for 116121.694227, current state peering, last acting [0,6]
pg 3.1c is stuck inactive for 115916.431120, current state peering, last acting [8,3]
pg 3.7e is stuck inactive for 115918.390949, current state peering, last acting [0,3]
pg 3.18 is stuck inactive for 115908.250832, current state peering, last acting [8,6]
pg 3.79 is stuck inactive for 115914.617676, current state peering, last acting [8,3]
pg 3.d8 is stuck inactive for 116341.813279, current state peering, last acting [2,6]
pg 3.1b is stuck inactive for 115905.061074, current state peering, last acting [7,4]
pg 3.d9 is stuck inactive for 156650.199216, current state peering, last acting [8,3]
pg 3.db is stuck inactive for 115915.924073, current state peering, last acting [1,5]
pg 3.d4 is stuck inactive for 115918.396086, current state peering, last acting [0,3]
pg 3.17 is stuck inactive for 115915.304764, current state peering, last acting [0,3]
pg 3.70 is stuck inactive for 115915.000395, current state peering, last acting [7,6]
pg 3.12 is stuck inactive for 115916.466955, current state peering, last acting [8,3]
pg 3.13 is stuck inactive for 244912.512309, current state remapped+peering, last acting [6,0]
pg 3.d2 is stuck inactive for 115913.708294, current state peering, last acting [8,3]
pg 3.6d is stuck inactive for 115909.860193, current state peering, last acting [8,4]
pg 3.6e is stuck inactive for 115914.617561, current state peering, last acting [8,3]
pg 3.9 is stuck inactive for 244908.745661, current state remapped+peering, last acting [4,2]
pg 3.68 is stuck inactive for 115916.701060, current state peering, last acting [7,3]
pg 3.6a is stuck inactive for 115914.617589, current state peering, last acting [8,3]
pg 3.4 is stuck inactive for 115913.708054, current state peering, last acting [8,3]
pg 3.ca is stuck inactive for 115915.923728, current state peering, last acting [0,6]
pg 3.64 is stuck inactive for 115905.061782, current state peering, last acting [7,4]
pg 3.6 is stuck inactive for 115913.708077, current state peering, last acting [8,3]
pg 3.0 is stuck inactive for 116106.189550, current state peering, last acting [8,6]
pg 3.c6 is stuck inactive for 115905.061588, current state peering, last acting [7,4]
pg 3.2 is stuck inactive for 116351.261968, current state peering, last acting [1,5]
pg 3.61 is stuck inactive for 115913.854102, current state peering, last acting [0,6]
pg 3.c0 is stuck inactive for 115916.700785, current state peering, last acting [7,3]
pg 3.c2 is stuck inactive for 115913.708368, current state peering, last acting [8,6]
pg 3.bd is stuck inactive for 115909.142185, current state peering, last acting [0,4]
pg 3.58 is stuck inactive for 116290.453805, current state peering, last acting [2,6]
pg 3.59 is stuck inactive for 156592.727428, current state peering, last acting [8,3]
pg 3.5b is stuck inactive for 115915.927480, current state peering, last acting [1,5]
pg 3.54 is stuck inactive for 115918.391135, current state peering, last acting [0,3]
pg 3.bb is stuck inactive for 115918.138327, current state peering, last acting [0,3]
pg 3.b5 is stuck inactive for 156609.811401, current state peering, last acting [7,3]
pg 3.52 is stuck inactive for 115914.617727, current state peering, last acting [8,3]
pg 3.b1 is stuck inactive for 115910.407513, current state peering, last acting [1,4]
pg 3.b3 is stuck inactive for 116204.050176, current state peering, last acting [0,6]
pg 3.af is stuck inactive for 115908.304844, current state peering, last acting [1,6]
pg 3.a8 is stuck inactive for 115909.753895, current state peering, last acting [8,5]
pg 3.4a is stuck inactive for 115913.854219, current state peering, last acting [0,6]
pg 3.a9 is stuck inactive for 115905.061347, current state peering, last acting [7,4]
pg 3.a4 is stuck inactive for 115909.753923, current state peering, last acting [8,4]
pg 3.46 is stuck inactive for 115905.061894, current state peering, last acting [7,4]
pg 3.40 is stuck inactive for 115916.701055, current state peering, last acting [7,3]
pg 3.a0 is stuck inactive for 156540.416593, current state peering, last acting [7,6]
pg 3.42 is stuck inactive for 116084.025651, current state peering, last acting [8,6]
pg 3.a1 is stuck inactive for 115905.061404, current state peering, last acting [7,5]
pg 3.a3 is stuck inactive for 156592.632676, current state peering, last acting [8,3]
pg 3.3d is stuck inactive for 115909.536349, current state peering, last acting [0,4]
pg 3.9c is stuck inactive for 115913.639973, current state peering, last acting [8,3]
pg 3.fe is stuck inactive for 115915.304682, current state peering, last acting [0,3]
pg 3.98 is stuck inactive for 115908.287692, current state peering, last acting [8,6]
pg 3.f9 is stuck inactive for 115913.708198, current state peering, last acting [8,3]
pg 3.3b is stuck inactive for 115915.304652, current state peering, last acting [0,3]
pg 3.9b is stuck inactive for 115905.061445, current state peering, last acting [7,4]
pg 3.35 is stuck inactive for 156760.780737, current state peering, last acting [7,3]
pg 3.97 is stuck inactive for 115913.854036, current state peering, last acting [0,3]
pg 3.31 is stuck inactive for 115910.565637, current state peering, last acting [1,4]
pg 3.f0 is stuck inactive for 115915.000192, current state peering, last acting [7,6]
pg 3.33 is stuck inactive for 115908.911398, current state peering, last acting [0,6]
pg 3.92 is stuck inactive for 115914.503597, current state peering, last acting [8,3]
pg 3.93 is stuck inactive for 244912.512404, current state remapped+peering, last acting [6,0]
pg 3.2f is stuck inactive for 115980.326105, current state peering, last acting [1,6]
pg 3.ed is stuck inactive for 115909.859689, current state peering, last acting [8,4]
pg 3.28 is stuck inactive for 115913.708757, current state peering, last acting [8,5]
pg 3.ee is stuck inactive for 115913.708285, current state peering, last acting [8,3]
pg 3.29 is stuck inactive for 115905.062092, current state peering, last acting [7,4]
pg 3.89 is stuck inactive for 244908.745759, current state remapped+peering, last acting [4,2]
pg 3.e8 is stuck inactive for 115916.700729, current state peering, last acting [7,3]
pg 3.24 is stuck inactive for 115909.860570, current state peering, last acting [8,4]
pg 3.ea is stuck inactive for 115913.708316, current state peering, last acting [8,3]
pg 3.84 is stuck inactive for 115913.708549, current state peering, last acting [8,3]
pg 3.e4 is stuck inactive for 115905.061352, current state peering, last acting [7,4]
pg 3.86 is stuck inactive for 115914.617720, current state peering, last acting [8,3]
pg 3.20 is stuck inactive for 156654.164647, current state peering, last acting [7,6]
pg 3.21 is stuck unclean for 115937.161932, current state peering, last acting [7,5]
pg 3.80 is stuck unclean for 115913.708641, current state peering, last acting [8,6]
pg 3.23 is stuck unclean for 156640.618257, current state peering, last acting [8,3]
pg 3.82 is stuck unclean for 115931.967266, current state peering, last acting [1,5]
pg 3.e1 is stuck unclean for 116121.694416, current state peering, last acting [0,6]
pg 3.1c is stuck unclean for 115916.431308, current state peering, last acting [8,3]
pg 3.7e is stuck unclean for 115918.391137, current state peering, last acting [0,3]
pg 3.18 is stuck unclean for 115908.251019, current state peering, last acting [8,6]
pg 3.79 is stuck unclean for 115914.617864, current state peering, last acting [8,3]
pg 3.d8 is stuck unclean for 116341.813466, current state peering, last acting [2,6]
pg 3.1b is stuck unclean for 115905.061262, current state peering, last acting [7,4]
pg 3.d9 is stuck unclean for 156650.199403, current state peering, last acting [8,3]
pg 3.db is stuck unclean for 115915.924260, current state peering, last acting [1,5]
pg 3.d4 is stuck unclean for 115918.396273, current state peering, last acting [0,3]
pg 3.17 is stuck unclean for 115915.304951, current state peering, last acting [0,3]
pg 3.70 is stuck unclean for 115915.000581, current state peering, last acting [7,6]
pg 3.12 is stuck unclean for 115916.467142, current state peering, last acting [8,3]
pg 3.13 is stuck unclean for 254650.057287, current state remapped+peering, last acting [6,0]
pg 3.d2 is stuck unclean for 115913.708481, current state peering, last acting [8,3]
pg 3.6d is stuck unclean for 115909.860380, current state peering, last acting [8,4]
pg 3.6e is stuck unclean for 115914.617747, current state peering, last acting [8,3]
pg 3.9 is stuck unclean for 255316.515662, current state remapped+peering, last acting [4,2]
pg 3.68 is stuck unclean for 115916.701246, current state peering, last acting [7,3]
pg 3.6a is stuck unclean for 115914.617775, current state peering, last acting [8,3]
pg 3.4 is stuck unclean for 115913.708241, current state peering, last acting [8,3]
pg 3.ca is stuck unclean for 115915.923915, current state peering, last acting [0,6]
pg 3.64 is stuck unclean for 115905.061969, current state peering, last acting [7,4]
pg 3.6 is stuck unclean for 115913.708264, current state peering, last acting [8,3]
pg 3.0 is stuck unclean for 116106.189737, current state peering, last acting [8,6]
pg 3.c6 is stuck unclean for 115905.061775, current state peering, last acting [7,4]
pg 3.2 is stuck unclean for 116351.262155, current state peering, last acting [1,5]
pg 3.61 is stuck unclean for 115913.854289, current state peering, last acting [0,6]
pg 3.c0 is stuck unclean for 115916.700973, current state peering, last acting [7,3]
pg 3.c2 is stuck unclean for 115913.708556, current state peering, last acting [8,6]
pg 3.bd is stuck unclean for 115909.142373, current state peering, last acting [0,4]
pg 3.58 is stuck unclean for 116290.453992, current state peering, last acting [2,6]
pg 3.59 is stuck unclean for 156592.727616, current state peering, last acting [8,3]
pg 3.5b is stuck unclean for 115915.927668, current state peering, last acting [1,5]
pg 3.54 is stuck unclean for 115918.391323, current state peering, last acting [0,3]
pg 3.bb is stuck unclean for 115918.138514, current state peering, last acting [0,3]
pg 3.b5 is stuck unclean for 156609.811589, current state peering, last acting [7,3]
pg 3.52 is stuck unclean for 115914.617914, current state peering, last acting [8,3]
pg 3.b1 is stuck unclean for 115910.407700, current state peering, last acting [1,4]
pg 3.b3 is stuck unclean for 116204.050364, current state peering, last acting [0,6]
pg 3.af is stuck unclean for 115908.305031, current state peering, last acting [1,6]
pg 3.a8 is stuck unclean for 115909.754082, current state peering, last acting [8,5]
pg 3.4a is stuck unclean for 115913.854406, current state peering, last acting [0,6]
pg 3.a9 is stuck unclean for 115905.061535, current state peering, last acting [7,4]
pg 3.a4 is stuck unclean for 115909.754111, current state peering, last acting [8,4]
pg 3.46 is stuck unclean for 115905.062087, current state peering, last acting [7,4]
pg 3.40 is stuck unclean for 115916.701248, current state peering, last acting [7,3]
pg 3.a0 is stuck unclean for 156540.416786, current state peering, last acting [7,6]
pg 3.42 is stuck unclean for 116084.025844, current state peering, last acting [8,6]
pg 3.a1 is stuck unclean for 115905.061597, current state peering, last acting [7,5]
pg 3.a3 is stuck unclean for 156592.632868, current state peering, last acting [8,3]
pg 3.3d is stuck unclean for 115909.536541, current state peering, last acting [0,4]
pg 3.9c is stuck unclean for 115913.640165, current state peering, last acting [8,3]
pg 3.fe is stuck unclean for 115915.304874, current state peering, last acting [0,3]
pg 3.98 is stuck unclean for 115908.287885, current state peering, last acting [8,6]
pg 3.f9 is stuck unclean for 115913.708390, current state peering, last acting [8,3]
pg 3.3b is stuck unclean for 115915.304844, current state peering, last acting [0,3]
pg 3.9b is stuck unclean for 115905.061638, current state peering, last acting [7,4]
pg 3.35 is stuck unclean for 156760.780929, current state peering, last acting [7,3]
pg 3.97 is stuck unclean for 115913.854229, current state peering, last acting [0,3]
pg 3.31 is stuck unclean for 115910.565829, current state peering, last acting [1,4]
pg 3.f0 is stuck unclean for 115915.000385, current state peering, last acting [7,6]
pg 3.33 is stuck unclean for 115908.911591, current state peering, last acting [0,6]
pg 3.92 is stuck unclean for 115914.503790, current state peering, last acting [8,3]
pg 3.93 is stuck unclean for 254650.057387, current state remapped+peering, last acting [6,0]
pg 3.2f is stuck unclean for 115980.326297, current state peering, last acting [1,6]
pg 3.ed is stuck unclean for 115909.859881, current state peering, last acting [8,4]
pg 3.28 is stuck unclean for 115913.708950, current state peering, last acting [8,5]
pg 3.ee is stuck unclean for 115913.708477, current state peering, last acting [8,3]
pg 3.29 is stuck unclean for 115905.062284, current state peering, last acting [7,4]
pg 3.89 is stuck unclean for 255316.515766, current state remapped+peering, last acting [4,2]
pg 3.e8 is stuck unclean for 115916.700921, current state peering, last acting [7,3]
pg 3.24 is stuck unclean for 115909.860762, current state peering, last acting [8,4]
pg 3.ea is stuck unclean for 115913.708507, current state peering, last acting [8,3]
pg 3.84 is stuck unclean for 115913.708741, current state peering, last acting [8,3]
pg 3.e4 is stuck unclean for 115905.061544, current state peering, last acting [7,4]
pg 3.86 is stuck unclean for 115914.617912, current state peering, last acting [8,3]
pg 3.20 is stuck unclean for 156654.164838, current state peering, last acting [7,6]
pg 3.ed is peering, acting [8,4]
pg 3.ee is peering, acting [8,3]
pg 3.e8 is peering, acting [7,3]
pg 3.ea is peering, acting [8,3]
pg 3.e4 is peering, acting [7,4]
pg 3.e1 is peering, acting [0,6]
pg 3.d8 is peering, acting [2,6]
pg 3.d9 is peering, acting [8,3]
pg 3.db is peering, acting [1,5]
pg 3.d4 is peering, acting [0,3]
pg 3.d2 is peering, acting [8,3]
pg 3.ca is peering, acting [0,6]
pg 3.c6 is peering, acting [7,4]
pg 3.c0 is peering, acting [7,3]
pg 3.c2 is peering, acting [8,6]
pg 3.bd is peering, acting [0,4]
pg 3.bb is peering, acting [0,3]
pg 3.b5 is peering, acting [7,3]
pg 3.b1 is peering, acting [1,4]
pg 3.b3 is peering, acting [0,6]
pg 3.af is peering, acting [1,6]
pg 3.a8 is peering, acting [8,5]
pg 3.a9 is peering, acting [7,4]
pg 3.a4 is peering, acting [8,4]
pg 3.a0 is peering, acting [7,6]
pg 3.a1 is peering, acting [7,5]
pg 3.a3 is peering, acting [8,3]
pg 3.9c is peering, acting [8,3]
pg 3.98 is peering, acting [8,6]
pg 3.9b is peering, acting [7,4]
pg 3.97 is peering, acting [0,3]
pg 3.92 is peering, acting [8,3]
pg 3.93 is remapped+peering, acting [6,0]
pg 3.89 is remapped+peering, acting [4,2]
pg 3.84 is peering, acting [8,3]
pg 3.86 is peering, acting [8,3]
pg 3.80 is peering, acting [8,6]
pg 3.82 is peering, acting [1,5]
pg 3.7e is peering, acting [0,3]
pg 3.79 is peering, acting [8,3]
pg 3.70 is peering, acting [7,6]
pg 3.6d is peering, acting [8,4]
pg 3.6e is peering, acting [8,3]
pg 3.68 is peering, acting [7,3]
pg 3.6a is peering, acting [8,3]
pg 3.64 is peering, acting [7,4]
pg 3.61 is peering, acting [0,6]
pg 3.58 is peering, acting [2,6]
pg 3.59 is peering, acting [8,3]
pg 3.5b is peering, acting [1,5]
pg 3.54 is peering, acting [0,3]
pg 3.52 is peering, acting [8,3]
pg 3.4a is peering, acting [0,6]
pg 3.46 is peering, acting [7,4]
pg 3.40 is peering, acting [7,3]
pg 3.42 is peering, acting [8,6]
pg 3.3d is peering, acting [0,4]
pg 3.3b is peering, acting [0,3]
pg 3.35 is peering, acting [7,3]
pg 3.31 is peering, acting [1,4]
pg 3.33 is peering, acting [0,6]
pg 3.2f is peering, acting [1,6]
pg 3.28 is peering, acting [8,5]
pg 3.29 is peering, acting [7,4]
pg 3.24 is peering, acting [8,4]
pg 3.20 is peering, acting [7,6]
pg 3.21 is peering, acting [7,5]
pg 3.23 is peering, acting [8,3]
pg 3.1c is peering, acting [8,3]
pg 3.18 is peering, acting [8,6]
pg 3.1b is peering, acting [7,4]
pg 3.17 is peering, acting [0,3]
pg 3.12 is peering, acting [8,3]
pg 3.13 is remapped+peering, acting [6,0]
pg 3.9 is remapped+peering, acting [4,2]
pg 3.4 is peering, acting [8,3]
pg 3.6 is peering, acting [8,3]
pg 3.0 is peering, acting [8,6]
pg 3.2 is peering, acting [1,5]
pg 3.fe is peering, acting [0,3]
pg 3.f9 is peering, acting [8,3]
pg 3.f0 is peering, acting [7,6]
1 ops are blocked > 134218 sec
1 ops are blocked > 134218 sec on osd.8
1 osds have slow requests
pool images pg_num 256 > pgp_num 128
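
With this many stuck pgs it can help to count how often each osd appears in the "last acting" sets, to see whether the problem clusters on particular osds. The following is a minimal Python sketch, not a Ceph tool: the function name summarize_stuck and the regular expression are illustrative assumptions based only on the line format shown in the output above, and other Ceph releases may format these lines differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   pg 3.21 is stuck inactive for 115937.161742, current state peering, last acting [7,5]
# The pattern is an assumption derived from the output pasted in this message.
STUCK_RE = re.compile(
    r"^pg (?P<pgid>\S+) is stuck (?P<kind>\w+) for (?P<secs>[\d.]+), "
    r"current state (?P<state>\S+), last acting \[(?P<acting>[\d,]+)\]$"
)


def summarize_stuck(text, kind="inactive"):
    """Count stuck pgs per acting osd and per state for one 'stuck' kind."""
    per_osd = Counter()
    per_state = Counter()
    for line in text.splitlines():
        match = STUCK_RE.match(line.strip())
        if not match or match.group("kind") != kind:
            continue  # skip summary lines and the other 'stuck' kinds
        per_state[match.group("state")] += 1
        for osd in match.group("acting").split(","):
            per_osd[int(osd)] += 1
    return per_osd, per_state


# A few lines copied from the output above, used as sample input.
sample = """\
pg 3.21 is stuck inactive for 115937.161742, current state peering, last acting [7,5]
pg 3.80 is stuck inactive for 115913.708453, current state peering, last acting [8,6]
pg 3.13 is stuck inactive for 244912.512309, current state remapped+peering, last acting [6,0]
pg 3.21 is stuck unclean for 115937.161932, current state peering, last acting [7,5]
"""

per_osd, per_state = summarize_stuck(sample)
for osd, count in per_osd.most_common():
    print(f"osd.{osd}: {count} stuck pgs")
print(dict(per_state))
```

To run it over the whole listing, the saved output of ceph health detail can be read from a file and passed in as text in place of sample. Filtering on one kind ("inactive" or "unclean") avoids counting each pg twice, since every pg appears in both lists.
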
Example output of ceph pg 3.a9 query (one of the pgs that is just peering):
{
"state": "peering",
"snap_trimq": "[]",
"epoch": 211256,
"up": [
7,
4
],
"acting": [
7,
4
],
"info": {
"pgid": "3.a9",
"last_update": "3359'110581",
"last_complete": "3359'110581",
"log_tail": "850'107578",
"last_user_version": 110581,
"last_backfill": "MAX",
"purged_snaps": "[]",
"history": {
"epoch_created": 31,
"last_epoch_started": 116841,
"last_epoch_clean": 116844,
"last_epoch_split": 0,
"same_up_since": 116838,
"same_interval_since": 126562,
"same_primary_since": 1202,
"last_scrub": "3359'110581",
"last_scrub_stamp": "2015-11-13 13:22:55.682647",
"last_deep_scrub": "987'109658",
"last_deep_scrub_stamp": "2015-11-09 13:56:36.850047",
"last_clean_scrub_stamp": "2015-11-13 13:22:55.682647"
},
"stats": {
"version": "3359'110581",
"reported_seq": "103843",
"reported_epoch": "211192",
"state": "peering",
"last_fresh": "2015-11-15 17:45:30.009129",
"last_change": "2015-11-14 11:25:20.451898",
"last_active": "2015-11-14 09:35:03.312840",
"last_peered": "2015-11-14 09:35:03.312840",
"last_clean": "2015-11-14 09:35:03.312840",
"last_became_active": "0.000000",
"last_became_peered": "0.000000",
"last_unstale": "2015-11-15 17:45:30.009129",
"last_undegraded": "2015-11-15 17:45:30.009129",
"last_fullsized": "2015-11-15 17:45:30.009129",
"mapping_epoch": 94611,
"log_start": "850'107578",
"ondisk_log_start": "850'107578",
"created": 31,
"last_epoch_clean": 116844,
"parent": "0.0",
"parent_split_bits": 0,
"last_scrub": "3359'110581",
"last_scrub_stamp": "2015-11-13 13:22:55.682647",
"last_deep_scrub": "987'109658",
"last_deep_scrub_stamp": "2015-11-09 13:56:36.850047",
"last_clean_scrub_stamp": "2015-11-13 13:22:55.682647",
"log_size": 3003,
"ondisk_log_size": 3003,
"stats_invalid": "1",
"stat_sum": {
"num_bytes": 18268690441,
"num_objects": 4402,
"num_object_clones": 0,
"num_object_copies": 8804,
"num_objects_missing_on_primary": 0,
"num_objects_degraded": 0,
"num_objects_misplaced": 0,
"num_objects_unfound": 0,
"num_objects_dirty": 4402,
"num_whiteouts": 0,
"num_read": 2268,
"num_read_kb": 31055,
"num_write": 8111,
"num_write_kb": 1762444,
"num_scrub_errors": 0,
"num_shallow_scrub_errors": 0,
"num_deep_scrub_errors": 0,
"num_objects_recovered": 13228,
"num_bytes_recovered": 54922698769,
"num_keys_recovered": 0,
"num_objects_omap": 0,
"num_objects_hit_set_archive": 0,
"num_bytes_hit_set_archive": 0
},
"up": [
7,
4
],
"acting": [
7,
4
],
"blocked_by": [
4
],
"up_primary": 7,
"acting_primary": 7
},
"empty": 0,
"dne": 0,
"incomplete": 0,
"last_epoch_started": 116841,
"hit_set_history": {
"current_last_update": "0'0",
"current_last_stamp": "0.000000",
"current_info": {
"begin": "0.000000",
"end": "0.000000",
"version": "0'0"
},
"history": []
}
},
"peer_info": [],
"recovery_state": [
{
"name": "Started\/Primary\/Peering\/GetInfo",
"enter_time": "2015-11-14 11:25:20.451888",
"requested_info_from": [
{
"osd": "4"
}
]
},
{
"name": "Started\/Primary\/Peering",
"enter_time": "2015-11-14 11:25:20.451882",
"past_intervals": [
{
"first": 116838,
"last": 120813,
"maybe_went_rw": 1,
"up": [
7,
4
],
"acting": [
7,
4
],
"primary": 7,
"up_primary": 7
},
{
"first": 120814,
"last": 120889,
"maybe_went_rw": 1,
"up": [
7,
4
],
"acting": [
7,
4
],
"primary": 7,
"up_primary": 7
},
{
"first": 120890,
"last": 126561,
"maybe_went_rw": 1,
"up": [
7,
4
],
"acting": [
7,
4
],
"primary": 7,
"up_primary": 7
}
],
"probing_osds": [
"4",
"7"
],
"down_osds_we_would_probe": [],
"peering_blocked_by": []
},
{
"name": "Started",
"enter_time": "2015-11-14 11:25:20.451851"
}
],
"agent_state": {}
}
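
The fields of the query output above that matter most for a stuck pg can be pulled out with a short script. This is a hedged Python sketch: summarize_pg_query is a hypothetical helper, and the field names (blocked_by, past_intervals, requested_info_from) are taken only from the JSON shown above, so they may differ in other Ceph releases.

```python
import json


def summarize_pg_query(raw):
    """Extract the peering-related fields from 'ceph pg <pgid> query' JSON."""
    query = json.loads(raw)
    info = query.get("info", {})
    stats = info.get("stats", {})
    summary = {
        "pgid": info.get("pgid"),
        "state": query.get("state"),
        "blocked_by": stats.get("blocked_by", []),
        "past_intervals": 0,
        "requested_info_from": [],
    }
    for entry in query.get("recovery_state", []):
        # Only the Peering entry carries past_intervals in the example above.
        summary["past_intervals"] += len(entry.get("past_intervals", []))
        for peer in entry.get("requested_info_from", []):
            summary["requested_info_from"].append(int(peer["osd"]))
    return summary


# Heavily trimmed copy of the pg 3.a9 output above, used as sample input.
sample = """
{
  "state": "peering",
  "info": {"pgid": "3.a9", "stats": {"blocked_by": [4]}},
  "recovery_state": [
    {"name": "Started/Primary/Peering/GetInfo",
     "requested_info_from": [{"osd": "4"}]},
    {"name": "Started/Primary/Peering",
     "past_intervals": [{"first": 116838}, {"first": 120814}, {"first": 120890}]},
    {"name": "Started"}
  ]
}
"""

summary = summarize_pg_query(sample)
print(summary)
```

For pg 3.a9, blocked_by and requested_info_from both name osd 4, which suggests the primary (osd 7) is waiting for a reply from osd 4. Running the same summary over the remapped+peering pgs would show how much longer their past_intervals lists are.
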
On 15 November 2015 at 01:26, Peter Theobald <pete@xxxxxxxxxxxxxxx> wrote:

Hi Gregory,

This is the output of ceph -s

cluster 5400bbc9-378d-4c69-afc4-da71393f7baf
health HEALTH_WARN
82 pgs peering
82 pgs stuck inactive
82 pgs stuck unclean
1 requests are blocked > 32 sec
pool images pg_num 256 > pgp_num 128
monmap e2: 2 mons at {0=192.168.2.1:6789/0,1=192.168.2.3:6789/0}
election epoch 16, quorum 0,1 0,1
osdmap e168004: 9 osds: 9 up, 9 in; 4 remapped pgs
pgmap v1317963: 256 pgs, 1 pools, 4377 GB data, 1105 kobjects
8792 GB used, 15369 GB / 24162 GB avail
174 active+clean
78 peering
4 remapped+peering

Total available space is about 24TB. Used space is 8TB at a replication level of 2.

Regards
Pete

On 14 November 2015 at 18:03, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:

What's the full output of "ceph -s"? Are your new crush rules actually satisfiable? Is your cluster filling up?
-Greg

On Saturday, November 14, 2015, Peter Theobald <pete@xxxxxxxxxxxxxxx> wrote:

Hi list,

I have a 3 node ceph cluster with a total of 9 osds (2, 3 and 4 per node, with different size drives). I changed the layout (failure domain from per osd to per host, and changed min_size) and I now have a few pgs that have been stuck in peering or remapped+peering for a couple of days.

The hosts are underpowered: 2x HP microservers and a single i5 desktop grade machine, so not super powerful. The network is fast though (bonded gigabit ethernet with a dedicated switch).

I'm concerned that the remapped+peering pgs are stuck. All the pgs in peering or remapped+peering are stuck inactive and unclean, so I'm concerned about data loss. Do I just need to wait for them to fix themselves? I cannot see any mention of unfound objects when I query the remapped pgs, so I think I'm ok and just need to be patient. I have 128 pgs across 9 osds, so I probably have a lot of objects per pg. Total data is about 4TB.

Regards
Pete
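
As a rough cross-check of the ceph -s figures quoted in this thread against the question of whether the cluster is filling up, the arithmetic can be written out as below. This is only a sketch: the numbers are copied from the ceph -s output, the variable names are illustrative, and treating "1105 kobjects" as 1105 x 1000 objects is an assumption.

```python
# Figures copied from the ceph -s output quoted in this thread.
data_gb = 4377          # logical data stored in the pool
used_gb = 8792          # raw space used across all osds
total_gb = 24162        # raw capacity across all osds
replicas = 2            # replication level stated in the message
pg_num = 256            # pgs in the pool
objects = 1105 * 1000   # "1105 kobjects", assuming k means 1000

# Raw usage expected from replication alone, and what is left unexplained.
expected_raw_gb = data_gb * replicas
overhead_gb = used_gb - expected_raw_gb

# How full the cluster is, and the average load per pg.
pct_used = 100.0 * used_gb / total_gb
objects_per_pg = objects / pg_num
data_per_pg_gb = data_gb / pg_num

print(f"expected raw usage: {expected_raw_gb} GB, reported: {used_gb} GB")
print(f"unexplained overhead: {overhead_gb} GB")
print(f"cluster used: {pct_used:.1f}%")
print(f"average per pg: {objects_per_pg:.0f} objects, {data_per_pg_gb:.1f} GB")
```

On these numbers the raw usage is almost exactly twice the stored data, and the cluster is a little over a third full. The per-pg averages are also in the same range as the num_objects value (4402) reported for pg 3.a9 in the query output.
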
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com