Hi,
thanks, but unfortunately it's not the thing I suspected :(
Anyways, there's something wrong with your snapshots; the log also contains a lot of entries like this:
2018-04-09 06:58:53.703353 7fb8931a0700 -1 osd.28 pg_epoch: 88438 pg[0.5d( v 88438'223279 (86421'221681,88438'223279] local-lis/les=87450/87451 n=5634 ec=115/115 lis/c 87450/87450 les/c/f 87451/87451/0 87352/87450/87450) [37,6,28] r=2 lpr=87450 luod=0'0 crt=88438'223279 lcod 88438'223278 active] _scan_snaps no head for 0:ba087b0f:::rbd_data.221bf2eb141f2.0000000000001436:46aa (have MIN)
The cluster on which I debugged the same crash also had a lot of snapshot problems, including this one.
In the end, only manually marking all snap_ids as deleted in the pool helped.
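For reference, "marking a snap id as deleted" boils down to removing it as a self-managed snapshot through librados, after which the id shows up in the pool's removed_snaps set. Below is only a rough sketch of what I mean; the pool name and the id used are placeholders (0x46aa is just the clone id from the log line above), and you have to work out the actually stale ids on your own cluster first. Only remove ids that no RBD image still references.

/* Rough sketch only: mark self-managed snap ids as deleted in a pool via
 * librados. The pool name ("rbd") and the snap id range are placeholders --
 * determine the real stale ids on your own cluster before running anything
 * like this.
 *
 * Build roughly like: cc mark_snaps_deleted.c -o mark_snaps_deleted -lrados
 */
#include <rados/librados.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t io;
    uint64_t snap_id;

    /* Connect using the default ceph.conf and keyring locations. */
    if (rados_create(&cluster, NULL) < 0 ||
        rados_conf_read_file(cluster, NULL) < 0 ||
        rados_connect(cluster) < 0) {
        fprintf(stderr, "could not connect to the cluster\n");
        return EXIT_FAILURE;
    }

    if (rados_ioctx_create(cluster, "rbd", &io) < 0) {  /* placeholder pool */
        fprintf(stderr, "could not open pool\n");
        rados_shutdown(cluster);
        return EXIT_FAILURE;
    }

    /* Placeholder range: a single id, 0x46aa, taken from the _scan_snaps
     * message above. rados_ioctx_selfmanaged_snap_remove() flags the id as
     * deleted, i.e. it ends up in the pool's removed_snaps interval set. */
    for (snap_id = 0x46aa; snap_id <= 0x46aa; snap_id++) {
        int ret = rados_ioctx_selfmanaged_snap_remove(io, snap_id);
        if (ret < 0)
            fprintf(stderr, "snap id 0x%llx: error %d\n",
                    (unsigned long long)snap_id, ret);
    }

    rados_ioctx_destroy(io);
    rados_shutdown(cluster);
    return EXIT_SUCCESS;
}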
2018-04-10 21:48 GMT+02:00 Jan Marquardt <jm@xxxxxxxxxxx>:
On 10.04.18 at 20:22, Paul Emmerich wrote:
> Hi,
>
> I encountered the same crash a few months ago, see
> https://tracker.ceph.com/issues/23030
>
> Can you post the output of
>
> ceph osd pool ls detail -f json-pretty
>
>
> Paul
Yes, of course.
# ceph osd pool ls detail -f json-pretty
[
    {
        "pool_name": "rbd",
        "flags": 1,
        "flags_names": "hashpspool",
        "type": 1,
        "size": 3,
        "min_size": 2,
        "crush_rule": 0,
        "object_hash": 2,
        "pg_num": 768,
        "pg_placement_num": 768,
        "crash_replay_interval": 0,
        "last_change": "91256",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "auid": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 35020,
        "snap_epoch": 91219,
        "pool_snaps": [],
        "removed_snaps": "[1~4562,47f1~58,484a~9,4854~70,48c5~36,48fc~48,4945~d,4953~1,4957~1,495a~3,4960~1,496e~3,497a~1,4980~2,4983~3,498b~1,4997~1,49a8~1,49ae~1,49b1~2,49b4~1,49b7~1,49b9~3,49bd~5,49c3~6,49ca~5,49d1~4,49d6~1,49d8~2,49df~2,49e2~1,49e4~2,49e7~5,49ef~2,49f2~2,49f5~6,49fc~1,49fe~3,4a05~9,4a0f~4,4a14~4,4a1a~6,4a21~6,4a29~2,4a2c~3,4a30~1,4a33~5,4a39~3,4a3e~b,4a4a~1,4a4c~2,4a50~1,4a52~7,4a5a~1,4a5c~2,4a5f~4,4a64~1,4a66~2,4a69~2,4a6c~4,4a72~1,4a74~2,4a78~3,4a7c~6,4a84~2,4a87~b,4a93~4,4a99~1,4a9c~4,4aa1~7,4aa9~1,4aab~6,4ab2~2,4ab5~5,4abb~2,4abe~9,4ac8~a,4ad3~4,4ad8~13,4aec~16,4b03~6,4b0a~c,4b17~2,4b1a~3,4b1f~4,4b24~c,4b31~d,4b3f~13,4b53~1,4bfc~13ed,61e1~4a,622c~8,6235~a0,62d6~ac,63a6~2,63b2~2,63d0~2,63f7~2,6427~2,6434~10f]",
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "tiers": [],
        "tier_of": -1,
        "read_tier": -1,
        "write_tier": -1,
        "cache_mode": "none",
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "cache_target_dirty_ratio_micro": 0,
        "cache_target_dirty_high_ratio_micro": 0,
        "cache_target_full_ratio_micro": 0,
        "cache_min_flush_age": 0,
        "cache_min_evict_age": 0,
        "erasure_code_profile": "",
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_count": 0,
        "use_gmt_hitset": true,
        "min_read_recency_for_promote": 0,
        "min_write_recency_for_promote": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_search_last_n": 0,
        "grade_table": [],
        "stripe_width": 0,
        "expected_num_objects": 0,
        "fast_read": false,
        "options": {},
        "application_metadata": {
            "rbd": {}
        }
    }
]
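(A note on reading the removed_snaps field above: it is an interval set of hexadecimal snap ids in start~length form, so 1~4562 covers ids 0x1 through 0x4562 that are already marked as deleted. A small sketch to expand the string, in case you want to compare it against the snapshots your images still reference:)

/* Rough sketch: expand a removed_snaps interval set (hex "start~length"
 * pairs, as printed above) into readable ranges. Pass the string as argv[1]. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s '[1~4562,47f1~58,...]'\n", argv[0]);
        return EXIT_FAILURE;
    }

    const char *p = argv[1];
    while (*p) {
        /* Skip brackets, separators and any stray whitespace. */
        if (*p == '[' || *p == ']' || *p == ',' || *p == ' ') { p++; continue; }

        char *end;
        unsigned long long start = strtoull(p, &end, 16);
        if (*end != '~')
            break;                      /* malformed input, stop here */
        unsigned long long len = strtoull(end + 1, &end, 16);

        printf("snap ids 0x%llx .. 0x%llx (%llu ids)\n",
               start, start + len - 1, len);
        p = end;
    }
    return EXIT_SUCCESS;
}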
"Unfortunately" I started the crashed OSDs again in the meantime,
because the first pgs have been down before. So currently all OSDs are
running.
Regards,
Jan