Hi list,

A few days ago we had some problems with our ceph cluster, and now we have some OSDs crashing on start with messages like this right before crashing:

2017-06-09 15:35:02.226430 7fb46d9e4700 -1 log_channel(cluster) log [ERR] : trim_object Snap 4aae0 not in clones

I can start those OSDs if I set 'osd recovery max active = 0', but then the PGs on those OSDs stay stuck in a degraded/unclean state.

Now, one of the problems we faced was missing clones, which we solved with ceph-objectstore-tool's remove-clone-metadata option, but it doesn't seem to work in this case (PG and object taken from the sample log posted below):

# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-199 --journal-path /var/lib/ceph/osd/ceph-199/journal --pgid 3.1cd rbd_data.6e30d2d518fee6d.000000000000008f remove-clone-metadata $((0x4aae0))
Clone 4aae0 not present

Any hints on how we can debug/fix this problem?

Thanks in advance.

Our cluster: Hammer (0.94.10), 5 mons, 576 OSDs
Our clients: OpenStack, some Hammer (0.94.10), some Giant (0.87.2)
(We're in the process of upgrading everything to Hammer and then to Jewel)

Our ceph.conf:

[global]
fsid = 5ecc1509-4344-4087-9a1c-ac68fb085a75
mon initial members = ceph17, ceph19, ceph20, cephmon01, cephmon02
mon host = 172.17.22.17,172.17.22.19,172.17.22.20,172.17.23.251,172.17.23.252
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
filestore xattr use omap = true
osd pool default size = 2
osd pool default min size = 1
public network = 172.17.0.0/16
cluster network = 172.18.0.0/16
osd pool default pg num = 2048
osd pool default pgp num = 2048
mon osd down out subtree limit = host
rbd default format = 2
osd op threads = 2
osd disk threads = 1
osd max backfills = 2
osd recovery threads = 1
osd recovery max active = 2
osd recovery op priority = 2

[mon]
mon compact on start = true

[osd]
osd crush update on start = false
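For completeness, this is the kind of check I think should show what the OSD actually has recorded for that head object (with the OSD stopped). I haven't verified the exact syntax on 0.94.10; the attribute name 'snapset', the ceph-dencoder invocation and the /tmp/snapset.bin scratch file are my own assumptions, so take it as a rough sketch rather than a verified procedure:

# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-199 --journal-path /var/lib/ceph/osd/ceph-199/journal --pgid 3.1cd rbd_data.6e30d2d518fee6d.000000000000008f get-attr snapset > /tmp/snapset.bin
# ceph-dencoder type SnapSet import /tmp/snapset.bin decode dump_json

If that decodes cleanly it should print the object's SnapSet (its clones list in particular), which would at least tell us whether clone 4aae0 is genuinely missing or the snapshot metadata is inconsistent in some other way.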
Here's a sample log, with [...] replacing very long text/lines, but I can provide full logs if needed:

2017-06-08 16:47:14.519092 7fa61bee4700 -1 log_channel(cluster) log [ERR] : trim_object Snap 4aae0 not in clones
2017-06-08 16:47:16.197600 7fa62ee75700 0 osd.199 pg_epoch: 1779479 pg[3.1cd( v 1779479'105504641 (1779407'105501638,1779479'105504641] local-les=1779478 n=1713 ec=33 les/c 1779478/1779479 1779477/1779477/1779477) [199,134] r=0 lpr=1779477 luod=1779479'105504639 crt=1779479'105504637 lcod 1779479'105504638 mlcod 1779479'105504638 active+clean snaptrimq=[4aae0~1,4aae2~1,4aae4~1,4aae6~1,[...],4bc5f~4,4bc64~2,4bc68~2]] removing repgather(0xf9d09c0 1779479'105504639 rep_tid=1753 committed?=1 applied?=1 lock=0 op=osd_op(client.215404673.0:6234090 rbd_data.6e30d2d518fee6d.000000000000008f [set-alloc-hint object_size 8388608 write_size 8388608,write 2101248~4096] 3.56a5e1cd snapc 4a86e=[] ack+ondisk+write+known_if_redirected e1779479) v4)
2017-06-08 16:47:16.197834 7fa62ee75700 0 osd.199 pg_epoch: 1779479 pg[3.1cd( v 1779479'105504641 (1779407'105501638,1779479'105504641] local-les=1779478 n=1713 ec=33 les/c 1779478/1779479 1779477/1779477/1779477) [199,134] r=0 lpr=1779477 luod=1779479'105504639 crt=1779479'105504637 lcod 1779479'105504638 mlcod 1779479'105504638 active+clean snaptrimq=[4aae0~1,4aae2~1,4aae4~1,[...],4bc5f~4,4bc64~2,4bc68~2]] q front is repgather(0x10908a00 0'0 rep_tid=1740 committed?=0 applied?=0 lock=0)
2017-06-08 16:47:16.267386 7fa62ee75700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)' thread 7fa62ee75700 time 2017-06-08 16:47:16.197953
osd/ReplicatedPG.cc: 7477: FAILED assert(repop_queue.front() == repop)

 ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbdf735]
 2: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x12e1) [0x853111]
 3: (ReplicatedPG::repop_all_committed(ReplicatedPG::RepGather*)+0xcc) [0x85341c]
 4: (Context::complete(int)+0x9) [0x6c6329]
 5: (ReplicatedBackend::op_commit(ReplicatedBackend::InProgressOp*)+0x1dc) [0xa11b4c]
 6: (Context::complete(int)+0x9) [0x6c6329]
 7: (ReplicatedPG::BlessedContext::finish(int)+0x94) [0x8ad2d4]
 8: (Context::complete(int)+0x9) [0x6c6329]
 9: (Finisher::finisher_thread_entry()+0x168) [0xb02168]
 10: (()+0x7dc5) [0x7fa63ce39dc5]
 11: (clone()+0x6d) [0x7fa63b91b76d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-10000> 2017-06-09 15:35:27.341080 7fb4862fb700 5 osd.199 1779827 tick
-9999> 2017-06-09 15:35:27.341102 7fb4862fb700 20 osd.199 1779827 scrub_time_permit should run between 0 - 24 now 15 = yes
-9998> 2017-06-09 15:35:27.341121 7fb4862fb700 20 osd.199 1779827 scrub_load_below_threshold loadavg 1.35 >= max 0.5 = no, load too high
-9997> 2017-06-09 15:35:27.341130 7fb4862fb700 20 osd.199 1779827 sched_scrub load_is_low=0
[...]
-5> 2017-06-09 15:35:30.894979 7fb480edd700 15 osd.199 pg_epoch: 1779827 pg[3.92f( v 1779827'125690903 (1779818'125687817,1779827'125690903] local-les=1779826 n=3186 ec=33 les/c 1779826/1779827 1779825/1779825/1779825) [199,564] r=0 lpr=1779825 lua=1779827'125690899 crt=1779827'125690898 lcod 1779827'125690902 mlcod 1779827'125690898 active+clean] publish_stats_to_osd 1779827:186528651
-4> 2017-06-09 15:35:30.894986 7fb480edd700 10 osd.199 pg_epoch: 1779827 pg[3.92f( v 1779827'125690903 (1779818'125687817,1779827'125690903] local-les=1779826 n=3186 ec=33 les/c 1779826/1779827 1779825/1779825/1779825) [199,564] r=0 lpr=1779825 lua=1779827'125690899 crt=1779827'125690898 lcod 1779827'125690902 mlcod 1779827'125690898 active+clean] sending commit on repgather(0xf05b600 1779827'125690903 rep_tid=1162 committed?=1 applied?=0 lock=0 op=osd_op(client.260852320.0:1973188 rbd_data.3d86d73dbe422e.00000000000000ef [set-alloc-hint object_size 8388608 write_size 8388608,write 3444736~4096] 3.ef04392f snapc 4abd1=[] ack+ondisk+write+known_if_redirected e1779827) v4) 0xf918c00
-3> 2017-06-09 15:35:30.895055 7fb480edd700 1 -- 172.17.22.31:6800/209017 --> 172.17.21.25:0/1016405 -- osd_op_reply(1973188 rbd_data.3d86d73dbe422e.00000000000000ef [set-alloc-hint object_size 8388608 write_size 8388608,write 3444736~4096] v1779827'125690903 uv125690903 ondisk = 0) v6 -- ?+0 0xf918c00 con 0xee5f080
-2> 2017-06-09 15:35:30.895066 7fb480edd700 5 -- op tracker -- seq: 3953, time: 2017-06-09 15:35:30.895066, event: commit_sent, op: osd_op(client.260852320.0:1973188 rbd_data.3d86d73dbe422e.00000000000000ef [set-alloc-hint object_size 8388608 write_size 8388608,write 3444736~4096] 3.ef04392f snapc 4abd1=[] ack+ondisk+write+known_if_redirected e1779827)
-1> 2017-06-09 15:35:30.906430 7fb46d9e4700 10 osd.199 pg_epoch: 1779827 pg[3.506( v 1779827'139421210 (1779816'139418118,1779827'139421210] local-les=1779826 n=3131 ec=33 les/c 1779826/1779827 1779825/1779825/1779825) [199,171] r=0 lpr=1779825 crt=1779827'139421199 lcod 1779827'139421209 mlcod 1779827'139421209 active+clean+scrubbing+deep] be_deep_scrub 3/6e748506/rbd_data.9f52fa7790c71b.0000000000000122/head seed 4294967295
0> 2017-06-09 15:35:30.919706 7fb4816de700 -1 *** Caught signal (Aborted) **
 in thread 7fb4816de700

 ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
 1: /usr/bin/ceph-osd() [0xadc182]
 2: (()+0xf370) [0x7fb48ec9b370]
 3: (gsignal()+0x37) [0x7fb48d6b31d7]
 4: (abort()+0x148) [0x7fb48d6b48c8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fb48dfb79d5]
 6: (()+0x5e946) [0x7fb48dfb5946]
 7: (()+0x5e973) [0x7fb48dfb5973]
 8: (()+0x5eb93) [0x7fb48dfb5b93]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27a) [0xbdf92a]
 10: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x12e1) [0x853111]
 11: (ReplicatedPG::repop_all_applied(ReplicatedPG::RepGather*)+0x195) [0x8532e5]
 12: (Context::complete(int)+0x9) [0x6c6329]
 13: (ReplicatedBackend::op_applied(ReplicatedBackend::InProgressOp*)+0x21c) [0xa118dc]
 14: (Context::complete(int)+0x9) [0x6c6329]
 15: (ReplicatedPG::BlessedContext::finish(int)+0x94) [0x8ad2d4]
 16: (Context::complete(int)+0x9) [0x6c6329]
 17: (void finish_contexts<Context>(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0xac) [0x709e3c]
 18: (C_ContextsBase<Context, Context>::complete(int)+0x9) [0x6c6c19]
 19: (Finisher::finisher_thread_entry()+0x168) [0xb02168]
 20: (()+0x7dc5) [0x7fb48ec93dc5]
 21: (clone()+0x6d) [0x7fb48d77576d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/20 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 keyvaluestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.199.log
--- end dump of recent events ---

Cheers,
--
Ricardo J. Barberis
Linux User No. 250625: http://counter.li.org/
LFS User No. 5121: http://www.linuxfromscratch.org/
Senior SysAdmin / IT Architect - www.DonWeb.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com