Ceph Jewel crash on PG snaps

Hi,

Yes, I am aware that Jewel is EOL :-)

On a Jewel cluster I'm seeing OSDs crash shortly after they start, with a
backtrace similar to the one in this issue: http://tracker.ceph.com/issues/15017

This cluster is running Jewel 10.2.11 and I'm seeing exactly the same
crash happening on Placement Groups which belong to cache tiering pools
(57 and 65).

Looking at it with GDB it crashes in osd/ReplicatedPG.cc on this line:

 last_clone_oid.snap = ctx->new_snapset.clone_overlap.rbegin()->first;

I am not very familiar with the snapshotting and PG mechanisms, and before
attempting an upgrade to Luminous I would rather debug this first. My rough,
unverified reading of that line is sketched below.
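
If clone_overlap in the new snapset is empty at that point, rbegin() equals
rend() and dereferencing it is undefined behaviour, which would explain a
segfault in make_writeable(). A minimal standalone illustration of that
failure mode (a plain std::map standing in for the clone_overlap member,
nothing Ceph-specific):

 // suspected_failure_mode.cc -- standalone sketch, not Ceph code.
 // clone_overlap maps clone snap ids to overlap extents; simplified here
 // to std::map<uint64_t, int> just to show the iterator issue.
 #include <cstdint>
 #include <iostream>
 #include <map>

 int main() {
     std::map<uint64_t, int> clone_overlap;  // empty, as it may be for this object

     if (clone_overlap.empty()) {
         // Guard the crashing line lacks: on an empty map rbegin() == rend(),
         // and dereferencing it is undefined behaviour (in practice, often a segfault).
         std::cout << "clone_overlap is empty; dereferencing rbegin() would be UB\n";
         return 0;
     }

     // Mirrors the crashing statement:
     //   last_clone_oid.snap = ctx->new_snapset.clone_overlap.rbegin()->first;
     uint64_t last_clone_snap = clone_overlap.rbegin()->first;
     std::cout << "last clone snap id: " << last_clone_snap << "\n";
     return 0;
 }

Again, that is just my reading of the symptom, not a verified root cause.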

    -3> 2019-01-09 16:37:13.002666 7f1dcb7b3700 10 osd.6 pg_epoch:
612210 pg[65.243( v 612202'563066013 lc 611312'563065968
(609463'563062792,612202'563066013] local-les=612210 n=11511 ec=149533
les/c/f 612210/611807/0 612208/612209/612209) [6,786]/[6,928] r=0
lpr=612209 pi=609825-612208/51 bft=786 crt=612202'563066013 lcod 0'0
mlcod 0'0 active+recovery_wait+undersized+degraded+remapped NIBBLEWISE
m=17] do_osd_op  delete
    -2> 2019-01-09 16:37:13.002684 7f1dcb7b3700 20 osd.6 pg_epoch:
612210 pg[65.243( v 612202'563066013 lc 611312'563065968
(609463'563062792,612202'563066013] local-les=612210 n=11511 ec=149533
les/c/f 612210/611807/0 612208/612209/612209) [6,786]/[6,928] r=0
lpr=612209 pi=609825-612208/51 bft=786 crt=612202'563066013 lcod 0'0
mlcod 0'0 active+recovery_wait+undersized+degraded+remapped NIBBLEWISE
m=17] _delete_oid setting whiteout on
65:c2607a60:::rbd_data.e53c3c27c0089c.000000000000009e:head
    -1> 2019-01-09 16:37:13.002705 7f1dcb7b3700 20 osd.6 pg_epoch:
612210 pg[65.243( v 612202'563066013 lc 611312'563065968
(609463'563062792,612202'563066013] local-les=612210 n=11511 ec=149533
les/c/f 612210/611807/0 612208/612209/612209) [6,786]/[6,928] r=0
lpr=612209 pi=609825-612208/51 bft=786 crt=612202'563066013 lcod 0'0
mlcod 0'0 active+recovery_wait+undersized+degraded+remapped NIBBLEWISE
m=17] make_writeable
65:c2607a60:::rbd_data.e53c3c27c0089c.000000000000009e:head
snapset=0x7f1e06349bb8  snapc=a4=[]
     0> 2019-01-09 16:37:13.005749 7f1dcb7b3700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7f1dcb7b3700 thread_name:tp_osd_tp

 ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)
 1: (()+0x9f1c2a) [0x7f1deed72c2a]
 2: (()+0xf100) [0x7f1decab5100]
 3: (()+0x7537a) [0x7f1deb99137a]
 4: (ReplicatedPG::make_writeable(ReplicatedPG::OpContext*)+0x138)
[0x7f1dee90ec28]
 5: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x56b)
[0x7f1dee9106db]
 6: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x920)
[0x7f1dee911110]
 7: (ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)+0x2843)
[0x7f1dee915083]
 8: (ReplicatedPG::do_request(std::shared_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x747) [0x7f1dee8d0b67]
 9: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x41d) [0x7f1dee780cdd]
 10: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest>&)+0x6d)
[0x7f1dee780f2d]
 11: (OSD::ShardedOpWQ::_process(unsigned int,
ceph::heartbeat_handle_d*)+0x869) [0x7f1dee784a09]
 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x887)
[0x7f1deee61b07]
 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f1deee63a70]
 14: (()+0x7dc5) [0x7f1decaaddc5]
 15: (clone()+0x6d) [0x7f1deb138ced]

Has anybody seen this before? Right now there are multiple PGs in the
cache tier which are down, and I'm trying to get them back online, so any
hints on how to debug or fix this would be very welcome.

Thanks,

Wido


