Re: Cache tier OSDs crashing due to unfound hitset object 14.2.7

It seems that one of the down PGs was able to recover just fine, but the other went into the "incomplete" state after I exported and removed the affected PG from the down OSD.

I've still got the exported data from the PG, although re-importing it into the OSD just triggers the crashes again.
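
For reference, the re-import was of the usual form (OSD id and file path here are placeholders), run with the OSD stopped:

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --op import --file /root/<pgid>.export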

What's the best way to proceed here? It would be strongly preferable not to lose the data.

Thanks,
Lincoln
________________________________
From: Lincoln Bryant <lincolnb@xxxxxxxxxxxx>
Sent: Thursday, February 27, 2020 9:26 AM
To: Sage Weil <sage@xxxxxxxxxxxx>; Paul Emmerich <paul.emmerich@xxxxxxxx>
Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject:  Re: Cache tier OSDs crashing due to unfound hitset object 14.2.7

Thanks Sage, I can try that. Admittedly I'm not sure how to tell whether these two PGs can recover without this particular OSD.

Note that there still seems to be an underlying related issue, with hit set archives popping up as unfound objects on my cluster, as in Paul's ticket. In total I had about 10-20 such objects in my cluster, most of which I was able to mark as lost without issue. I had this happen once before, a few months ago, with unfound hit set archives, and was able to safely mark the object lost and move on (or so I thought..?).
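
For the record, the marking was done per affected PG with the usual command, e.g.:

    ceph pg <pgid> mark_unfound_lost delete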

--Lincoln
________________________________
From: Sage Weil <sage@xxxxxxxxxxxx>
Sent: Thursday, February 27, 2020 9:01 AM
To: Paul Emmerich <paul.emmerich@xxxxxxxx>
Cc: Lincoln Bryant <lincolnb@xxxxxxxxxxxx>; ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re:  Re: Cache tier OSDs crashing due to unfound hitset object 14.2.7

If the PG in question can recover without that OSD, I would use
ceph-objectstore-tool to export and remove it, and then move on.
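
Something along these lines, with the OSD stopped (OSD id, PG id, and the
file path below are placeholders):

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op export --file /root/<pgid>.export
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op remove --force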

I hit a similar issue on my system (due to a bug in an early Octopus
build) and it was super tedious to fix up manually (it needed patched
code and manual modifications of the store).

sage


On Thu, 27 Feb 2020, Paul Emmerich wrote:

> The crash happens in PG::activate, so it's unrelated to I/O etc.
>
> My first approach here would be to read the code and try to understand
> why it crashes and what exact condition is being violated here.
> It looks like something that can probably be fixed by fiddling around
> with ceph-objectstore-tool (but you should try to understand what
> exactly is happening before running random ceph-objectstore-tool
> commands).
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Thu, Feb 27, 2020 at 1:15 PM Lincoln Bryant <lincolnb@xxxxxxxxxxxx> wrote:
> >
> > Thanks Paul.
> >
> > I was able to mark many of the unfound ones as lost, but I'm still stuck with one unfound object and an OSD assert at this point.
> >
> > I've tried setting many of the OSD flags to pause all cluster I/O, backfilling, rebalancing, the tiering agent, etc., to try to avoid hitting the assert, but alas this one OSD is still crashing. The OSD in question does manage to log quite a bit before crashing.
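> >
> > Roughly, the flags set were:
> >
> >     ceph osd set pause
> >     ceph osd set nobackfill
> >     ceph osd set norebalance
> >     ceph osd set norecover
> >     ceph osd set notieragent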
> >
> > Is there any way for me to delete this or create a dummy object in RADOS that will let this OSD come up, I wonder?
> >
> > --Lincoln
> >
> > OBJECT_UNFOUND 1/793053192 objects unfound (0.000%)
> >     pg 36.1755 has 1 unfound objects
> > PG_AVAILABILITY Reduced data availability: 2 pgs inactive, 2 pgs down
> >     pg 36.1153 is down+remapped, acting [299]
> >     pg 36.2047 is down+remapped, acting [242]
> >
> >     -2> 2020-02-27 06:13:12.265 7f0824f1c700  0 0x55ed866481e0 36.2047 unexpected need for 36:e2040000:.ceph-internal::hit_set_36.2047_archive_2020-02-25 19%3a32%3a07.171593_2020-02-25 21%3a27%3a36.268116:head have 1363674'2866712 flags = none tried to add 1365222'2867906(1363674'2866712) flags = delete
> >
> >
> > ________________________________
> > From: Paul Emmerich <paul.emmerich@xxxxxxxx>
> > Sent: Thursday, February 27, 2020 5:27 AM
> > To: Lincoln Bryant <lincolnb@xxxxxxxxxxxx>
> > Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
> > Subject: Re:  Cache tier OSDs crashing due to unfound hitset object 14.2.7
> >
> > I've also encountered this issue, but luckily without the crashing
> > OSDs, so marking as lost resolved it for us.
> >
> > See https://tracker.ceph.com/issues/44286
> >
> >
> > Paul
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> > www.croit.io
> > Tel: +49 89 1896585 90
> >
> > On Thu, Feb 27, 2020 at 6:02 AM Lincoln Bryant <lincolnb@xxxxxxxxxxxx> wrote:
> > >
> > > Hello Ceph experts,
> > >
> > > In the last day or so, we had a few nodes randomly reboot, and unfound objects are now being reported in Ceph health during cluster recovery.
> > >
> > > It appears that the object in question is a hit set object, which I now cannot mark lost because Ceph cannot probe the OSDs that keep crashing due to missing the hit set object.
> > >
> > > Pasted below is the crash message[1] for osd.299, and some of the unfound objects[2]. Lastly [3] shows a sample of the hit set objects that are lost.
> > >
> > > I would greatly appreciate any insight you may have on how to move forward. As of right now this cluster is inoperable due to 3 down PGs.
> > >
> > > Thanks,
> > > Lincoln Bryant
> > >
> > >
> > > [1]
> > >    -4> 2020-02-26 22:26:29.455 7ff52edaa700  0 0x559587fa91e0 36.321b unexpected need for 36:d84c0000:.ceph-internal::hit_set_36.321b_archive_2020-02-24 21%3a15%3a16.792846_2020-02-24 21%3a15%3a32.457855:head have 1352209'2834660 flags = none tried to add 1352209'2834660 flags = none
> > >     -3> 2020-02-26 22:26:29.455 7ff52edaa700  0 0x559587fa91e0 36.321b unexpected need for 36:d84c0000:.ceph-internal::hit_set_36.321b_archive_2020-02-24 21%3a15%3a16.792846_2020-02-24 21%3a15%3a32.457855:head have 1352209'2834660 flags = none tried to add 1359781'2835659 flags = delete
> > >     -2> 2020-02-26 22:26:29.456 7ff53adc2700  3 osd.299 1367392 handle_osd_map epochs [1367392,1367392], i have 1367392, src has [1349017,1367392]
> > >     -1> 2020-02-26 22:26:29.460 7ff52edaa700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.7/rpm/el7/BUILD/ceph-14.2.7/src/osd/PG.h: In function 'void PG::MissingLoc::add_active_missing(const pg_missing_t&)' thread 7ff52edaa700 time 2020-02-26 22:26:29.457170
> > > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.7/rpm/el7/BUILD/ceph-14.2.7/src/osd/PG.h: 838: FAILED ceph_assert(i->second.need == j->second.need)
> > >
> > >  ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)
> > >  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x55955fdafc0f]
> > >  2: (()+0x4dddd7) [0x55955fdafdd7]
> > >  3: (PG::MissingLoc::add_active_missing(pg_missing_set<false> const&)+0x1e0) [0x55955ffa0cb0]
> > >  4: (PG::activate(ObjectStore::Transaction&, unsigned int, std::map<int, std::map<spg_t, pg_query_t, std::less<spg_t>, std::allocator<std::pair<spg_t const, pg_query_t> > >, std::less<int>, std::allocator<std::pair<int const, std::map<spg_t, pg_query_t, std::less<spg_t>, std::allocator<std::pair<spg_t const, pg_query_t> > > > > >&, std::map<int, std::vector<std::pair<pg_notify_t, PastIntervals>, std::allocator<std::pair<pg_notify_t, PastIntervals> > >, std::less<int>, std::allocator<std::pair<int const, std::vector<std::pair<pg_notify_t, PastIntervals>, std::allocator<std::pair<pg_notify_t, PastIntervals> > > > > >*, PG::RecoveryCtx*)+0x1916) [0x55955ff3f1e6]
> > >  5: (PG::RecoveryState::Active::Active(boost::statechart::state<PG::RecoveryState::Active, PG::RecoveryState::Primary, PG::RecoveryState::Activating, (boost::statechart::history_mode)0>::my_context)+0x370) [0x55955ff62d20]
> > >  6: (boost::statechart::simple_state<PG::RecoveryState::Peering, PG::RecoveryState::Primary, PG::RecoveryState::GetInfo, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xfb) [0x55955ffa8d5b]
> > >  7: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0x97) [0x55955ff88507]
> > >  8: (PG::handle_activate_map(PG::RecoveryCtx*)+0x1a8) [0x55955ff75848]
> > >  9: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*)+0x61d) [0x55955feb161d]
> > >  10: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0xa6) [0x55955feb2d16]
> > >  11: (PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x51) [0x55956011a481]
> > >  12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x90f) [0x55955fea7bbf]
> > >  13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b6) [0x559560448976]
> > >  14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55956044b490]
> > >  15: (()+0x7e25) [0x7ff5669bae25]
> > >  16: (clone()+0x6d) [0x7ff565a9a34d]
> > >
> > >      0> 2020-02-26 22:26:29.465 7ff52edaa700 -1 *** Caught signal (Aborted) **
> > >  in thread 7ff52edaa700 thread_name:tp_osd_tp
> > >
> > >  ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)
> > >  1: (()+0xf5e0) [0x7ff5669c25e0]
> > >  2: (gsignal()+0x37) [0x7ff5659d71f7]
> > >  3: (abort()+0x148) [0x7ff5659d88e8]
> > >  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x55955fdafc5e]
> > >  5: (()+0x4dddd7) [0x55955fdafdd7]
> > >  6: (PG::MissingLoc::add_active_missing(pg_missing_set<false> const&)+0x1e0) [0x55955ffa0cb0]
> > >  7: (PG::activate(ObjectStore::Transaction&, unsigned int, std::map<int, std::map<spg_t, pg_query_t, std::less<spg_t>, std::allocator<std::pair<spg_t const, pg_query_t> > >, std::less<int>, std::allocator<std::pair<int const, std::map<spg_t, pg_query_t, std::less<spg_t>, std::allocator<std::pair<spg_t const, pg_query_t> > > > > >&, std::map<int, std::vector<std::pair<pg_notify_t, PastIntervals>, std::allocator<std::pair<pg_notify_t, PastIntervals> > >, std::less<int>, std::allocator<std::pair<int const, std::vector<std::pair<pg_notify_t, PastIntervals>, std::allocator<std::pair<pg_notify_t, PastIntervals> > > > > >*, PG::RecoveryCtx*)+0x1916) [0x55955ff3f1e6]
> > >  8: (PG::RecoveryState::Active::Active(boost::statechart::state<PG::RecoveryState::Active, PG::RecoveryState::Primary, PG::RecoveryState::Activating, (boost::statechart::history_mode)0>::my_context)+0x370) [0x55955ff62d20]
> > >  9: (boost::statechart::simple_state<PG::RecoveryState::Peering, PG::RecoveryState::Primary, PG::RecoveryState::GetInfo, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xfb) [0x55955ffa8d5b]
> > >  10: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0x97) [0x55955ff88507]
> > >  11: (PG::handle_activate_map(PG::RecoveryCtx*)+0x1a8) [0x55955ff75848]
> > >  12: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*)+0x61d) [0x55955feb161d]
> > >  13: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0xa6) [0x55955feb2d16]
> > >  14: (PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x51) [0x55956011a481]
> > >  15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x90f) [0x55955fea7bbf]
> > >  16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b6) [0x559560448976]
> > >  17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55956044b490]
> > >  18: (()+0x7e25) [0x7ff5669bae25]
> > >  19: (clone()+0x6d) [0x7ff565a9a34d]
> > >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> > >
> > > --- logging levels ---
> > >    0/ 5 none
> > >    0/ 1 lockdep
> > >    0/ 1 context
> > >    1/ 1 crush
> > >    1/ 5 mds
> > >    1/ 5 mds_balancer
> > >    1/ 5 mds_locker
> > >    1/ 5 mds_log
> > >    1/ 5 mds_log_expire
> > >    1/ 5 mds_migrator
> > >    0/ 1 buffer
> > >    0/ 1 timer
> > >    0/ 1 filer
> > >    0/ 1 striper
> > >    0/ 1 objecter
> > >    0/ 5 rados
> > >    0/ 5 rbd
> > >    0/ 5 rbd_mirror
> > >    0/ 5 rbd_replay
> > >    0/ 5 journaler
> > >    0/ 5 objectcacher
> > >    0/ 5 client
> > >    1/ 5 osd
> > >    0/ 5 optracker
> > >    0/ 5 objclass
> > >    1/ 3 filestore
> > >    1/ 3 journal
> > >    0/ 0 ms
> > >    1/ 5 mon
> > >    0/10 monc
> > >    1/ 5 paxos
> > >    0/ 5 tp
> > >    1/ 5 auth
> > >    1/ 5 crypto
> > >    1/ 1 finisher
> > >    1/ 1 reserver
> > >    1/ 5 heartbeatmap
> > >    1/ 5 perfcounter
> > >    1/ 5 rgw
> > >    1/ 5 rgw_sync
> > >    1/10 civetweb
> > >    1/ 5 javaclient
> > >    1/ 5 asok
> > >    1/ 1 throttle
> > >    0/ 0 refs
> > >    1/ 5 xio
> > >    1/ 5 compressor
> > >    1/ 5 bluestore
> > >    1/ 5 bluefs
> > >    1/ 3 bdev
> > >    1/ 5 kstore
> > >    4/ 5 rocksdb
> > >    4/ 5 leveldb
> > >    4/ 5 memdb
> > >    1/ 5 kinetic
> > >    1/ 5 fuse
> > >    1/ 5 mgr
> > >    1/ 5 mgrc
> > >    1/ 5 dpdk
> > >    1/ 5 eventtrace
> > >    1/ 5 prioritycache
> > >   -2/-2 (syslog threshold)
> > >   -1/-1 (stderr threshold)
> > >   max_recent     10000
> > >   max_new         1000
> > >   log_file /var/log/ceph/ceph-osd.299.log
> > > --- end dump of recent events ---
> > >
> > >
> > > [2]
> > > [root@ceph-mon01 ~]# ceph pg 36.321b list_unfound
> > > {
> > >     "num_missing": 1,
> > >     "num_unfound": 1,
> > >     "objects": [
> > >         {
> > >             "oid": {
> > >                 "oid": "hit_set_36.321b_archive_2020-02-24 21:15:16.792846_2020-02-24 21:15:32.457855",
> > >                 "key": "",
> > >                 "snapid": -2,
> > >                 "hash": 12827,
> > >                 "max": 0,
> > >                 "pool": 36,
> > >                 "namespace": ".ceph-internal"
> > >             },
> > >             "need": "1352209'2834660",
> > >             "have": "0'0",
> > >             "flags": "none",
> > >             "locations": []
> > >         }
> > >     ],
> > >     "more": false
> > > }
> > > [root@ceph-mon01 ~]# ceph pg 36.324a list_unfound
> > > {
> > >     "num_missing": 1,
> > >     "num_unfound": 1,
> > >     "objects": [
> > >         {
> > >             "oid": {
> > >                 "oid": "hit_set_36.324a_archive_2020-02-25 12:40:58.130723_2020-02-25 12:46:25.260587",
> > >                 "key": "",
> > >                 "snapid": -2,
> > >                 "hash": 12874,
> > >                 "max": 0,
> > >                 "pool": 36,
> > >                 "namespace": ".ceph-internal"
> > >             },
> > >             "need": "1361100'2822063",
> > >             "have": "0'0",
> > >             "flags": "none",
> > >             "locations": []
> > >         }
> > >     ],
> > >     "more": false
> > > }
> > > [root@ceph-mon01 ~]# ceph pg 36.10dc list_unfound
> > > {
> > >     "num_missing": 1,
> > >     "num_unfound": 1,
> > >     "objects": [
> > >         {
> > >             "oid": {
> > >                 "oid": "hit_set_36.10dc_archive_2020-02-25 12:40:58.129048_2020-02-25 12:45:02.202268",
> > >                 "key": "",
> > >                 "snapid": -2,
> > >                 "hash": 4316,
> > >                 "max": 0,
> > >                 "pool": 36,
> > >                 "namespace": ".ceph-internal"
> > >             },
> > >             "need": "1361089'2838543",
> > >             "have": "0'0",
> > >             "flags": "none",
> > >             "locations": []
> > >         }
> > >     ],
> > >     "more": false
> > > }
> > >
> > >
> > >
> > > _______________________________________________
> > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



