Re: Ceph Luminous - pg is down due to src/osd/SnapMapper.cc: 246: FAILED assert(r == -2)

Stefan Priebe - Profihost AG <s.priebe@xxxxxxxxxxxx> · Mon, 12 Feb 2018 12:58:54 +0100

Hi Sage,

did you had any time to look at the logs?

Stefan

Am 05.02.2018 um 14:39 schrieb Stefan Priebe - Profihost AG:
> Hi Sage,
> 
> Am 02.02.2018 um 22:05 schrieb Sage Weil:
>> On Fri, 2 Feb 2018, Stefan Priebe - Profihost AG wrote:
>>> Am 02.02.2018 um 20:28 schrieb Sage Weil:
>>>> On Fri, 2 Feb 2018, Stefan Priebe - Profihost AG wrote:
>>>>> Hi Sage,
>>>>>
>>>>> some days ago i reverted to stock luminous with
>>>>> https://github.com/liewegas/ceph/commit/e6a04446d21689ae131b0b683b4d76c7c30d8e6a
>>>>> +
>>>>> https://github.com/liewegas/ceph/commit/e17493bdffd7b9a349f377e7877fac741221edfa
>>>>> on top. I scrubbed and deep-scrubbed all pgs before and no errors were
>>>>> found.
>>>>>
>>>>> this worked fine until the ceph balancer runned and started remapping
>>>>> some pgs.
>>>>>
>>>>> This again trigged this assert:
>>>>>
>>>>> /build/ceph/src/osd/SnapMapper.cc: In function 'int
>>>>> SnapMapper::get_snaps(const hobject_t&, SnapMapper::object_snaps*)'
>>>>> thread 7f4289bff700 time 2018-02-02 17:46:33.249579
>>>>> /build/ceph/src/osd/SnapMapper.cc: 154: FAILED assert(!out->snaps.empty())
>>>>>
>>>>> Any idea how to fix it?
>>>>
>>>> Cheap workaround is to comment out that assert.  I'm still tryign to sort 
>>>
>>> Yes i did but i feel unconforable with it as the snapmapper value is
>>> still wrong.
>>>
>>>> out how this happened, though.  Not sure if you shared this already, but:
>>>> - bluestore or filestore?
>>>
>>> As i first reported it only filestore. Now also some bluestore OSDs.
>>>
>>>> - no cache tiering, right?  nowhere on the cluster?
>>> Yes and Yes. Never used cache tiering at all.
>>>
>>>> It looks like the SnapMapper entry with an empty snaps vector came in via 
>>>> add_oid() (update_oid() asserts if the new_snaps is empty; add_oid() 
>>>> doesn't have that check).  There are a few place it could come from, 
>>>> including the on_local_recover() call and the update_snap_map() call.  
>>>> That last one is interesting because it pulls the snaps for a new clone 
>>>> out of the pg log entry.. maybe those aren't getting populated properly.
>>>
>>> Is it also possible that they got corrupted in the past and now ceph is
>>> unable to detect it and repair it with a scrub or deep scrub?
>>>
>>>> Can you confirm that all of the OSDs on the cluster are running the
>>>> latest?  (ceph versions).
>>>
>>> Yes they do - i always check this after each update and i rechecked it
>>> again.
>>
>> Ok thanks!
>>
>> I pushed a branch wip-snapmapper-debug to github.com/liewegas/ceph that 
>> has those commits above plus another one that adds debug messages in the 
>> various places that the empty snap vectors might be coming from.
>>
>> This won't help it get past the rebalance issue you hit above, but 
>> hopefully it will trigger when you create a new clone so we can see how 
>> this is happening.
>>
>> If we need to, I think we can ignore the assert (by commenting it out), 
>> but one downside there is that we'll see my warnings pop up when 
>> recovering those objects so we won't be able to tell where they really 
>> came from. Since you were seeing this on new snaps/clones, though, I'm 
>> hoping we won't end up in that situation!
> 
> the biggest part of the logs is now this stuff:
> 2018-02-05 07:56:32.712366 7fb6893ff700  0 log_channel(cluster) log
> [DBG] : 3.740 scrub ok
> 2018-02-05 08:00:16.033843 7fb68a7fe700 -1 osd.0 pg_epoch: 945191
> pg[3.bbf( v 945191'63020848 (945191'63019283,945191'63020848]
> local-lis/les=944620/944621 n=1612 ec=15/15 lis/c 944620/944620 les/c/f
> 944621/944621/937357 944620/944620/944507) [51,54,0] r=2 lpr=944620
> luod=0'0 crt=945191'63020848 lcod 945191'63020847 active] _scan_snaps no
> head for 3:fdd87083:::rbd_data.67b0ee6b8b4567.00000000000103a1:b0d77
> (have MIN)
> 2018-02-05 08:00:42.123804 7fb6833ff700 -1 osd.0 pg_epoch: 945191
> pg[3.bbf( v 945191'63020901 (945191'63019383,945191'63020901]
> local-lis/les=944620/944621 n=1612 ec=15/15 lis/c 944620/944620 les/c/f
> 944621/944621/937357 944620/944620/944507) [51,54,0] r=2 lpr=944620
> luod=0'0 crt=945191'63020901 lcod 945191'63020900 active] _scan_snaps no
> head for 3:fddb41cc:::rbd_data.9772fb6b8b4567.0000000000000380:b0d9f
> (have MIN)
> 2018-02-05 08:03:49.583574 7fb687bfe700 -1 osd.0 pg_epoch: 945191
> pg[3.caa( v 945191'52090546 (945191'52088948,945191'52090546]
> local-lis/les=944694/944695 n=1709 ec=15/15 lis/c 944694/944694 les/c/f
> 944695/944697/937357 944694/944694/944653) [70,0,93] r=1 lpr=944694
> luod=0'0 crt=945191'52090546 lcod 945191'52090545 active] _scan_snaps no
> head for 3:5530360d:::rbd_data.52a6596b8b4567.0000000000000929:b0d55
> (have MIN)
> 2018-02-05 08:03:50.432394 7fb687bfe700 -1 osd.0 pg_epoch: 945191
> pg[3.caa( v 945191'52090549 (945191'52089048,945191'52090549]
> local-lis/les=944694/944695 n=1709 ec=15/15 lis/c 944694/944694 les/c/f
> 944695/944697/937357 944694/944694/944653) [70,0,93] r=1 lpr=944694
> luod=0'0 crt=945191'52090549 lcod 945191'52090548 active] _scan_snaps no
> head for 3:553243fa:::rbd_data.67b0ee6b8b4567.0000000000009956:b0d77
> (have MIN)
> 2018-02-05 08:03:52.523643 7fb687bfe700 -1 osd.0 pg_epoch: 945191
> pg[3.caa( v 945191'52090549 (945191'52089048,945191'52090549]
> local-lis/les=944694/944695 n=1709 ec=15/15 lis/c 944694/944694 les/c/f
> 944695/944697/937357 944694/944694/944653) [70,0,93] r=1 lpr=944694
> luod=0'0 crt=945191'52090549 lcod 945191'52090548 active] _scan_snaps no
> head for 3:5536f170:::rbd_data.c1a3d66b8b4567.000000000000b729:b0dc5
> (have MIN)
> 2018-02-05 08:03:55.471369 7fb687bfe700 -1 osd.0 pg_epoch: 945191
> pg[3.caa( v 945191'52090549 (945191'52089048,945191'52090549]
> local-lis/les=944694/944695 n=1709 ec=15/15 lis/c 944694/944694 les/c/f
> 944695/944697/937357 944694/944694/944653) [70,0,93] r=1 lpr=944694
> luod=0'0 crt=945191'52090549 lcod 945191'52090548 active] _scan_snaps no
> head for 3:553d2114:::rbd_data.154a9e6b8b4567.0000000000001ea1:b0e4e
> (have MIN)
> 2018-02-05 08:08:55.440715 7fb687bfe700 -1 osd.0 pg_epoch: 945191
> pg[3.7aa( v 945191'144763900 (945191'144762311,945191'144763900]
> local-lis/les=944522/944523 n=1761 ec=15/15 lis/c 944522/944522 les/c/f
> 944523/944551/937357 944522/944522/944522) [39,37,0] r=2 lpr=944522
> luod=0'0 crt=945191'144763900 lcod 945191'144763899 active] _scan_snaps
> no head for 3:55e14d0f:::rbd_data.67b0ee6b8b4567.00000000000025f2:b0d77
> (have MIN)
> 2018-02-05 08:09:09.295844 7fb68efff700 -1 osd.0 pg_epoch: 945191
> pg[3.7aa( v 945191'144763970 (945191'144762411,945191'144763970]
> local-lis/les=944522/944523 n=1761 ec=15/15 lis/c 944522/944522 les/c/f
> 944523/944551/937357 944522/944522/944522) [39,37,0] r=2 lpr=944522
> luod=0'0 crt=945191'144763970 lcod 945191'144763969 active] _scan_snaps
> no head for 3:55e2d7a9:::rbd_data.e40a416b8b4567.000000000000576f:b0df3
> (have MIN)
> 2018-02-05 08:09:26.285493 7fb68efff700 -1 osd.0 pg_epoch: 945191
> pg[3.7aa( v 945191'144764043 (945191'144762511,945191'144764043]
> local-lis/les=944522/944523 n=1762 ec=15/15 lis/c 944522/944522 les/c/f
> 944523/944551/937357 944522/944522/944522) [39,37,0] r=2 lpr=944522
> luod=0'0 crt=945191'144764043 lcod 945191'144764042 active] _scan_snaps
> no head for 3:55e4cbad:::rbd_data.163e9c96b8b4567.000000000000202d:b0d42
> (have MIN)
> 2018-02-05 08:10:36.490452 7fb687bfe700 -1 osd.0 pg_epoch: 945191
> pg[3.7aa( v 945191'144764243 (945191'144762711,945191'144764243]
> local-lis/les=944522/944523 n=1762 ec=15/15 lis/c 944522/944522 les/c/f
> 944523/944551/937357 944522/944522/944522) [39,37,0] r=2 lpr=944522
> luod=0'0 crt=945191'144764243 lcod 945191'144764242 active] _scan_snaps
> no head for 3:55ece3c3:::rbd_data.e60f906b8b4567.00000000000000de:b0dd7
> (have MIN)
> 
> Found nothing else in the looks until now. Should i search for more OSDs
> or is this already something we should consider?
> 
> Stefan
> 
>> Thanks!
>> sage
>>
>>
>>
>>  > 
>>> Stefan
>>>
>>>
>>>>>
>>>>> Greets,
>>>>> Stefan
>>>>>
>>>>> Am 22.01.2018 um 20:01 schrieb Sage Weil:
>>>>>> On Mon, 22 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>> Hi Sage,
>>>>>>>
>>>>>>> thanks! That one fixed it! Now i'm just waiting if the log ever stops
>>>>>>> telling me:
>>>>>>> 2018-01-22 19:48:44.204004 osd.45 [ERR] osd.45 found snap mapper error
>>>>>>> on pg 3.33e oid
>>>>>>> 3:7cc2e79f:::rbd_data.52ab946b8b4567.00000000000043fb:b0d7c snaps in
>>>>>>> mapper: , oi: b0d7c...repaired
>>>>>>>
>>>>>>> Currently it seems it does an endless repair and never ends.
>>>>>>
>>>>>> On many objects or the same object?
>>>>>>
>>>>>> sage
>>>>>>
>>>>>>>
>>>>>>> Greets,
>>>>>>> Stefan
>>>>>>> Am 22.01.2018 um 15:30 schrieb Sage Weil:
>>>>>>>> On Mon, 22 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>> Hi Sage,
>>>>>>>>>
>>>>>>>>> get_snaps returned the error as there are no snaps in the snapmapper. So
>>>>>>>>> it's OK to skip the assert.
>>>>>>>>>
>>>>>>>>> My biggest problem is now that i'm not able to run a scrub on all PGs to
>>>>>>>>> fix all those errors because in around 1-3% the osd seem to deadlock and
>>>>>>>>> needs to get killed to proceed.
>>>>>>>>
>>>>>>>> I believe this will fix it:
>>>>>>>>
>>>>>>>> https://github.com/liewegas/ceph/commit/wip-snapmapper
>>>>>>>>
>>>>>>>> Cherry-pick that and let me know?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> sage
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> The log output is always:
>>>>>>>>>     -3> 2018-01-22 14:02:41.725678 7f10025fe700  0 log_channel(cluster)
>>>>>>>>> log [WRN] : slow request 141.951820 seconds old, received at
>>>>>>>>>  2018-01-22 14:00:19.773706: osd_repop(client.142132381.0:1310126 3.a65
>>>>>>>>> e927921/927902 3:a65be829:::rbd_data.4fbc586b8b4567.00000000
>>>>>>>>> 00000157:head v 927921'526559757) currently commit_sent
>>>>>>>>>     -2> 2018-01-22 14:02:41.725679 7f10025fe700  0 log_channel(cluster)
>>>>>>>>> log [WRN] : slow request 141.643853 seconds old, received at
>>>>>>>>>  2018-01-22 14:00:20.081674: osd_repop(client.139148241.0:15454227 3.ff7
>>>>>>>>> e927921/927798 3:eff12952:::rbd_data.2494d06b8b4567.0000000
>>>>>>>>> 00000147d:head v 927921'29105383) currently commit_sent
>>>>>>>>>     -1> 2018-01-22 14:02:41.725684 7f10025fe700  0 log_channel(cluster)
>>>>>>>>> log [WRN] : slow request 141.313220 seconds old, received at
>>>>>>>>>  2018-01-22 14:00:20.412307: osd_repop(client.138984198.0:82792023 3.de6
>>>>>>>>> e927921/927725 3:67b9dc7d:::rbd_data.7750506b8b4567.0000000
>>>>>>>>> 000000aff:head v 927921'164552063) currently commit_sent
>>>>>>>>>      0> 2018-01-22 14:02:42.426910 7f0fc23fe700 -1 *** Caught signal
>>>>>>>>> (Aborted) **
>>>>>>>>>  in thread 7f0fc23fe700 thread_name:tp_osd_tp
>>>>>>>>>
>>>>>>>>>  ceph version 12.2.2-94-g92923ef
>>>>>>>>> (92923ef323d32d8321e86703ce1f9016f19472fb) luminous (stable)
>>>>>>>>>  1: (()+0xa4423c) [0x561d876ab23c]
>>>>>>>>>  2: (()+0xf890) [0x7f100b974890]
>>>>>>>>>  3: (pthread_cond_wait()+0xbf) [0x7f100b97104f]
>>>>>>>>>  4: (ObjectStore::apply_transactions(ObjectStore::Sequencer*,
>>>>>>>>> std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Tran
>>>>>>>>> saction> >&, Context*)+0x254) [0x561d8746ac54]
>>>>>>>>>  5: (ObjectStore::apply_transaction(ObjectStore::Sequencer*,
>>>>>>>>> ObjectStore::Transaction&&, Context*)+0x5c) [0x561d87199b2c]
>>>>>>>>>  6: (PG::_scan_snaps(ScrubMap&)+0x8f7) [0x561d871ebf77]
>>>>>>>>>  7: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool,
>>>>>>>>> unsigned int, ThreadPool::TPHandle&)+0x22f) [0x561d871ecc4f]
>>>>>>>>>  8: (PG::replica_scrub(boost::intrusive_ptr<OpRequest>,
>>>>>>>>> ThreadPool::TPHandle&)+0x624) [0x561d871ed574]
>>>>>>>>>  9: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
>>>>>>>>> ThreadPool::TPHandle&)+0x8b6) [0x561d872ac756]
>>>>>>>>>  10: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>>>>> boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f7)
>>>>>>>>> [0x561d87138bc7]
>>>>>>>>>  11: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
>>>>>>>>> const&)+0x57) [0x561d873b0db7]
>>>>>>>>>  12: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>>>>> ceph::heartbeat_handle_d*)+0x108c) [0x561d87167d1c]
>>>>>>>>>  13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x88d)
>>>>>>>>> [0x561d876f42bd]
>>>>>>>>>  14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x561d876f6280]
>>>>>>>>>  15: (()+0x8064) [0x7f100b96d064]
>>>>>>>>>  16: (clone()+0x6d) [0x7f100aa6162d]
>>>>>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>>>>>>> needed to interpret this.
>>>>>>>>>
>>>>>>>>> --- logging levels ---
>>>>>>>>>    0/ 5 none
>>>>>>>>>    0/ 0 lockdep
>>>>>>>>>    0/ 0 context
>>>>>>>>>    0/ 0 crush
>>>>>>>>>    0/ 0 mds
>>>>>>>>>    0/ 0 mds_balancer
>>>>>>>>>    0/ 0 mds_locker
>>>>>>>>>    0/ 0 mds_log
>>>>>>>>>    0/ 0 mds_log_expire
>>>>>>>>>    0/ 0 mds_migrator
>>>>>>>>>    0/ 0 buffer
>>>>>>>>>    0/ 0 timer
>>>>>>>>>    0/ 0 filer
>>>>>>>>>    0/ 1 striper
>>>>>>>>>    0/ 0 objecter
>>>>>>>>>    0/ 0 rados
>>>>>>>>>    0/ 0 rbd
>>>>>>>>>    0/ 5 rbd_mirror
>>>>>>>>>    0/ 5 rbd_replay
>>>>>>>>>    0/ 0 journaler
>>>>>>>>>    0/ 0 objectcacher
>>>>>>>>>    0/ 0 client
>>>>>>>>>    0/ 0 osd
>>>>>>>>>    0/ 0 optracker
>>>>>>>>>    0/ 0 objclass
>>>>>>>>>    0/ 0 filestore
>>>>>>>>>    0/ 0 journal
>>>>>>>>>    0/ 0 ms
>>>>>>>>>    0/ 0 mon
>>>>>>>>>    0/ 0 monc
>>>>>>>>>    0/ 0 paxos
>>>>>>>>>    0/ 0 tp
>>>>>>>>>    0/ 0 auth
>>>>>>>>>    1/ 5 crypto
>>>>>>>>>    0/ 0 finisher
>>>>>>>>>    1/ 1 reserver
>>>>>>>>>    0/ 0 heartbeatmap
>>>>>>>>>    0/ 0 perfcounter
>>>>>>>>>    0/ 0 rgw
>>>>>>>>>    1/10 civetweb
>>>>>>>>>    1/ 5 javaclient
>>>>>>>>>    0/ 0 asok
>>>>>>>>>    0/ 0 throttle
>>>>>>>>>    0/ 0 refs
>>>>>>>>>    1/ 5 xio
>>>>>>>>>    1/ 5 compressor
>>>>>>>>>    1/ 5 bluestore
>>>>>>>>>    1/ 5 bluefs
>>>>>>>>>    1/ 3 bdev
>>>>>>>>>    1/ 5 kstore
>>>>>>>>>    4/ 5 rocksdb
>>>>>>>>>    4/ 5 leveldb
>>>>>>>>>    4/ 5 memdb
>>>>>>>>>    1/ 5 kinetic
>>>>>>>>>    1/ 5 fuse
>>>>>>>>>    1/ 5 mgr
>>>>>>>>>    1/ 5 mgrc
>>>>>>>>>    1/ 5 dpdk
>>>>>>>>>    1/ 5 eventtrace
>>>>>>>>>   -2/-2 (syslog threshold)
>>>>>>>>>   -1/-1 (stderr threshold)
>>>>>>>>>   max_recent     10000
>>>>>>>>>   max_new         1000
>>>>>>>>>   log_file /var/log/ceph/ceph-osd.59.log
>>>>>>>>> --- end dump of recent events ---
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Greets,
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>>> Am 21.01.2018 um 21:27 schrieb Sage Weil:
>>>>>>>>>> On Fri, 19 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>> Hi Sage,
>>>>>>>>>>>
>>>>>>>>>>> thanks for your reply. I'll do this. What i still don't understand is
>>>>>>>>>>> the following and i hope you might have an idea:
>>>>>>>>>>> 1.) All the snaps mentioned here are fresh - i created them today
>>>>>>>>>>> running luminous? How can they be missing in snapmapper again?
>>>>>>>>>>
>>>>>>>>>> I'm not sure.  It might help if you can set debug_osd=1/10 (or 1/20) so 
>>>>>>>>>> that when you hit the crash we have some context as to what is going on.
>>>>>>>>>>
>>>>>>>>>> Did you track down which part of get_snaps() is returning the error?
>>>>>>>>>>
>>>>>>>>>>> 2.) How did this happen? The cluster was jewel and was updated to
>>>>>>>>>>> luminous when all this happened.
>>>>>>>>>>
>>>>>>>>>> Did this start right after the upgrade?
>>>>>>>>>>
>>>>>>>>>> I started a PR that relaxes some of these assertions so that they clean up 
>>>>>>>>>> instead (unless the debug option is enabled, for our regression testing), 
>>>>>>>>>> but I'm still unclear about what the inconsistency is... any loggin you 
>>>>>>>>>> can provide would help!
>>>>>>>>>>
>>>>>>>>>> Thanks-
>>>>>>>>>> sage
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Greets,
>>>>>>>>>>> Stefan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Am 19.01.2018 um 21:19 schrieb Sage Weil:
>>>>>>>>>>>> I'm guessing it is coming from
>>>>>>>>>>>>
>>>>>>>>>>>>   object_snaps out;
>>>>>>>>>>>>   int r = get_snaps(oid, &out);
>>>>>>>>>>>>   if (r < 0)
>>>>>>>>>>>>     return r;
>>>>>>>>>>>>
>>>>>>>>>>>> in SnapMapper::update_snaps().  I would add a debug statement to to 
>>>>>>>>>>>> confirm that, and possibly also add debug lines to the various return 
>>>>>>>>>>>> points in get_snaps() so you can see why get_snaps is returning an error.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm guessing the way to proceed is to warn but ignore the error, but it's 
>>>>>>>>>>>> hard to say without seeing what the inconsistency there is and whether we 
>>>>>>>>>>>> can tolerate it...
>>>>>>>>>>>>
>>>>>>>>>>>> sage
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 19 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> HI Sage,
>>>>>>>>>>>>>
>>>>>>>>>>>>> any ideas for this one sawing while doing a deep scrub:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     -1> 2018-01-19 21:14:20.474287 7fbcdeffe700 -1 osd.5 pg_epoch:
>>>>>>>>>>>>> 925542 pg[3.306( v 925542'78349255 (925334'78347749,925542'78349255]
>>>>>>>>>>>>> local-lis/les=925503/925504 n=1752 ec=15/15 lis/c 925503/925503 les/c/f
>>>>>>>>>>>>> 925504/925517/0 925503/925503/925503) [75,31,5] r=2 lpr=925503 luod=0'0
>>>>>>>>>>>>> crt=925542'78349255 lcod 925542'78349254 active] _scan_snaps no head for
>>>>>>>>>>>>> 3:60c0267d:::rbd_data.103fdc6b8b4567.0000000000004c8e:b0ce7 (have MIN)
>>>>>>>>>>>>>
>>>>>>>>>>>>>      0> 2018-01-19 21:14:20.477650 7fbcd3bfe700 -1
>>>>>>>>>>>>> /build/ceph/src/osd/PG.cc: In function 'void PG::update_snap_map(const
>>>>>>>>>>>>> std::vector<pg_log_entry_t>&, ObjectStore::Transaction&)' thread
>>>>>>>>>>>>> 7fbcd3bfe700 time 2018-01-19 21:14:20.474752
>>>>>>>>>>>>> /build/ceph/src/osd/PG.cc: 3420: FAILED assert(r == 0)
>>>>>>>>>>>>>
>>>>>>>>>>>>>  ceph version 12.2.2-94-g92923ef
>>>>>>>>>>>>> (92923ef323d32d8321e86703ce1f9016f19472fb) luminous (stable)
>>>>>>>>>>>>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>>>>>>>>> const*)+0x102) [0x55c0100d0372]
>>>>>>>>>>>>>  2: (PG::update_snap_map(std::vector<pg_log_entry_t,
>>>>>>>>>>>>> std::allocator<pg_log_entry_t> > const&,
>>>>>>>>>>>>> anObjectStore::Transaction&)+0x3a5) [0x55c00fbcfe35]
>>>>>>>>>>>>>  3: (PG::append_log(std::vector<pg_log_entry_t,
>>>>>>>>>>>>> std::allocator<pg_log_entry_t> > const&, eversion_t, eversion_t,
>>>>>>>>>>>>> ObjectStore::Transaction&, bool)+0x540) [0x55c00fbf5fc0]
>>>>>>>>>>>>>  4: (PrimaryLogPG::log_operation(std::vector<pg_log_entry_t,
>>>>>>>>>>>>> std::allocator<pg_log_entry_t> > const&,
>>>>>>>>>>>>> boost::optional<pg_hit_set_history_t> const&, eversion_t const&,
>>>>>>>>>>>>> eversion_t const&, bool, ObjectStore::Transaction&)+0x64) [0x55c00fceaeb4]
>>>>>>>>>>>>>  5: (ReplicatedBackend::do_repop(boost::intrusive_ptr<OpRequest>)+0xa92)
>>>>>>>>>>>>> [0x55c00fdf6642]
>>>>>>>>>>>>>  6:
>>>>>>>>>>>>> (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2a4)
>>>>>>>>>>>>> [0x55c00fdfc684]
>>>>>>>>>>>>>  7: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50)
>>>>>>>>>>>>> [0x55c00fd22030]
>>>>>>>>>>>>>  8: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x77b) [0x55c00fc8e33b]
>>>>>>>>>>>>>  9: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>>>>>>>>> boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f7)
>>>>>>>>>>>>> [0x55c00fb1abc7]
>>>>>>>>>>>>>  10: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
>>>>>>>>>>>>> const&)+0x57) [0x55c00fd92ad7]
>>>>>>>>>>>>>  11: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x108c) [0x55c00fb49d1c]
>>>>>>>>>>>>>  12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x88d)
>>>>>>>>>>>>> [0x55c0100d5ffd]
>>>>>>>>>>>>>  13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55c0100d7fc0]
>>>>>>>>>>>>>  14: (()+0x8064) [0x7fbd284fb064]
>>>>>>>>>>>>>  15: (clone()+0x6d) [0x7fbd275ef62d]
>>>>>>>>>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>>>>>>>>>>> needed to interpret this.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am 18.01.2018 um 15:24 schrieb Sage Weil:
>>>>>>>>>>>>>> On Thu, 18 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>>> Hi Sage,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am 18.01.2018 um 14:16 schrieb Sage Weil:
>>>>>>>>>>>>>>>> On Thu, 18 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> it also crashes in (marked with HERE):
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> int SnapMapper::get_snaps(
>>>>>>>>>>>>>>>>>   const hobject_t &oid,
>>>>>>>>>>>>>>>>>   object_snaps *out)
>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>   assert(check(oid));
>>>>>>>>>>>>>>>>>   set<string> keys;
>>>>>>>>>>>>>>>>>   map<string, bufferlist> got;
>>>>>>>>>>>>>>>>>   keys.insert(to_object_key(oid));
>>>>>>>>>>>>>>>>>   int r = backend.get_keys(keys, &got);
>>>>>>>>>>>>>>>>>   if (r < 0)
>>>>>>>>>>>>>>>>>     return r;
>>>>>>>>>>>>>>>>>   if (got.empty())
>>>>>>>>>>>>>>>>>     return -ENOENT;
>>>>>>>>>>>>>>>>>   if (out) {
>>>>>>>>>>>>>>>>>     bufferlist::iterator bp = got.begin()->second.begin();
>>>>>>>>>>>>>>>>>     ::decode(*out, bp);
>>>>>>>>>>>>>>>>>     dout(20) << __func__ << " " << oid << " " << out->snaps << dendl;
>>>>>>>>>>>>>>>>>     assert(!out->snaps.empty());            ########### HERE ###########
>>>>>>>>>>>>>>>>>   } else {
>>>>>>>>>>>>>>>>>     dout(20) << __func__ << " " << oid << " (out == NULL)" << dendl;
>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>   return 0;
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> is it save to comment that assert?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think so.  All of these asserts are just about ensuring the snapmapper 
>>>>>>>>>>>>>>>> state is consistent, and it's only purpose is to find snaps to trim.  
>>>>>>>>>>>>>>>> Since your snapmapper metadata is clearly not consistent, there isn't a 
>>>>>>>>>>>>>>>> whole lot of risk here.  You might want to set nosnaptrim for the time 
>>>>>>>>>>>>>>>> being so you don't get a surprise if trimming kicks in.  The next step is 
>>>>>>>>>>>>>>>> probably going to be to do a scrub and see what it finds, repairs, or 
>>>>>>>>>>>>>>>> can't repair.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> snap trimming works fine i already trimmed and removed some snaps.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I was able to get the cluster into a state where all is backfilled. But
>>>>>>>>>>>>>>> enabling / doing deep-scrubs results into this one:
>>>>>>>>>>>>>>>      0> 2018-01-18 13:00:54.843076 7fbd253ff700 -1
>>>>>>>>>>>>>>> /build/ceph/src/osd/SnapMapper.cc: In function 'int
>>>>>>>>>>>>>>> SnapMapper::get_snaps(const h
>>>>>>>>>>>>>>> object_t&, SnapMapper::object_snaps*)' thread 7fbd253ff700 time
>>>>>>>>>>>>>>> 2018-01-18 13:00:54.840396
>>>>>>>>>>>>>>> /build/ceph/src/osd/SnapMapper.cc: 154: FAILED assert(!out->snaps.empty())
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think if you switch that assert to a warning it will repair...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> sage
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  ceph version 12.2.2-94-g92923ef
>>>>>>>>>>>>>>> (92923ef323d32d8321e86703ce1f9016f19472fb) luminous (stable)
>>>>>>>>>>>>>>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>>>>>>>>>>> const*)+0x102) [0x5561ce28d1f2]
>>>>>>>>>>>>>>>  2: (SnapMapper::get_snaps(hobject_t const&,
>>>>>>>>>>>>>>> SnapMapper::object_snaps*)+0x46b) [0x5561cdef755b]
>>>>>>>>>>>>>>>  3: (SnapMapper::get_snaps(hobject_t const&, std::set<snapid_t,
>>>>>>>>>>>>>>> std::less<snapid_t>, std::allocator<snapid_t> >*)+0xd7) [0x5561cdef7
>>>>>>>>>>>>>>> 697]
>>>>>>>>>>>>>>>  4: (PG::_scan_snaps(ScrubMap&)+0x692) [0x5561cdd8ad22]
>>>>>>>>>>>>>>>  5: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool,
>>>>>>>>>>>>>>> unsigned int, ThreadPool::TPHandle&)+0x22f) [0x5561cdd8bc5f]
>>>>>>>>>>>>>>>  6: (PG::replica_scrub(boost::intrusive_ptr<OpRequest>,
>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x624) [0x5561cdd8c584]
>>>>>>>>>>>>>>>  7: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x8b6) [0x5561cde4b326]
>>>>>>>>>>>>>>>  8: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>>>>>>>>>>> boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f7)
>>>>>>>>>>>>>>> [0x5561cdcd7bc7]
>>>>>>>>>>>>>>>  9: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
>>>>>>>>>>>>>>> const&)+0x57) [0x5561cdf4f957]
>>>>>>>>>>>>>>>  10: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x108c) [0x5561cdd06d1c]
>>>>>>>>>>>>>>>  11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x88d)
>>>>>>>>>>>>>>> [0x5561ce292e7d]
>>>>>>>>>>>>>>>  12: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5561ce294e40]
>>>>>>>>>>>>>>>  13: (()+0x8064) [0x7fbd70d86064]
>>>>>>>>>>>>>>>  14: (clone()+0x6d) [0x7fbd6fe7a62d]
>>>>>>>>>>>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>>>>>>>>>>>>> needed to interpret this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> sage
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>> Am 17.01.2018 um 19:56 schrieb Sage Weil:
>>>>>>>>>>>>>>>>>> On Wed, 17 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Am 17.01.2018 um 19:48 schrieb Sage Weil:
>>>>>>>>>>>>>>>>>>>> On Wed, 17 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>>>>>>>>> Hi Sage,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> this gives me another crash while that pg is recovering:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>      0> 2018-01-17 19:25:09.328935 7f48f8fff700 -1
>>>>>>>>>>>>>>>>>>>>> /build/ceph/src/osd/PrimaryLogPG.cc: In function 'virtual void
>>>>>>>>>>>>>>>>>>>>> PrimaryLogPG::on_l
>>>>>>>>>>>>>>>>>>>>> ocal_recover(const hobject_t&, const ObjectRecoveryInfo&,
>>>>>>>>>>>>>>>>>>>>> ObjectContextRef, bool, ObjectStore::Transaction*)' thread 7f48f8fff700 ti
>>>>>>>>>>>>>>>>>>>>> me 2018-01-17 19:25:09.322287
>>>>>>>>>>>>>>>>>>>>> /build/ceph/src/osd/PrimaryLogPG.cc: 358: FAILED assert(p !=
>>>>>>>>>>>>>>>>>>>>> recovery_info.ss.clone_snaps.end())
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is this a cache tiering pool?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> no normal 3 replica but the pg is degraded:
>>>>>>>>>>>>>>>>>>> ceph pg dump | grep 3.80e
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 3.80e      1709                  0     1579         0       0 6183674368
>>>>>>>>>>>>>>>>>>> 10014    10014 active+undersized+degraded+remapped+backfill_wait
>>>>>>>>>>>>>>>>>>> 2018-01-17 19:42:55.840884  918403'70041375  918403:77171331 [50,54,86]
>>>>>>>>>>>>>>>>>>>        50       [39]             39  907737'69776430 2018-01-14
>>>>>>>>>>>>>>>>>>> 22:19:54.110864  907737'69776430 2018-01-14 22:19:54.110864
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hrm, no real clues on teh root cause then.  Something like this will work 
>>>>>>>>>>>>>>>>>> around the current assert:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> diff --git a/src/osd/PrimaryLogPG.cc b/src/osd/PrimaryLogPG.cc
>>>>>>>>>>>>>>>>>> index d42f3a401b..0f76134f74 100644
>>>>>>>>>>>>>>>>>> --- a/src/osd/PrimaryLogPG.cc
>>>>>>>>>>>>>>>>>> +++ b/src/osd/PrimaryLogPG.cc
>>>>>>>>>>>>>>>>>> @@ -372,8 +372,11 @@ void PrimaryLogPG::on_local_recover(
>>>>>>>>>>>>>>>>>>      set<snapid_t> snaps;
>>>>>>>>>>>>>>>>>>      dout(20) << " snapset " << recovery_info.ss << dendl;
>>>>>>>>>>>>>>>>>>      auto p = recovery_info.ss.clone_snaps.find(hoid.snap);
>>>>>>>>>>>>>>>>>> -    assert(p != recovery_info.ss.clone_snaps.end());  // hmm, should we warn?
>>>>>>>>>>>>>>>>>> -    snaps.insert(p->second.begin(), p->second.end());
>>>>>>>>>>>>>>>>>> +    if (p != recovery_info.ss.clone_snaps.end()) {
>>>>>>>>>>>>>>>>>> +      derr << __func__ << " no clone_snaps for " << hoid << dendl;
>>>>>>>>>>>>>>>>>> +    } else {
>>>>>>>>>>>>>>>>>> +      snaps.insert(p->second.begin(), p->second.end());
>>>>>>>>>>>>>>>>>> +    }
>>>>>>>>>>>>>>>>>>      dout(20) << " snaps " << snaps << dendl;
>>>>>>>>>>>>>>>>>>      snap_mapper.add_oid(
>>>>>>>>>>>>>>>>>>        recovery_info.soid,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> s
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  ceph version 12.2.2-94-g92923ef
>>>>>>>>>>>>>>>>>>>>> (92923ef323d32d8321e86703ce1f9016f19472fb) luminous (stable)
>>>>>>>>>>>>>>>>>>>>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>>>>>>>>>>>>>>>>> const*)+0x102) [0x55addb5eb1f2]
>>>>>>>>>>>>>>>>>>>>>  2: (PrimaryLogPG::on_local_recover(hobject_t const&, ObjectRecoveryInfo
>>>>>>>>>>>>>>>>>>>>> const&, std::shared_ptr<ObjectContext>, bool,
>>>>>>>>>>>>>>>>>>>>> ObjectStore::Transaction*)+0x11f0) [0x55addb1957a0]
>>>>>>>>>>>>>>>>>>>>>  3: (ReplicatedBackend::handle_push(pg_shard_t, PushOp const&,
>>>>>>>>>>>>>>>>>>>>> PushReplyOp*, ObjectStore::Transaction*)+0x31d) [0x55addb3071ed]
>>>>>>>>>>>>>>>>>>>>>  4: (ReplicatedBackend::_do_push(boost::intrusive_ptr<OpRequest>)+0x18f)
>>>>>>>>>>>>>>>>>>>>> [0x55addb30748f]
>>>>>>>>>>>>>>>>>>>>>  5:
>>>>>>>>>>>>>>>>>>>>> (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2d1)
>>>>>>>>>>>>>>>>>>>>> [0x55addb317531]
>>>>>>>>>>>>>>>>>>>>>  6: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50)
>>>>>>>>>>>>>>>>>>>>> [0x55addb23cf10]
>>>>>>>>>>>>>>>>>>>>>  7: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
>>>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x77b) [0x55addb1a91eb]
>>>>>>>>>>>>>>>>>>>>>  8: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>>>>>>>>>>>>>>>>> boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f7)
>>>>>>>>>>>>>>>>>>>>> [0x55addb035bc7]
>>>>>>>>>>>>>>>>>>>>>  9: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
>>>>>>>>>>>>>>>>>>>>> const&)+0x57) [0x55addb2ad957]
>>>>>>>>>>>>>>>>>>>>>  10: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>>>>>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x108c) [0x55addb064d1c]
>>>>>>>>>>>>>>>>>>>>>  11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x88d)
>>>>>>>>>>>>>>>>>>>>> [0x55addb5f0e7d]
>>>>>>>>>>>>>>>>>>>>>  12: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55addb5f2e40]
>>>>>>>>>>>>>>>>>>>>>  13: (()+0x8064) [0x7f4955b68064]
>>>>>>>>>>>>>>>>>>>>>  14: (clone()+0x6d) [0x7f4954c5c62d]
>>>>>>>>>>>>>>>>>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>>>>>>>>>>>>>>>>>>> needed to interpret this.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Am 17.01.2018 um 15:28 schrieb Sage Weil:
>>>>>>>>>>>>>>>>>>>>>> On Wed, 17 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> i there any chance to fix this instead of removing manually all the clones?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I believe you can avoid the immediate problem and get the PG up by 
>>>>>>>>>>>>>>>>>>>>>> commenting out the assert.  set_snaps() will overwrite the object->snap 
>>>>>>>>>>>>>>>>>>>>>> list mapping.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The problem is you'll probably still a stray snapid -> object mapping, so 
>>>>>>>>>>>>>>>>>>>>>> when snaptrimming runs you might end up with a PG in the snaptrim_error 
>>>>>>>>>>>>>>>>>>>>>> state that won't trim (although from a quick look at the code it won't 
>>>>>>>>>>>>>>>>>>>>>> crash).  I'd probably remove the assert and deal with that if/when it 
>>>>>>>>>>>>>>>>>>>>>> happens.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I'm adding a ticket to relax these asserts for production but keep them 
>>>>>>>>>>>>>>>>>>>>>> enabled for qa.  This isn't something that needs to take down the OSD!
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> sage
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>  > 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Am 16.01.2018 um 23:24 schrieb Gregory Farnum:
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jan 15, 2018 at 5:23 PM, Stefan Priebe - Profihost AG
>>>>>>>>>>>>>>>>>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> currently one of my clusters is missing a whole pg due to all 3 osds
>>>>>>>>>>>>>>>>>>>>>>>>> being down.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> All of them fail with:
>>>>>>>>>>>>>>>>>>>>>>>>>     0> 2018-01-16 02:05:33.353293 7f944dbfe700 -1
>>>>>>>>>>>>>>>>>>>>>>>>> /build/ceph/src/osd/SnapMapper.cc: In function 'void
>>>>>>>>>>>>>>>>>>>>>>>>> SnapMapper::add_oid(const hobject_t&, const std::set<snapid_t>&,
>>>>>>>>>>>>>>>>>>>>>>>>> MapCacher::Transaction<std::basic_string<char>, ceph::buffer::list>*)'
>>>>>>>>>>>>>>>>>>>>>>>>> thread 7f944dbfe700 time 2018-01-16 02:05:33.349946
>>>>>>>>>>>>>>>>>>>>>>>>> /build/ceph/src/osd/SnapMapper.cc: 246: FAILED assert(r == -2)
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>  ceph version 12.2.2-93-gd6da8d7
>>>>>>>>>>>>>>>>>>>>>>>>> (d6da8d77a4b2220e6bdd61e4bdd911a9cd91946c) luminous (stable)
>>>>>>>>>>>>>>>>>>>>>>>>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>>>>>>>>>>>>>>>>>>>>> const*)+0x102) [0x561f9ff0b1e2]
>>>>>>>>>>>>>>>>>>>>>>>>>  2: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t,
>>>>>>>>>>>>>>>>>>>>>>>>> std::less<snapid_t>, std::allocator<snapid_t> > const&,
>>>>>>>>>>>>>>>>>>>>>>>>> MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x64b)
>>>>>>>>>>>>>>>>>>>>>>>>> [0x561f9fb76f3b]
>>>>>>>>>>>>>>>>>>>>>>>>>  3: (PG::update_snap_map(std::vector<pg_log_entry_t,
>>>>>>>>>>>>>>>>>>>>>>>>> std::allocator<pg_log_entry_t> > const&,
>>>>>>>>>>>>>>>>>>>>>>>>> ObjectStore::Transaction&)+0x38f) [0x561f9fa0ae3f]
>>>>>>>>>>>>>>>>>>>>>>>>>  4: (PG::append_log(std::vector<pg_log_entry_t,
>>>>>>>>>>>>>>>>>>>>>>>>> std::allocator<pg_log_entry_t> > const&, eversion_t, eversion_t,
>>>>>>>>>>>>>>>>>>>>>>>>> ObjectStore::Transaction&, bool)+0x538) [0x561f9fa31018]
>>>>>>>>>>>>>>>>>>>>>>>>>  5: (PrimaryLogPG::log_operation(std::vector<pg_log_entry_t,
>>>>>>>>>>>>>>>>>>>>>>>>> std::allocator<pg_log_entry_t> > const&,
>>>>>>>>>>>>>>>>>>>>>>>>> boost::optional<pg_hit_set_history_t> const&, eversion_t const&,
>>>>>>>>>>>>>>>>>>>>>>>>> eversion_t const&, bool, ObjectStore::Transaction&)+0x64) [0x561f9fb25d64]
>>>>>>>>>>>>>>>>>>>>>>>>>  6: (ReplicatedBackend::do_repop(boost::intrusive_ptr<OpRequest>)+0xa92)
>>>>>>>>>>>>>>>>>>>>>>>>> [0x561f9fc314b2]
>>>>>>>>>>>>>>>>>>>>>>>>>  7:
>>>>>>>>>>>>>>>>>>>>>>>>> (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2a4)
>>>>>>>>>>>>>>>>>>>>>>>>> [0x561f9fc374f4]
>>>>>>>>>>>>>>>>>>>>>>>>>  8: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50)
>>>>>>>>>>>>>>>>>>>>>>>>> [0x561f9fb5cf10]
>>>>>>>>>>>>>>>>>>>>>>>>>  9: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
>>>>>>>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x77b) [0x561f9fac91eb]
>>>>>>>>>>>>>>>>>>>>>>>>>  10: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>>>>>>>>>>>>>>>>>>>>> boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f7)
>>>>>>>>>>>>>>>>>>>>>>>>> [0x561f9f955bc7]
>>>>>>>>>>>>>>>>>>>>>>>>>  11: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
>>>>>>>>>>>>>>>>>>>>>>>>> const&)+0x57) [0x561f9fbcd947]
>>>>>>>>>>>>>>>>>>>>>>>>>  12: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>>>>>>>>>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x108c) [0x561f9f984d1c]
>>>>>>>>>>>>>>>>>>>>>>>>>  13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x88d)
>>>>>>>>>>>>>>>>>>>>>>>>> [0x561f9ff10e6d]
>>>>>>>>>>>>>>>>>>>>>>>>>  14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x561f9ff12e30]
>>>>>>>>>>>>>>>>>>>>>>>>>  15: (()+0x8064) [0x7f949afcb064]
>>>>>>>>>>>>>>>>>>>>>>>>>  16: (clone()+0x6d) [0x7f949a0bf62d]
>>>>>>>>>>>>>>>>>>>>>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>>>>>>>>>>>>>>>>>>>>>>> needed to interpret this.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> By the time it gets there, something else has gone wrong. The OSD is
>>>>>>>>>>>>>>>>>>>>>>>> adding a snapid/object pair to its "SnapMapper", and discovering that
>>>>>>>>>>>>>>>>>>>>>>>> there are already entries (which it thinks there shouldn't be).
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> You'll need to post more of a log, along with background, if anybody's
>>>>>>>>>>>>>>>>>>>>>>>> going to diagnose it: is there cache tiering on the cluster? What is
>>>>>>>>>>>>>>>>>>>>>>>> this pool used for? Were there other errors on this PG in the past?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I also notice a separate email about deleting the data; I don't have
>>>>>>>>>>>>>>>>>>>>>>>> any experience with this but you'd probably have to export the PG
>>>>>>>>>>>>>>>>>>>>>>>> using ceph-objectstore-tool and then find a way to delete the object
>>>>>>>>>>>>>>>>>>>>>>>> out of it. I see options to remove both an object and
>>>>>>>>>>>>>>>>>>>>>>>> "remove-clone-metadata" on a particular ID, but I've not used any of
>>>>>>>>>>>>>>>>>>>>>>>> them myself.
>>>>>>>>>>>>>>>>>>>>>>>> -Greg
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>>>>>>>>>>>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>>>>>>>>>>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>>>>>>>>>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>>>>>>>>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>>>>>>>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>>>>>>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>>>>>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>>>>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>>>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html