Re: cephfs metadata pool: deep-scrub error "omap_digest != best guess omap_digest"

On Wed, Aug 31, 2016 at 9:56 AM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>
>
> On Tue, Aug 30, 2016 at 9:18 PM, Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
>> Just a small typo correction to my previous email. Without it the meaning was completely different:
>>
>> "At this point I just need a way to recover the pg safely and I do NOT see how I can do that since it is impossible to understand what is the problematic osd with the incoherent object."

As you can see from the ReplicatedBackend::be_deep_scrub function, there is a
lot of information that can be captured if we run with debug logging at "debug
osd = 30".

https://github.com/badone/ceph/blob/master/src/osd/ReplicatedBackend.cc#L738

Around this line you can see it get the header.

https://github.com/badone/ceph/blob/master/src/osd/ReplicatedBackend.cc#L778

Then the code walks through the keys.

https://github.com/badone/ceph/blob/master/src/osd/ReplicatedBackend.cc#L809

The header and the keys are streamed into our bufferhash object "oh".

https://github.com/badone/ceph/blob/master/src/osd/ReplicatedBackend.cc#L823

Then the digest (crc) is assigned to our omap_digest.

https://github.com/badone/ceph/blob/master/src/osd/ReplicatedBackend.cc#L834-L835

You can see from the dout(25) calls that debugging has to be set high for us to
capture this output, which should help a lot.
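
For example, something like the following should capture it (commands are from
memory, so double check them, and keep an eye on disk space on the log
partition while the debug levels are up):

# for id in 49 59 78; do ceph tell osd.$id injectargs \
      '--debug_osd 30 --debug_filestore 20 --debug_ms 1'; done
# ceph pg deep-scrub 5.3d0

Wait for the deep-scrub to finish (e.g. watch "ceph -w"), then:

# grep 602.00000000 /var/log/ceph/ceph-osd.78.log

and set the debug levels back to your usual values with another injectargs
call on each OSD.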

-- 
Cheers,
Brad

>
> Hi Goncalo,
>
> A couple of things.
>
> In my last post I got the code wrong: I posted the code for the data digest,
> not the omap digest, though the two checks are essentially the same. Here's
> the code I meant to post.
>
> 438   if (auth.omap_digest_present && candidate.omap_digest_present) {
> 439     if (auth.omap_digest != candidate.omap_digest) {
> 440       if (error != CLEAN)
> 441         errorstream << ", ";
> 442       error = DEEP_ERROR;
> 443       bool known = auth_oi.is_omap_digest() &&
> 444         auth.omap_digest == auth_oi.omap_digest;
> 445       errorstream << "omap_digest 0x" << std::hex << candidate.omap_digest
> 446                   << " != "
> 447                   << (known ? "known" : "best guess")
> 448                   << " omap_digest 0x" << auth.omap_digest << std::dec
> 449                   << " from auth shard " << auth_shard;
> 450       result.set_omap_digest_mismatch();
> 451     }
> 452   }
>
> Here is the message from the commit that introduced this code, which gives a
> little more insight.
>
> osd: be precise about "known" vs "best guess"
>
> We cannot assume that the auth info has the "known" digest; all replicas
> may have a matching digest that does not match the oi, or we may choose
> a different auth info for some other reason.  Verify that the digest
> matches the oi before calling it "known".
>
> So we can only say that a digest is "known" if it matches the digest stored in
> the object_info_t, otherwise it is a "best guess".
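>
> If your jewel build has it, "rados list-inconsistent-obj" should also show
> the per-shard omap digests and errors recorded by the last scrub, which may
> make it easier to spot the odd one out than digging through logs (I'm going
> from memory here, so check the exact output on your version):
>
> # rados list-inconsistent-obj 5.3d0 --format=json-pretty
>
> Each shard entry should carry its own omap_digest, and the selected object
> info shows whether the authoritative copy has a stored omap digest at all,
> which is the "known" vs "best guess" distinction above.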
>
> Looking at the pg query that you posted, there seem to be some clues: for
> both peers 49 and 59 it gives the following statistics.
>
> "num_objects_degraded": 2
> "last_undegraded": "2016-08-25 06:41:22.446581"
>
> So it appears to indicate that the two replicas are considered divergent from
> the primary. Note the last_undegraded date is the same day as "head" was created
> on the primary. So it looks like the difference in the digest may have happened
> at the time this OSD became primary and was populated with this pg's data. This
> *might* (speculation) happen if the primary were running a different version
> from the replicas when it was introduced into the cluster (inspection of the
> logs looking for SHA1 version signatures from around this period would confirm
> or refute this. Look for the "ceph version" line logged at startup or run "ceph
> daemon /path/to/osd.49.asok version" for each OSD if they have not been
> restarted). Another possibility is this is a bug in the way we populate the omap
> data when a new primary comes online but I would consider this less likely
> (although certainly not impossible). Another possibility is a difference in the
> way the CRC for the omap digest is being calculated on the peers.
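>
> Concretely, something like this should do it (the asok path below is the
> default one; adjust if yours differs):
>
> # for id in 49 59 78; do ceph tell osd.$id version; done
>
> or, on each host:
>
> # ceph daemon /var/run/ceph/ceph-osd.49.asok version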
>
>> ________________________________________
>> From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Goncalo Borges [goncalo.borges@xxxxxxxxxxxxx]
>> Sent: 30 August 2016 18:53
>> To: Brad Hubbard
>> Cc: ceph-users@xxxxxxxx
>> Subject: Re:  cephfs metadata pool: deep-scrub error "omap_digest != best guess omap_digest"
>>
>>
>> I can run a deep scrub with the log levels mentioned if it is safe to do in an inconsistent pg. I have read somewhere that it shouldn't be done, but I do not remember where or how precise that info is. Is it safe to do so?
>
> AFAIK the only adverse effects are that it could affect performance and
> possibly fill up the disk where your logs are stored, so I would suggest
> doing it during a quiet period and monitoring disk usage closely. It should
> only need to be enabled for a relatively short time and should not require a
> restart, so I would think it fairly safe, although I would like to read what
> you read to understand better what you are referring to.
>
>> At this point I just need a way to recover the pg safely and I do see how I can do that since it is impossible to understand what is the problematic osd with the incoherent object.
>>
>> I also think I am not the only one seeing it. I participated in a discussion a while back about exactly the same issue experienced by someone else on jewel
>
> I went back to that thread and it appears that was also a metadata pool. So,
> whatever this is, it seems to require a metadata pool to happen, which is odd
> but may be related to the volume of omap data, or the way in which those
> pools use the omap functionality.
>
> I think at this stage we really need to get a tracker opened for this and start
> looking at debug logging and/or ceph-objectstore-tool output. It would also be
> interesting to get additional information from others affected by this issue and
> a tracker would be the best place to gather that.
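>
> For the ceph-objectstore-tool side, something along these lines should dump
> the omap keys of the suspect object from one replica (the OSD has to be
> stopped first so the tool can open the store; paths and commands here are a
> sketch from memory, so check "ceph-objectstore-tool --help" on your
> version):
>
> # systemctl stop ceph-osd@49
> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
>       --journal-path /var/lib/ceph/osd/ceph-49/journal \
>       --pgid 5.3d0 '602.00000000' list-omap
> # systemctl start ceph-osd@49
>
> Repeating that on 59 and 78 and diffing the key lists (and the omap header,
> via get-omaphdr) should show which copy diverges.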
>
> --
> Cheers,
> Brad
>
>>
>> Cheers
>> Goncalo
>>
>>
>> ________________________________________
>> From: Brad Hubbard [bhubbard@xxxxxxxxxx]
>> Sent: 30 August 2016 17:13
>> To: Goncalo Borges
>> Cc: ceph-users@xxxxxxxx
>> Subject: Re:  cephfs metadata pool: deep-scrub error "omap_digest != best guess omap_digest"
>>
>> On Tue, Aug 30, 2016 at 3:50 PM, Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
>>> Dear Ceph / CephFS supports.
>>>
>>> We are currently running Jewel 10.2.2.
>>>
>>> From time to time we experience deep-scrub errors in pgs inside our cephfs
>>> metadata pool. It is important to note that we do not see any hardware
>>> errors on the osds themselves so the error must have some other source.
>>
>> Can you verify that all your nodes are the same architecture and all running the
>> same ceph version? This sort of thing has been reported before when running
>> mismatched versions and/or architectures.
>>
>> http://tracker.ceph.com/issues/4743 looks similar but is very old and likely
>> not relevant. In that case it was recommended to gather logs with "debug
>> filestore = 20", "debug osd = 30" and "debug ms = 1" from all three replicas
>> of such a PG while running a deep scrub on it, so gathering those may be a
>> good idea, as is opening a new tracker for this.
>>
>>>
>>> The error itself is the following:
>>>
>>> # cat /var/log/ceph/ceph.log| grep 5.3d0
>>> 2016-08-30 00:30:53.492626 osd.78 192.231.127.171:6828/6072 331 : cluster
>>> [INF] 5.3d0 deep-scrub starts
>>> 2016-08-30 00:30:54.276134 osd.78 192.231.127.171:6828/6072 332 : cluster
>>> [ERR] 5.3d0 shard 78: soid 5:0bd6d154:::602.00000000:head omap_digest
>>> 0xf3fdfd0c != best guess omap_digest 0x23b2eae0 from auth shard 49
>>> 2016-08-30 00:30:54.747795 osd.78 192.231.127.171:6828/6072 333 : cluster
>>> [ERR] 5.3d0 deep-scrub 0 missing, 1 inconsistent objects
>>> 2016-08-30 00:30:54.747801 osd.78 192.231.127.171:6828/6072 334 : cluster
>>> [ERR] 5.3d0 deep-scrub 1 errors
>>
>> AFAIU the omap_digest is a CRC32 calculated over the omap header and
>> key/values. These values are stored in the OSD's leveldb, not in the data
>> directories, but I think Greg already mentioned that last time?
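>>
>> You can at least eyeball the omap contents as served by the primary with
>> the rados tool, e.g. (assuming your metadata pool is named "metadata"):
>>
>> # rados -p metadata listomapkeys 602.00000000
>> # rados -p metadata listomapvals 602.00000000
>>
>> Note this reads via the primary only, so it won't by itself tell you which
>> replica diverges.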
>>
>>>
>>> For us there are a few unknowns on how to recover from this error:
>>>
>>> 1) The first issue is that we really do not understand the nature of the
>>> error. What does "omap_digest != best guess omap_digest" mean? It seems
>>> to point to some problem in the digest of the omap contents between two
>>> osds, but does not tell you exactly what.
>>
>> src/osd/PGBackend.cc:
>>
>> 423   if (auth.digest_present && candidate.digest_present) {
>> 424     if (auth.digest != candidate.digest) {
>> 425       if (error != CLEAN)
>> 426         errorstream << ", ";
>> 427       error = DEEP_ERROR;
>> 428       bool known = auth_oi.is_data_digest() &&
>> 429         auth.digest == auth_oi.data_digest;
>> 430       errorstream << "data_digest 0x" << std::hex << candidate.digest
>> 431                   << " != "
>> 432                   << (known ? "known" : "best guess")
>> 433                   << " data_digest 0x" << auth.digest << std::dec
>> 434                   << " from auth shard " << auth_shard;
>> 435       result.set_data_digest_mismatch();
>> 436     }
>> 437   }
>>
>> On line 428 "known" is either set or it isn't. With operator precedence made
>> explicit, it actually looks like this.
>>
>> bool known = (((auth_oi.is_omap_digest)()) && ((auth.omap_digest) == (auth_oi.omap_digest)));
>>
>> auth_oi is an object_info_t, so it looks like we are comparing the digest to
>> a copy we already have stored; if they match the result is considered
>> "known", otherwise it's considered a "best guess". Hopefully someone can
>> elaborate on what this means.
>>
>>>
>>> 2) The second issue is that it is really difficult to try to explore
>>> metadata objects and omap info. While in the data pool we do know how to
>>> inspect pgs and object contents (and decide what is the problematic osd by
>>> comparison in a 3 replica setup), in the metadata pool we have to access pg
>>> contents using 'ceph-objectstore-tool'. For that, we have to stop the osd so
>>> that the daemon releases the omap lock. Moreover, I have successfully
>>> imported / exported / listed pgs contents but I was never able to query omap
>>> contents of objects inside pgs. Maybe I am doing it wrong but I do not find
>>> the tool helpful at the moment for this precise task.
>>>
>>> 3)  Finally, I am unsure what is the consequence of running 'pg repair'. In
>>> my specific case, the primary osd is on a host which was recently added to
>>> production. Moreover, the dates of the problematic object match for the
>>> secondary peers, which gives me the feeling that the primary osd (78) might
>>> be the problematic one. I know that in the past, the default behavior was
>>> simply to copy the pg contents of the primary osd to the others. That can
>>> lead to data corruption if the problematic osd is indeed the primary, and I
>>> wonder if in Jewel there is some smarter way to do the pg repair.
>>>
>>> [root@server9 ~]# ll
>>> /var/lib/ceph/osd/ceph-78/current/5.3d0_head/602.00000000__head_2A8B6BD0__5
>>> -rw-r--r-- 1 ceph ceph 0 Aug 25 21:41
>>> /var/lib/ceph/osd/ceph-78/current/5.3d0_head/602.00000000__head_2A8B6BD0__5
>>>
>>>
>>> [root@server7 ~]# ll
>>> /var/lib/ceph/osd/ceph-49/current/5.3d0_head/602.00000000__head_2A8B6BD0__5
>>> -rw-r--r-- 1 ceph ceph 0 Jul 27 02:30
>>> /var/lib/ceph/osd/ceph-49/current/5.3d0_head/602.00000000__head_2A8B6BD0__5
>>>
>>>
>>> [root@server8 ~]# ll
>>> /var/lib/ceph/osd/ceph-59/current/5.3d0_head/602.00000000__head_2A8B6BD0__5
>>> -rw-r--r-- 1 ceph ceph 0 Jul 27 02:30
>>> /var/lib/ceph/osd/ceph-59/current/5.3d0_head/602.00000000__head_2A8B6BD0__5
>>>
>>>
>>> A pg query does not seem to give me any information about the problem. The
>>> query results follows this email.
>>>
>>> Any help is appreciated.
>>>
>>> Cheers
>>> Goncalo
>>>
>>> --- * ---
>>>
>>>
>>>
>>> # ceph pg 5.3d0 query
>>> {
>>>     "state": "active+clean+inconsistent",
>>>     "snap_trimq": "[]",
>>>     "epoch": 23099,
>>>     "up": [
>>>         78,
>>>         59,
>>>         49
>>>     ],
>>>     "acting": [
>>>         78,
>>>         59,
>>>         49
>>>     ],
>>>     "actingbackfill": [
>>>         "49",
>>>         "59",
>>>         "78"
>>>     ],
>>>     "info": {
>>>         "pgid": "5.3d0",
>>>         "last_update": "23099'104726",
>>>         "last_complete": "23099'104726",
>>>         "log_tail": "23099'101639",
>>>         "last_user_version": 104726,
>>>         "last_backfill": "MAX",
>>>         "last_backfill_bitwise": 1,
>>>         "purged_snaps": "[]",
>>>         "history": {
>>>             "epoch_created": 339,
>>>             "last_epoch_started": 22440,
>>>             "last_epoch_clean": 22440,
>>>             "last_epoch_split": 0,
>>>             "last_epoch_marked_full": 0,
>>>             "same_up_since": 19928,
>>>             "same_interval_since": 22439,
>>>             "same_primary_since": 22439,
>>>             "last_scrub": "23099'104724",
>>>             "last_scrub_stamp": "2016-08-30 00:30:54.747833",
>>>             "last_deep_scrub": "23099'104724",
>>>             "last_deep_scrub_stamp": "2016-08-30 00:30:54.747833",
>>>             "last_clean_scrub_stamp": "2016-08-29 00:30:33.716646"
>>>         },
>>>         "stats": {
>>>             "version": "23099'104726",
>>>             "reported_seq": "84233",
>>>             "reported_epoch": "23099",
>>>             "state": "active+clean+inconsistent",
>>>             "last_fresh": "2016-08-30 02:40:35.963747",
>>>             "last_change": "2016-08-30 00:30:54.747882",
>>>             "last_active": "2016-08-30 02:40:35.963747",
>>>             "last_peered": "2016-08-30 02:40:35.963747",
>>>             "last_clean": "2016-08-30 02:40:35.963747",
>>>             "last_became_active": "2016-08-25 21:41:30.649369",
>>>             "last_became_peered": "2016-08-25 21:41:30.649369",
>>>             "last_unstale": "2016-08-30 02:40:35.963747",
>>>             "last_undegraded": "2016-08-30 02:40:35.963747",
>>>             "last_fullsized": "2016-08-30 02:40:35.963747",
>>>             "mapping_epoch": 19928,
>>>             "log_start": "23099'101639",
>>>             "ondisk_log_start": "23099'101639",
>>>             "created": 339,
>>>             "last_epoch_clean": 22440,
>>>             "parent": "0.0",
>>>             "parent_split_bits": 0,
>>>             "last_scrub": "23099'104724",
>>>             "last_scrub_stamp": "2016-08-30 00:30:54.747833",
>>>             "last_deep_scrub": "23099'104724",
>>>             "last_deep_scrub_stamp": "2016-08-30 00:30:54.747833",
>>>             "last_clean_scrub_stamp": "2016-08-29 00:30:33.716646",
>>>             "log_size": 3087,
>>>             "ondisk_log_size": 3087,
>>>             "stats_invalid": false,
>>>             "dirty_stats_invalid": false,
>>>             "omap_stats_invalid": false,
>>>             "hitset_stats_invalid": false,
>>>             "hitset_bytes_stats_invalid": false,
>>>             "pin_stats_invalid": true,
>>>             "stat_sum": {
>>>                 "num_bytes": 0,
>>>                 "num_objects": 257,
>>>                 "num_object_clones": 0,
>>>                 "num_object_copies": 771,
>>>                 "num_objects_missing_on_primary": 0,
>>>                 "num_objects_missing": 0,
>>>                 "num_objects_degraded": 0,
>>>                 "num_objects_misplaced": 0,
>>>                 "num_objects_unfound": 0,
>>>                 "num_objects_dirty": 257,
>>>                 "num_whiteouts": 0,
>>>                 "num_read": 21865,
>>>                 "num_read_kb": 378449,
>>>                 "num_write": 106287,
>>>                 "num_write_kb": 402800,
>>>                 "num_scrub_errors": 1,
>>>                 "num_shallow_scrub_errors": 0,
>>>                 "num_deep_scrub_errors": 1,
>>>                 "num_objects_recovered": 2006,
>>>                 "num_bytes_recovered": 0,
>>>                 "num_keys_recovered": 124614,
>>>                 "num_objects_omap": 257,
>>>                 "num_objects_hit_set_archive": 0,
>>>                 "num_bytes_hit_set_archive": 0,
>>>                 "num_flush": 0,
>>>                 "num_flush_kb": 0,
>>>                 "num_evict": 0,
>>>                 "num_evict_kb": 0,
>>>                 "num_promote": 0,
>>>                 "num_flush_mode_high": 0,
>>>                 "num_flush_mode_low": 0,
>>>                 "num_evict_mode_some": 0,
>>>                 "num_evict_mode_full": 0,
>>>                 "num_objects_pinned": 0
>>>             },
>>>             "up": [
>>>                 78,
>>>                 59,
>>>                 49
>>>             ],
>>>             "acting": [
>>>                 78,
>>>                 59,
>>>                 49
>>>             ],
>>>             "blocked_by": [],
>>>             "up_primary": 78,
>>>             "acting_primary": 78
>>>         },
>>>         "empty": 0,
>>>         "dne": 0,
>>>         "incomplete": 0,
>>>         "last_epoch_started": 22440,
>>>         "hit_set_history": {
>>>             "current_last_update": "0'0",
>>>             "history": []
>>>         }
>>>     },
>>>     "peer_info": [
>>>         {
>>>             "peer": "49",
>>>             "pgid": "5.3d0",
>>>             "last_update": "23099'104726",
>>>             "last_complete": "23099'104726",
>>>             "log_tail": "1963'93313",
>>>             "last_user_version": 96444,
>>>             "last_backfill": "MAX",
>>>             "last_backfill_bitwise": 1,
>>>             "purged_snaps": "[]",
>>>             "history": {
>>>                 "epoch_created": 339,
>>>                 "last_epoch_started": 22440,
>>>                 "last_epoch_clean": 22440,
>>>                 "last_epoch_split": 0,
>>>                 "last_epoch_marked_full": 0,
>>>                 "same_up_since": 19928,
>>>                 "same_interval_since": 22439,
>>>                 "same_primary_since": 22439,
>>>                 "last_scrub": "23099'104724",
>>>                 "last_scrub_stamp": "2016-08-30 00:30:54.747833",
>>>                 "last_deep_scrub": "23099'104724",
>>>                 "last_deep_scrub_stamp": "2016-08-30 00:30:54.747833",
>>>                 "last_clean_scrub_stamp": "2016-08-29 00:30:33.716646"
>>>             },
>>>             "stats": {
>>>                 "version": "20737'96443",
>>>                 "reported_seq": "77807",
>>>                 "reported_epoch": "22439",
>>>                 "state": "active+remapped+wait_backfill",
>>>                 "last_fresh": "2016-08-25 06:41:22.446581",
>>>                 "last_change": "2016-08-24 23:56:20.865302",
>>>                 "last_active": "2016-08-25 06:41:22.446581",
>>>                 "last_peered": "2016-08-25 06:41:22.446581",
>>>                 "last_clean": "2016-08-24 13:42:06.161197",
>>>                 "last_became_active": "2016-08-24 23:56:19.815284",
>>>                 "last_became_peered": "2016-08-24 23:56:19.815284",
>>>                 "last_unstale": "2016-08-25 06:41:22.446581",
>>>                 "last_undegraded": "2016-08-25 06:41:22.446581",
>>>                 "last_fullsized": "2016-08-25 06:41:22.446581",
>>>                 "mapping_epoch": 19928,
>>>                 "log_start": "1963'93313",
>>>                 "ondisk_log_start": "1963'93313",
>>>                 "created": 339,
>>>                 "last_epoch_clean": 17445,
>>>                 "parent": "0.0",
>>>                 "parent_split_bits": 0,
>>>                 "last_scrub": "19699'96439",
>>>                 "last_scrub_stamp": "2016-08-24 22:59:27.749260",
>>>                 "last_deep_scrub": "16645'96391",
>>>                 "last_deep_scrub_stamp": "2016-08-22 20:21:59.567449",
>>>                 "last_clean_scrub_stamp": "2016-08-24 22:59:27.749260",
>>>                 "log_size": 3130,
>>>                 "ondisk_log_size": 3130,
>>>                 "stats_invalid": false,
>>>                 "dirty_stats_invalid": false,
>>>                 "omap_stats_invalid": false,
>>>                 "hitset_stats_invalid": false,
>>>                 "hitset_bytes_stats_invalid": false,
>>>                 "pin_stats_invalid": true,
>>>                 "stat_sum": {
>>>                     "num_bytes": 0,
>>>                     "num_objects": 252,
>>>                     "num_object_clones": 0,
>>>                     "num_object_copies": 1008,
>>>                     "num_objects_missing_on_primary": 0,
>>>                     "num_objects_missing": 0,
>>>                     "num_objects_degraded": 2,
>>>                     "num_objects_misplaced": 504,
>>>                     "num_objects_unfound": 0,
>>>                     "num_objects_dirty": 252,
>>>                     "num_whiteouts": 0,
>>>                     "num_read": 21538,
>>>                     "num_read_kb": 323200,
>>>                     "num_write": 97965,
>>>                     "num_write_kb": 354745,
>>>                     "num_scrub_errors": 0,
>>>                     "num_shallow_scrub_errors": 0,
>>>                     "num_deep_scrub_errors": 0,
>>>                     "num_objects_recovered": 2006,
>>>                     "num_bytes_recovered": 0,
>>>                     "num_keys_recovered": 124614,
>>>                     "num_objects_omap": 252,
>>>                     "num_objects_hit_set_archive": 0,
>>>                     "num_bytes_hit_set_archive": 0,
>>>                     "num_flush": 0,
>>>                     "num_flush_kb": 0,
>>>                     "num_evict": 0,
>>>                     "num_evict_kb": 0,
>>>                     "num_promote": 0,
>>>                     "num_flush_mode_high": 0,
>>>                     "num_flush_mode_low": 0,
>>>                     "num_evict_mode_some": 0,
>>>                     "num_evict_mode_full": 0,
>>>                     "num_objects_pinned": 0
>>>                 },
>>>                 "up": [
>>>                     78,
>>>                     59,
>>>                     49
>>>                 ],
>>>                 "acting": [
>>>                     78,
>>>                     59,
>>>                     49
>>>                 ],
>>>                 "blocked_by": [],
>>>                 "up_primary": 78,
>>>                 "acting_primary": 78
>>>             },
>>>             "empty": 0,
>>>             "dne": 0,
>>>             "incomplete": 0,
>>>             "last_epoch_started": 22440,
>>>             "hit_set_history": {
>>>                 "current_last_update": "0'0",
>>>                 "history": []
>>>             }
>>>         },
>>>         {
>>>             "peer": "59",
>>>             "pgid": "5.3d0",
>>>             "last_update": "23099'104726",
>>>             "last_complete": "23099'104726",
>>>             "log_tail": "1963'93313",
>>>             "last_user_version": 96444,
>>>             "last_backfill": "MAX",
>>>             "last_backfill_bitwise": 1,
>>>             "purged_snaps": "[]",
>>>             "history": {
>>>                 "epoch_created": 339,
>>>                 "last_epoch_started": 22440,
>>>                 "last_epoch_clean": 22440,
>>>                 "last_epoch_split": 0,
>>>                 "last_epoch_marked_full": 0,
>>>                 "same_up_since": 19928,
>>>                 "same_interval_since": 22439,
>>>                 "same_primary_since": 22439,
>>>                 "last_scrub": "23099'104724",
>>>                 "last_scrub_stamp": "2016-08-30 00:30:54.747833",
>>>                 "last_deep_scrub": "23099'104724",
>>>                 "last_deep_scrub_stamp": "2016-08-30 00:30:54.747833",
>>>                 "last_clean_scrub_stamp": "2016-08-29 00:30:33.716646"
>>>             },
>>>             "stats": {
>>>                 "version": "20737'96444",
>>>                 "reported_seq": "77806",
>>>                 "reported_epoch": "22437",
>>>                 "state": "active+remapped",
>>>                 "last_fresh": "2016-08-25 21:41:28.869909",
>>>                 "last_change": "2016-08-25 21:41:28.869350",
>>>                 "last_active": "2016-08-25 21:41:28.869909",
>>>                 "last_peered": "2016-08-25 21:41:28.869909",
>>>                 "last_clean": "2016-08-24 13:42:06.161197",
>>>                 "last_became_active": "2016-08-24 23:56:19.815284",
>>>                 "last_became_peered": "2016-08-24 23:56:19.815284",
>>>                 "last_unstale": "2016-08-25 21:41:28.869909",
>>>                 "last_undegraded": "2016-08-25 21:41:28.869909",
>>>                 "last_fullsized": "2016-08-25 21:41:28.869909",
>>>                 "mapping_epoch": 19928,
>>>                 "log_start": "1963'93313",
>>>                 "ondisk_log_start": "1963'93313",
>>>                 "created": 339,
>>>                 "last_epoch_clean": 22437,
>>>                 "parent": "0.0",
>>>                 "parent_split_bits": 0,
>>>                 "last_scrub": "19699'96439",
>>>                 "last_scrub_stamp": "2016-08-24 22:59:27.749260",
>>>                 "last_deep_scrub": "16645'96391",
>>>                 "last_deep_scrub_stamp": "2016-08-22 20:21:59.567449",
>>>                 "last_clean_scrub_stamp": "2016-08-24 22:59:27.749260",
>>>                 "log_size": 3131,
>>>                 "ondisk_log_size": 3131,
>>>                 "stats_invalid": false,
>>>                 "dirty_stats_invalid": false,
>>>                 "omap_stats_invalid": false,
>>>                 "hitset_stats_invalid": false,
>>>                 "hitset_bytes_stats_invalid": false,
>>>                 "pin_stats_invalid": true,
>>>                 "stat_sum": {
>>>                     "num_bytes": 0,
>>>                     "num_objects": 252,
>>>                     "num_object_clones": 0,
>>>                     "num_object_copies": 1008,
>>>                     "num_objects_missing_on_primary": 0,
>>>                     "num_objects_missing": 0,
>>>                     "num_objects_degraded": 2,
>>>                     "num_objects_misplaced": 252,
>>>                     "num_objects_unfound": 0,
>>>                     "num_objects_dirty": 252,
>>>                     "num_whiteouts": 0,
>>>                     "num_read": 21538,
>>>                     "num_read_kb": 323200,
>>>                     "num_write": 97965,
>>>                     "num_write_kb": 354745,
>>>                     "num_scrub_errors": 0,
>>>                     "num_shallow_scrub_errors": 0,
>>>                     "num_deep_scrub_errors": 0,
>>>                     "num_objects_recovered": 2510,
>>>                     "num_bytes_recovered": 0,
>>>                     "num_keys_recovered": 136360,
>>>                     "num_objects_omap": 252,
>>>                     "num_objects_hit_set_archive": 0,
>>>                     "num_bytes_hit_set_archive": 0,
>>>                     "num_flush": 0,
>>>                     "num_flush_kb": 0,
>>>                     "num_evict": 0,
>>>                     "num_evict_kb": 0,
>>>                     "num_promote": 0,
>>>                     "num_flush_mode_high": 0,
>>>                     "num_flush_mode_low": 0,
>>>                     "num_evict_mode_some": 0,
>>>                     "num_evict_mode_full": 0,
>>>                     "num_objects_pinned": 0
>>>                 },
>>>                 "up": [
>>>                     78,
>>>                     59,
>>>                     49
>>>                 ],
>>>                 "acting": [
>>>                     78,
>>>                     59,
>>>                     49
>>>                 ],
>>>                 "blocked_by": [],
>>>                 "up_primary": 78,
>>>                 "acting_primary": 78
>>>             },
>>>             "empty": 0,
>>>             "dne": 0,
>>>             "incomplete": 0,
>>>             "last_epoch_started": 22440,
>>>             "hit_set_history": {
>>>                 "current_last_update": "0'0",
>>>                 "history": []
>>>             }
>>>         }
>>>     ],
>>>     "recovery_state": [
>>>         {
>>>             "name": "Started\/Primary\/Active",
>>>             "enter_time": "2016-08-25 21:41:30.400460",
>>>             "might_have_unfound": [],
>>>             "recovery_progress": {
>>>                 "backfill_targets": [],
>>>                 "waiting_on_backfill": [],
>>>                 "last_backfill_started": "MIN",
>>>                 "backfill_info": {
>>>                     "begin": "MIN",
>>>                     "end": "MIN",
>>>                     "objects": []
>>>                 },
>>>                 "peer_backfill_info": [],
>>>                 "backfills_in_flight": [],
>>>                 "recovering": [],
>>>                 "pg_backend": {
>>>                     "pull_from_peer": [],
>>>                     "pushing": []
>>>                 }
>>>             },
>>>             "scrub": {
>>>                 "scrubber.epoch_start": "22439",
>>>                 "scrubber.active": 0,
>>>                 "scrubber.state": "INACTIVE",
>>>                 "scrubber.start": "MIN",
>>>                 "scrubber.end": "MIN",
>>>                 "scrubber.subset_last_update": "0'0",
>>>                 "scrubber.deep": false,
>>>                 "scrubber.seed": 0,
>>>                 "scrubber.waiting_on": 0,
>>>                 "scrubber.waiting_on_whom": []
>>>             }
>>>         },
>>>         {
>>>             "name": "Started",
>>>             "enter_time": "2016-08-25 21:41:29.291162"
>>>         }
>>>     ],
>>>     "agent_state": {}
>>> }
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Goncalo Borges
>>> Research Computing
>>> ARC Centre of Excellence for Particle Physics at the Terascale
>>> School of Physics A28 | University of Sydney, NSW  2006
>>> T: +61 2 93511937
>>>
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>>
>> --
>> Cheers,
>> Brad
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


