Re: ONE pg deep-scrub blocks cluster

Hello JC,

As promised, here are

- my ceph.conf (I have done a diff on all involved servers; they all use the same ceph.conf) = ceph_conf.txt
- ceph pg 0.223 query = ceph_pg_0223_query_20161236.txt
- ceph -s = ceph_s.txt
- ceph df = ceph_df.txt
- ceph osd df = ceph_osd_df.txt
- ceph osd dump | grep pool = ceph_osd_dump_pool.txt
- ceph osd crush rule dump = ceph_osd_crush_rule_dump.txt

attached as txt files.

I ran "ceph pg deep-scrub 0.223" again before creating the files above. The issue still exists today at ~12:24 on ... :*(
The deep-scrub on this PG took ~14 minutes:

- 2016-08-26 12:24:01.463411 osd.9 172.16.0.11:6808/29391 1777 : cluster [INF] 0.223 deep-scrub starts
- 2016-08-26 12:38:07.201726 osd.9 172.16.0.11:6808/29391 2485 : cluster [INF] 0.223 deep-scrub ok
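For reference, these two lines can be pulled straight from the cluster log on a MON node; a minimal sketch, assuming the default log location /var/log/ceph/ceph.log:

 $ grep '0\.223 deep-scrub' /var/log/ceph/ceph.log
 # prints the "deep-scrub starts" / "deep-scrub ok" pair quoted above;
 # the difference between the two timestamps is the scrub duration (~14 minutes here)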

Ceph: version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
OS: Ubuntu 16.04 LTS (Linux osdserver1 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux)

You wrote: "As a remark, assuming the size parameter of the rbd pool is set to 3, the number of PGs in your cluster should be higher."

I know I could increase this to 2048 (with 30 OSDs). But perhaps we will create further pools, so I did not want to set this too high for this pool, because it is not possible to decrease pg_num for a pool afterwards. Furthermore, if I changed this now and the issue went away, we would not know what the cause was... :)
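For completeness, the usual rule of thumb from the Ceph documentation (roughly 100 PGs per OSD, divided by the replica count and rounded to a power of two) works out as follows here; a quick sketch:

 $ echo $(( 30 * 100 / 3 ))    # (OSDs * 100) / pool size
 1000
 # next power of two: 1024 (the current pg_num); 2048 only if noticeable growth is expected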

If you need further information, please do not hesitate to ask; I will provide it as soon as possible. Please keep in mind that I have one additional disk on each OSD node (3 disks in total) which I can add to the cluster, so that the acting set for this PG could change. I removed these disks earlier to force other OSDs to become the acting set for PG 0.223.

Thank you, your help is very appreciated!

- Mehmet

On 2016-08-25 13:58, ceph@xxxxxxxxxx wrote:
Hey JC,

 Thank you very much for your mail!

 I will provide the information tomorrow when I am at work again.

 I hope we will find a solution :)

 - Mehmet

On 24 August 2016 16:58:58 CEST, LOPEZ Jean-Charles
<jelopez@xxxxxxxxxx> wrote:

Hi Mehmet,

I’m just seeing your message now and have read the thread that goes with it.

Can you please provide me with a copy of the ceph.conf file on the
MON and OSD side (assuming it’s identical)? And if the ceph.conf file
is different on the client side (the VM side), please provide me with
a copy of that one as well.

Can you also provide me, as attached txt files, with the following
(one way to collect them is sketched after this list):
- the output of your pg query of pg 0.223
- the output of ceph -s
- the output of ceph df
- the output of ceph osd df
- the output of ceph osd dump | grep pool
- the output of ceph osd crush rule dump
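A minimal sketch of collecting these into files (assumes a node with the client.admin keyring; the file names are only illustrative):

 $ ceph pg 0.223 query       > pg_0.223_query.txt
 $ ceph -s                   > ceph_s.txt
 $ ceph df                   > ceph_df.txt
 $ ceph osd df               > ceph_osd_df.txt
 $ ceph osd dump | grep pool > ceph_osd_dump_pool.txt
 $ ceph osd crush rule dump  > ceph_osd_crush_rule_dump.txt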

Thank you and I’ll see if I can get something to ease your pain.

As a remark, assuming the size parameter of the rbd pool is set to
3, the number of PGs in your cluster should be higher.

If we manage to move forward and get it fixed, we will repost to the
mailing list the changes we made to your configuration.

Regards
JC

On Aug 24, 2016, at 06:41, Mehmet <ceph@xxxxxxxxxx> wrote:

Hello Guys,

the issue still exists :(

When we run "ceph pg deep-scrub 0.223", nearly all VMs stop for a
while (blocked requests).

- we already replaced the OSDs (SAS disks - journal on NVMe)
- removed OSDs so that the acting set for pg 0.223 changed
- checked the filesystem on the acting OSDs
- changed the tunables back from jewel to default
- changed the tunables again from default to jewel
- ran a deep-scrub on all OSDs (ceph osd deep-scrub osd.<id>)
- only when a deep-scrub on pg 0.223 runs do we get blocked requests

The deep-scrub on pg 0.223 always takes 13-15 minutes to finish. It
does not matter which OSDs are in the acting set for this pg.
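For reference, the whole test boils down to this sequence (a minimal sketch of the steps described in this thread; the grep pattern is only an example):

 $ ceph pg map 0.223                                   # current up/acting set
 $ ceph tell osd.9 injectargs '--debug_osd 5/5'        # raise the debug level on each acting OSD
 $ ceph pg deep-scrub 0.223                            # trigger the deep-scrub
 $ ceph -w | grep -E '0\.223 deep-scrub|slow request'  # watch for starts/ok and blocked requests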

So I don't have any idea what could be causing this.

As long as "ceph osd set nodeep-scrub" is in effect - so that no
deep-scrub on 0.223 runs - the cluster is fine!
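For completeness, this is how the flag is toggled and checked (a short sketch; the flag also shows up in the ceph -s output):

 $ ceph osd set nodeep-scrub     # stop scheduling new deep-scrubs cluster-wide
 $ ceph osd dump | grep flags    # e.g. "flags sortbitwise,nodeep-scrub"
 $ ceph osd unset nodeep-scrub   # re-enable deep-scrubs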

Could this be a bug?

ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
Kernel: 4.4.0-31-generic #50-Ubuntu

Any ideas?
- Mehmet

On 2016-08-02 17:57, c wrote:
On 2016-08-02 13:30, c wrote:
Hello Guys,
this time without the original acting-set osd.4, 16 and 28. The issue still exists...
[...]
For the record, this ONLY happens with this PG and no others that share the same OSDs, right?
Yes, right.
 [...]

When doing the deep-scrub, monitor (atop, etc.) all 3 nodes and see if a
particular OSD (HDD) stands out, as I would expect it to.
Now I logged all disks via atop every 2 seconds while the deep-scrub
was running ( atop -w osdXX_atop 2 ).
As you expected, all disks were 100% busy - with a constant 150MB
(osd.4), 130MB (osd.28) and 170MB (osd.16)...
- osd.4 (/dev/sdf): http://slexy.org/view/s21emd2u6j [1]
- osd.16 (/dev/sdm): http://slexy.org/view/s20vukWz5E [2]
- osd.28 (/dev/sdh): http://slexy.org/view/s20YX0lzZY [3]
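The atop logs above were written as raw log files and can be replayed later; a minimal sketch (osdXX stands for the OSD id, as in the names above):

 $ atop -w osdXX_atop 2    # record: one sample every 2 seconds into a raw log file
 $ atop -r osdXX_atop      # replay the recording; 't' steps forward through the samples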
[...]
But what is causing this? A deep-scrub on all other disks - same
model and ordered at the same time - does not seem to have this issue.
 [...]

Next week, I will do this:
1.1 Remove osd.4 completely from Ceph - again (the current primary
for PG 0.223)
 osd.4 is now removed completely.
 The primary for this PG is now osd.9:
 # ceph pg map 0.223
 osdmap e8671 pg 0.223 (0.223) -> up [9,16,28] acting [9,16,28]

1.2 xfs_repair -n /dev/sdf1 (osd.4): to check for possible errors
 xfs_repair did not find/show any errors
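For the record, xfs_repair wants the filesystem unmounted; a minimal sketch of how such a check can be run (the unit name and mount point are assumptions for a default Jewel deployment on Ubuntu 16.04):

 $ systemctl stop ceph-osd@4          # stop the OSD daemon first
 $ umount /var/lib/ceph/osd/ceph-4    # unmount the OSD data partition
 $ xfs_repair -n /dev/sdf1            # -n: check only, do not modify anything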

1.3 ceph pg deep-scrub 0.223
- Log with "ceph tell osd.4,16,28 injectargs --debug_osd 5/5"
 Because osd.9 is now the primary for this PG, I have set debug_osd on
it too:
 ceph tell osd.9 injectargs "--debug_osd 5/5"
 and ran the deep-scrub on 0.223 (and again nearly all of my VMs stopped
 working for a while)
 Start @ 15:33:27
 End @ 15:48:31
 The "ceph.log"
 - http://slexy.org/view/s2WbdApDLz [5]
 The related log files (OSDs 9, 16 and 28) and the atop logs for these
OSDs:
 LogFile - osd.9 (/dev/sdk)
 - ceph-osd.9.log: http://slexy.org/view/s2kXeLMQyw [6]
 - atop Log: http://slexy.org/view/s21wJG2qr8 [7]
 LogFile - osd.16 (/dev/sdh)
 - ceph-osd.16.log: http://slexy.org/view/s20D6WhD4d [8]
 - atop Log: http://slexy.org/view/s2iMjer8rC [9]
 LogFile - osd.28 (/dev/sdm)
 - ceph-osd.28.log: http://slexy.org/view/s21dmXoEo7 [10]
 - atop log: http://slexy.org/view/s2gJqzu3uG [11]

2.1 Remove osd.16 completely from Ceph
 osd.16 is now removed completely - now replaced with osd.17 within
 the acting set.
 # ceph pg map 0.223
 osdmap e9017 pg 0.223 (0.223) -> up [9,17,28] acting [9,17,28]

2.2 xfs_repair -n /dev/sdh1
 xfs_repair did not find/show any error

2.3 ceph pg deep-scrub 0.223
- Log with "ceph tell osd.9,17,28 injectargs --debug_osd 5/5"
 and ran the deep-scrub on 0.223 (and again nearly all of my VMs stopped
 working for a while)
 Start @ 2016-08-02 10:02:44
 End @ 2016-08-02 10:17:22
 The "Ceph.log": http://slexy.org/view/s2ED5LvuV2 [12]
 LogFile - osd.9 (/dev/sdk)
 - ceph-osd.9.log: http://slexy.org/view/s21z9JmwSu [13]
 - atop Log: http://slexy.org/view/s20XjFZFEL [14]
 LogFile - osd.17 (/dev/sdi)
 - ceph-osd.17.log: http://slexy.org/view/s202fpcZS9 [15]
 - atop Log: http://slexy.org/view/s2TxeR1JSz [16]
 LogFile - osd.28 (/dev/sdm)
 - ceph-osd.28.log: http://slexy.org/view/s2eCUyC7xV [17]
 - atop log: http://slexy.org/view/s21AfebBqK [18]

3.1 Remove osd.28 completely from Ceph
 Now osd.28 is also removed completely from Ceph - now replaced with
osd.23
 # ceph pg map 0.223
 osdmap e9363 pg 0.223 (0.223) -> up [9,17,23] acting [9,17,23]

3.2 xfs_repair -n /dev/sdm1
 As expected: xfs_repair did not find/show any error

3.3 ceph pg deep-scrub 0.223
- Log with "ceph tell osd.9,17,23 injectargs --debug_osd 5/5"
 ... again nearly all of my VMs stop working for a while...
 All "original" OSDs (4,16,28) that were in the acting set when I wrote
 my first email to this mailing list are now removed. But the issue
 still exists with different OSDs (9,17,23) as the acting set, while the
 questionable PG 0.223 is still the same!
 Suspecting that the "tunables" could be the cause, I have now changed
 them back to "default" via "ceph osd crush tunables default".
 This will take a while... then I will do "ceph pg deep-scrub 0.223"
 again (without OSDs 4,16,28)...
 Really, I do not know what's going on here.
 Ceph finished its recovery to the "default" tunables, but the issue
 still exists! :*(
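For reference, the tunables switch and the data movement it triggers can be followed like this (a small sketch; "default" and "jewel" are both valid profile names on this release):

 $ ceph osd crush show-tunables      # show the currently active tunables profile
 $ ceph osd crush tunables default   # or: ceph osd crush tunables jewel
 $ ceph -s                           # repeat until the resulting rebalancing has finished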
 The acting set has changed again
 # ceph pg map 0.223
 osdmap e11230 pg 0.223 (0.223) -> up [9,11,20] acting [9,11,20]
 But when I start "ceph pg deep-scrub 0.223", again nearly all of my
 VMs stop working for a while!
 Does anyone have an idea where I should look to find the cause of this?
 It seems that every time, the primary OSD of the acting set of PG
 0.223 (*4*,16,28; *9*,17,23 or *9*,11,20) ends up "currently waiting
 for subops from 9,X", and the deep-scrub always takes nearly 15 minutes
 to finish.
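To see where the blocked requests are actually stuck, the slow ops can be inspected on the primary while the deep-scrub runs; a minimal sketch (run on the node hosting osd.9, the current primary):

 $ ceph health detail | grep -iE 'blocked|slow'   # which requests are currently reported slow
 $ ceph daemon osd.9 dump_ops_in_flight           # ops currently in flight / stuck on the primary
 $ ceph daemon osd.9 dump_historic_ops            # recently completed slow ops with their timings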
 My output from " ceph pg 0.223 query "
 - http://slexy.org/view/s21d6qUqnV [19]
 Mehmet

For the record: although nearly all disks are busy, I have no
slow/blocked requests, and I have been watching the log files for
nearly 20 minutes now...
Your help is really appreciated!
- Mehmet


JC Lopez
S. Technical Instructor, Global Storage Consulting Practice
Red Hat, Inc.
jelopez@xxxxxxxxxx
+1 408-680-6959



Links:
------
[1] http://slexy.org/view/s21emd2u6j
[2] http://slexy.org/view/s20vukWz5E
[3] http://slexy.org/view/s20YX0lzZY
[4] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[5] http://slexy.org/view/s2WbdApDLz
[6] http://slexy.org/view/s2kXeLMQyw
[7] http://slexy.org/view/s21wJG2qr8
[8] http://slexy.org/view/s20D6WhD4d
[9] http://slexy.org/view/s2iMjer8rC
[10] http://slexy.org/view/s21dmXoEo7
[11] http://slexy.org/view/s2gJqzu3uG
[12] http://slexy.org/view/s2ED5LvuV2
[13] http://slexy.org/view/s21z9JmwSu
[14] http://slexy.org/view/s20XjFZFEL
[15] http://slexy.org/view/s202fpcZS9
[16] http://slexy.org/view/s2TxeR1JSz
[17] http://slexy.org/view/s2eCUyC7xV
[18] http://slexy.org/view/s21AfebBqK
[19] http://slexy.org/view/s21d6qUqnV

ceph_conf.txt:

[global]
fsid = 98a410bf-b823-47e4-ad17-4543afa24992
public_network = 172.16.0.0/24
cluster_network = 10.0.0.0/24
mon_initial_members = monserver1, monserver2, monserver3
mon_host = 172.16.0.2,172.16.0.3,172.16.0.4
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
ceph_df.txt:

GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    93660G     85076G        8583G          9.16
POOLS:
    NAME     ID     USED      %USED     MAX AVAIL     OBJECTS
    rbd      0      2859G      9.16        27188G      722239
ceph_osd_crush_rule_dump.txt:

[
    {
        "rule_id": 0,
        "rule_name": "replicated_ruleset",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]
ceph_osd_df.txt:

ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 0 3.63689  1.00000  3724G   437G  3286G 11.75 1.28 162
 2 0.13350  1.00000   136G 16736M   120G 11.96 1.30   6
 3 3.63689  1.00000  3724G   356G  3368G  9.56 1.04 131
 5 3.63689  1.00000  3724G   396G  3327G 10.64 1.16 147
 6 0.40919  1.00000   418G 41281M   378G  9.62 1.05  15
 7 3.63689  1.00000  3724G   363G  3360G  9.77 1.07 135
 8 3.63689  1.00000  3724G   390G  3333G 10.48 1.14 146
 9 3.63689  1.00000  3724G   480G  3243G 12.91 1.41 141
 1 3.63689  1.00000  3724G   380G  3344G 10.21 1.11 141
20 3.63689  1.00000  3724G   354G  3370G  9.51 1.04 131
21 3.63689  1.00000  3724G   285G  3438G  7.68 0.84 106
22 3.63689  1.00000  3724G   296G  3427G  7.96 0.87 110
23 3.63689  1.00000  3724G   391G  3333G 10.50 1.15 108
24 3.63689  1.00000  3724G   310G  3414G  8.33 0.91 115
25 3.63689  1.00000  3724G   325G  3398G  8.75 0.95 121
26 3.63689  1.00000  3724G   302G  3422G  8.11 0.89 112
27 3.63689  1.00000  3724G   352G  3371G  9.47 1.03 131
29 3.63689  1.00000  3724G   242G  3481G  6.51 0.71  90
10 3.63689  1.00000  3724G   271G  3452G  7.29 0.79 101
11 3.63689  1.00000  3724G   352G  3372G  9.46 1.03 131
12 3.63689  1.00000  3724G   318G  3405G  8.55 0.93 118
13 3.63689  1.00000  3724G   268G  3455G  7.20 0.79 100
14 3.63689  1.00000  3724G   333G  3390G  8.96 0.98 123
15 3.63689  1.00000  3724G   331G  3392G  8.91 0.97 123
17 3.63689  1.00000  3724G   437G  3286G 11.74 1.28 125
18 3.63689  1.00000  3724G   293G  3430G  7.89 0.86 109
19 3.63689  1.00000  3724G   254G  3470G  6.82 0.74  94
              TOTAL 93660G  8583G 85077G  9.16
MIN/MAX VAR: 0.71/1.41  STDDEV: 1.62
ceph_osd_dump_pool.txt:

pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 13191 flags hashpspool stripe_width 0
ceph_pg_0223_query_20161236.txt:

{
    "state": "active+clean",
    "snap_trimq": "[]",
    "epoch": 13192,
    "up": [
        9,
        17,
        23
    ],
    "acting": [
        9,
        17,
        23
    ],
    "actingbackfill": [
        "9",
        "17",
        "23"
    ],
    "info": {
        "pgid": "0.223",
        "last_update": "13192'992381",
        "last_complete": "13192'992381",
        "log_tail": "13191'989301",
        "last_user_version": 992381,
        "last_backfill": "MAX",
        "last_backfill_bitwise": 1,
        "purged_snaps": "[1~7]",
        "history": {
            "epoch_created": 141,
            "last_epoch_started": 13189,
            "last_epoch_clean": 13189,
            "last_epoch_split": 0,
            "last_epoch_marked_full": 0,
            "same_up_since": 13187,
            "same_interval_since": 13188,
            "same_primary_since": 13178,
            "last_scrub": "13192'992300",
            "last_scrub_stamp": "2016-08-26 12:38:07.201730",
            "last_deep_scrub": "13192'992300",
            "last_deep_scrub_stamp": "2016-08-26 12:38:07.201730",
            "last_clean_scrub_stamp": "2016-08-26 12:38:07.201730"
        },
        "stats": {
            "version": "13192'992381",
            "reported_seq": "943611",
            "reported_epoch": "13192",
            "state": "active+clean",
            "last_fresh": "2016-08-26 12:39:37.908885",
            "last_change": "2016-08-26 12:38:07.201781",
            "last_active": "2016-08-26 12:39:37.908885",
            "last_peered": "2016-08-26 12:39:37.908885",
            "last_clean": "2016-08-26 12:39:37.908885",
            "last_became_active": "2016-08-05 15:33:26.796811",
            "last_became_peered": "2016-08-05 15:33:26.796811",
            "last_unstale": "2016-08-26 12:39:37.908885",
            "last_undegraded": "2016-08-26 12:39:37.908885",
            "last_fullsized": "2016-08-26 12:39:37.908885",
            "mapping_epoch": 13187,
            "log_start": "13191'989301",
            "ondisk_log_start": "13191'989301",
            "created": 141,
            "last_epoch_clean": 13189,
            "parent": "0.0",
            "parent_split_bits": 10,
            "last_scrub": "13192'992300",
            "last_scrub_stamp": "2016-08-26 12:38:07.201730",
            "last_deep_scrub": "13192'992300",
            "last_deep_scrub_stamp": "2016-08-26 12:38:07.201730",
            "last_clean_scrub_stamp": "2016-08-26 12:38:07.201730",
            "log_size": 3080,
            "ondisk_log_size": 3080,
            "stats_invalid": false,
            "dirty_stats_invalid": false,
            "omap_stats_invalid": false,
            "hitset_stats_invalid": false,
            "hitset_bytes_stats_invalid": false,
            "pin_stats_invalid": false,
            "stat_sum": {
                "num_bytes": 110264971264,
                "num_objects": 704,
                "num_object_clones": 0,
                "num_object_copies": 2112,
                "num_objects_missing_on_primary": 0,
                "num_objects_missing": 0,
                "num_objects_degraded": 0,
                "num_objects_misplaced": 0,
                "num_objects_unfound": 0,
                "num_objects_dirty": 704,
                "num_whiteouts": 0,
                "num_read": 316036,
                "num_read_kb": 16344444,
                "num_write": 811423,
                "num_write_kb": 122745012,
                "num_scrub_errors": 0,
                "num_shallow_scrub_errors": 0,
                "num_deep_scrub_errors": 0,
                "num_objects_recovered": 14976,
                "num_bytes_recovered": 2531158680576,
                "num_keys_recovered": 0,
                "num_objects_omap": 0,
                "num_objects_hit_set_archive": 0,
                "num_bytes_hit_set_archive": 0,
                "num_flush": 0,
                "num_flush_kb": 0,
                "num_evict": 0,
                "num_evict_kb": 0,
                "num_promote": 0,
                "num_flush_mode_high": 0,
                "num_flush_mode_low": 0,
                "num_evict_mode_some": 0,
                "num_evict_mode_full": 0,
                "num_objects_pinned": 0
            },
            "up": [
                9,
                17,
                23
            ],
            "acting": [
                9,
                17,
                23
            ],
            "blocked_by": [],
            "up_primary": 9,
            "acting_primary": 9
        },
        "empty": 0,
        "dne": 0,
        "incomplete": 0,
        "last_epoch_started": 13189,
        "hit_set_history": {
            "current_last_update": "0'0",
            "history": []
        }
    },
    "peer_info": [
        {
            "peer": "17",
            "pgid": "0.223",
            "last_update": "13192'992381",
            "last_complete": "13192'992381",
            "log_tail": "13121'516101",
            "last_user_version": 519199,
            "last_backfill": "MAX",
            "last_backfill_bitwise": 1,
            "purged_snaps": "[1~5]",
            "history": {
                "epoch_created": 141,
                "last_epoch_started": 13189,
                "last_epoch_clean": 13189,
                "last_epoch_split": 0,
                "last_epoch_marked_full": 0,
                "same_up_since": 13187,
                "same_interval_since": 13188,
                "same_primary_since": 13178,
                "last_scrub": "13192'992300",
                "last_scrub_stamp": "2016-08-26 12:38:07.201730",
                "last_deep_scrub": "13192'992300",
                "last_deep_scrub_stamp": "2016-08-26 12:38:07.201730",
                "last_clean_scrub_stamp": "2016-08-26 12:38:07.201730"
            },
            "stats": {
                "version": "13184'519198",
                "reported_seq": "492177",
                "reported_epoch": "13184",
                "state": "active+clean",
                "last_fresh": "2016-08-05 15:33:07.904695",
                "last_change": "2016-08-05 15:28:27.924648",
                "last_active": "2016-08-05 15:33:07.904695",
                "last_peered": "2016-08-05 15:33:07.904695",
                "last_clean": "2016-08-05 15:33:07.904695",
                "last_became_active": "2016-08-05 15:28:27.924429",
                "last_became_peered": "2016-08-05 15:28:27.924429",
                "last_unstale": "2016-08-05 15:33:07.904695",
                "last_undegraded": "2016-08-05 15:33:07.904695",
                "last_fullsized": "2016-08-05 15:33:07.904695",
                "mapping_epoch": 13187,
                "log_start": "13121'516101",
                "ondisk_log_start": "13121'516101",
                "created": 141,
                "last_epoch_clean": 13182,
                "parent": "0.0",
                "parent_split_bits": 10,
                "last_scrub": "13122'517231",
                "last_scrub_stamp": "2016-08-05 11:58:28.041132",
                "last_deep_scrub": "13122'517231",
                "last_deep_scrub_stamp": "2016-08-05 11:58:28.041132",
                "last_clean_scrub_stamp": "2016-08-05 11:58:28.041132",
                "log_size": 3097,
                "ondisk_log_size": 3097,
                "stats_invalid": false,
                "dirty_stats_invalid": false,
                "omap_stats_invalid": false,
                "hitset_stats_invalid": false,
                "hitset_bytes_stats_invalid": false,
                "pin_stats_invalid": false,
                "stat_sum": {
                    "num_bytes": 110265495552,
                    "num_objects": 712,
                    "num_object_clones": 0,
                    "num_object_copies": 2136,
                    "num_objects_missing_on_primary": 0,
                    "num_objects_missing": 0,
                    "num_objects_degraded": 0,
                    "num_objects_misplaced": 0,
                    "num_objects_unfound": 0,
                    "num_objects_dirty": 712,
                    "num_whiteouts": 0,
                    "num_read": 186137,
                    "num_read_kb": 9541269,
                    "num_write": 338257,
                    "num_write_kb": 114199805,
                    "num_scrub_errors": 0,
                    "num_shallow_scrub_errors": 0,
                    "num_deep_scrub_errors": 0,
                    "num_objects_recovered": 14976,
                    "num_bytes_recovered": 2531158680576,
                    "num_keys_recovered": 0,
                    "num_objects_omap": 0,
                    "num_objects_hit_set_archive": 0,
                    "num_bytes_hit_set_archive": 0,
                    "num_flush": 0,
                    "num_flush_kb": 0,
                    "num_evict": 0,
                    "num_evict_kb": 0,
                    "num_promote": 0,
                    "num_flush_mode_high": 0,
                    "num_flush_mode_low": 0,
                    "num_evict_mode_some": 0,
                    "num_evict_mode_full": 0,
                    "num_objects_pinned": 0
                },
                "up": [
                    9,
                    17,
                    23
                ],
                "acting": [
                    9,
                    17,
                    23
                ],
                "blocked_by": [],
                "up_primary": 9,
                "acting_primary": 9
            },
            "empty": 0,
            "dne": 0,
            "incomplete": 0,
            "last_epoch_started": 13189,
            "hit_set_history": {
                "current_last_update": "0'0",
                "history": []
            }
        },
        {
            "peer": "23",
            "pgid": "0.223",
            "last_update": "13192'992381",
            "last_complete": "13192'992381",
            "log_tail": "13121'516101",
            "last_user_version": 519199,
            "last_backfill": "MAX",
            "last_backfill_bitwise": 1,
            "purged_snaps": "[1~5]",
            "history": {
                "epoch_created": 141,
                "last_epoch_started": 13189,
                "last_epoch_clean": 13189,
                "last_epoch_split": 0,
                "last_epoch_marked_full": 0,
                "same_up_since": 13187,
                "same_interval_since": 13188,
                "same_primary_since": 13178,
                "last_scrub": "13192'992300",
                "last_scrub_stamp": "2016-08-26 12:38:07.201730",
                "last_deep_scrub": "13192'992300",
                "last_deep_scrub_stamp": "2016-08-26 12:38:07.201730",
                "last_clean_scrub_stamp": "2016-08-26 12:38:07.201730"
            },
            "stats": {
                "version": "13184'519198",
                "reported_seq": "492177",
                "reported_epoch": "13184",
                "state": "active+clean",
                "last_fresh": "2016-08-05 15:33:07.904695",
                "last_change": "2016-08-05 15:28:27.924648",
                "last_active": "2016-08-05 15:33:07.904695",
                "last_peered": "2016-08-05 15:33:07.904695",
                "last_clean": "2016-08-05 15:33:07.904695",
                "last_became_active": "2016-08-05 15:28:27.924429",
                "last_became_peered": "2016-08-05 15:28:27.924429",
                "last_unstale": "2016-08-05 15:33:07.904695",
                "last_undegraded": "2016-08-05 15:33:07.904695",
                "last_fullsized": "2016-08-05 15:33:07.904695",
                "mapping_epoch": 13187,
                "log_start": "13121'516101",
                "ondisk_log_start": "13121'516101",
                "created": 141,
                "last_epoch_clean": 13182,
                "parent": "0.0",
                "parent_split_bits": 10,
                "last_scrub": "13122'517231",
                "last_scrub_stamp": "2016-08-05 11:58:28.041132",
                "last_deep_scrub": "13122'517231",
                "last_deep_scrub_stamp": "2016-08-05 11:58:28.041132",
                "last_clean_scrub_stamp": "2016-08-05 11:58:28.041132",
                "log_size": 3097,
                "ondisk_log_size": 3097,
                "stats_invalid": false,
                "dirty_stats_invalid": false,
                "omap_stats_invalid": false,
                "hitset_stats_invalid": false,
                "hitset_bytes_stats_invalid": false,
                "pin_stats_invalid": false,
                "stat_sum": {
                    "num_bytes": 110265495552,
                    "num_objects": 712,
                    "num_object_clones": 0,
                    "num_object_copies": 2136,
                    "num_objects_missing_on_primary": 0,
                    "num_objects_missing": 0,
                    "num_objects_degraded": 0,
                    "num_objects_misplaced": 0,
                    "num_objects_unfound": 0,
                    "num_objects_dirty": 712,
                    "num_whiteouts": 0,
                    "num_read": 186137,
                    "num_read_kb": 9541269,
                    "num_write": 338257,
                    "num_write_kb": 114199805,
                    "num_scrub_errors": 0,
                    "num_shallow_scrub_errors": 0,
                    "num_deep_scrub_errors": 0,
                    "num_objects_recovered": 14976,
                    "num_bytes_recovered": 2531158680576,
                    "num_keys_recovered": 0,
                    "num_objects_omap": 0,
                    "num_objects_hit_set_archive": 0,
                    "num_bytes_hit_set_archive": 0,
                    "num_flush": 0,
                    "num_flush_kb": 0,
                    "num_evict": 0,
                    "num_evict_kb": 0,
                    "num_promote": 0,
                    "num_flush_mode_high": 0,
                    "num_flush_mode_low": 0,
                    "num_evict_mode_some": 0,
                    "num_evict_mode_full": 0,
                    "num_objects_pinned": 0
                },
                "up": [
                    9,
                    17,
                    23
                ],
                "acting": [
                    9,
                    17,
                    23
                ],
                "blocked_by": [],
                "up_primary": 9,
                "acting_primary": 9
            },
            "empty": 0,
            "dne": 0,
            "incomplete": 0,
            "last_epoch_started": 13189,
            "hit_set_history": {
                "current_last_update": "0'0",
                "history": []
            }
        }
    ],
    "recovery_state": [
        {
            "name": "Started\/Primary\/Active",
            "enter_time": "2016-08-05 15:33:26.731212",
            "might_have_unfound": [],
            "recovery_progress": {
                "backfill_targets": [],
                "waiting_on_backfill": [],
                "last_backfill_started": "MIN",
                "backfill_info": {
                    "begin": "MIN",
                    "end": "MIN",
                    "objects": []
                },
                "peer_backfill_info": [],
                "backfills_in_flight": [],
                "recovering": [],
                "pg_backend": {
                    "pull_from_peer": [],
                    "pushing": []
                }
            },
            "scrub": {
                "scrubber.epoch_start": "13188",
                "scrubber.active": 0,
                "scrubber.state": "INACTIVE",
                "scrubber.start": "MIN",
                "scrubber.end": "MIN",
                "scrubber.subset_last_update": "0'0",
                "scrubber.deep": false,
                "scrubber.seed": 0,
                "scrubber.waiting_on": 0,
                "scrubber.waiting_on_whom": []
            }
        },
        {
            "name": "Started",
            "enter_time": "2016-08-05 15:33:25.721344"
        }
    ],
    "agent_state": {}
}
ceph_s.txt:

    cluster 98a410bf-b823-47e4-ad17-4543afa24992
     health HEALTH_OK
     monmap e2: 3 mons at {monserver1=172.16.0.2:6789/0,monserver3=172.16.0.4:6789/0,monserver2=172.16.0.3:6789/0}
            election epoch 50, quorum 0,1,2 monserver1,monserver2,monserver3
     osdmap e13192: 27 osds: 27 up, 27 in
            flags sortbitwise
      pgmap v5725766: 1024 pgs, 1 pools, 2859 GB data, 704 kobjects
            8583 GB used, 85076 GB / 93660 GB avail
                1024 active+clean
  client io 618 kB/s wr, 0 op/s rd, 393 op/s wr
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
