ceph cluster inconsistency keyvaluestore

This issue is already registered at http://tracker.ceph.com/issues/8589


On Sun, Sep 7, 2014 at 8:00 PM, Haomai Wang <haomaiwang at gmail.com> wrote:

> I have found the root cause. It's a bug.
>
> When a chunky scrub happens, it iterates over the whole PG's objects, and in
> each iteration only a few objects are scanned.
>
> osd/PG.cc:3758
>     ret = get_pgbackend()->objects_list_partial(
>       start,
>       cct->_conf->osd_scrub_chunk_min,
>       cct->_conf->osd_scrub_chunk_max,
>       0,
>       &objects,
>       &candidate_end);
>
> candidate_end is the end of the object set and is used to indicate the start
> position of the next scrub chunk. But it will be truncated:
>
> osd/PG.cc:3777
>             while (!boundary_found && objects.size() > 1) {
>               hobject_t end = objects.back().get_boundary();
>               objects.pop_back();
>
>               if (objects.back().get_filestore_key() != end.get_filestore_key()) {
>                 candidate_end = end;
>                 boundary_found = true;
>               }
>             }
> end, an hobject_t which only contains the "hash" field, will be assigned to
> candidate_end. So in the next scrub chunk, an hobject_t containing only the
> "hash" field will be passed to get_pgbackend()->objects_list_partial().
>
> This causes incorrect results for the KeyValueStore backend, because it uses
> strict key ordering in its "collection_list_partial" method. An hobject_t
> that only contains the "hash" field is encoded as:
>
> 1%e79s0_head!972F1B5D!!none!!!00000000000000000000!0!0
>
> and the actual object is
> 1%e79s0_head!972F1B5D!!1!!!object-name!head
>
> In other words, an object that only contains the "hash" field cannot be used
> to search for an actual object that has the same "hash" field.
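>
> A minimal standalone sketch of the problem (not Ceph code; it assumes the
> strict byte-wise ordering LevelDB uses by default): the hash-only boundary
> key from the example sorts after the real object key with the same hash, so
> continuing the listing from that boundary would skip the object.
>
>     #include <iostream>
>     #include <map>
>     #include <string>
>
>     int main() {
>       // keys copied from the example above
>       std::string boundary = "1%e79s0_head!972F1B5D!!none!!!00000000000000000000!0!0";
>       std::string object   = "1%e79s0_head!972F1B5D!!1!!!object-name!head";
>
>       // "none" sorts after "1", so the boundary lies past the object
>       std::cout << std::boolalpha << (boundary > object) << std::endl;   // true
>
>       // seeking to the boundary therefore misses the object entirely
>       std::map<std::string, int> keys = {{object, 1}};
>       std::cout << (keys.lower_bound(boundary) == keys.end()) << std::endl;  // true
>       return 0;
>     }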
>
> @sage, I briefly scanned the usages of "get_boundary" and can't find a reason
> for it. Could we simply remove it, so the result would be:
>
>             while (!boundary_found && objects.size() > 1) {
>               hobject_t end = objects.back();
>               objects.pop_back();
>
>               if (objects.back().get_filestore_key() != end.get_filestore_key()) {
>                 candidate_end = end;
>                 boundary_found = true;
>               }
>             }
>
>
>
> On Sat, Sep 6, 2014 at 10:44 PM, Haomai Wang <haomaiwang at gmail.com> wrote:
>
>> Sorry for the late message, I'm back from a short vacation. I would
>> like to try it this weekend. Thanks for your patience :-)
>>
>> On Wed, Sep 3, 2014 at 9:16 PM, Kenneth Waegeman
>> <Kenneth.Waegeman at ugent.be> wrote:
>> > I can also reproduce it on a new, slightly different setup (also EC on KV
>> > and Cache) by running ceph pg scrub on a KV pg: this pg will then get the
>> > 'inconsistent' status.
>> >
>> >
>> >
>> > ----- Message from Kenneth Waegeman <Kenneth.Waegeman at UGent.be>
>> ---------
>> >    Date: Mon, 01 Sep 2014 16:28:31 +0200
>> >    From: Kenneth Waegeman <Kenneth.Waegeman at UGent.be>
>> > Subject: Re: ceph cluster inconsistency keyvaluestore
>> >      To: Haomai Wang <haomaiwang at gmail.com>
>> >      Cc: ceph-users at lists.ceph.com
>> >
>> >
>> >
>> >> Hi,
>> >>
>> >>
>> >> The cluster got installed with quattor, which uses ceph-deploy for
>> >> installation of daemons, writes the config file and installs the crushmap.
>> >> I have 3 hosts with 12 disks each, having a large KV partition (3.6T) for
>> >> the ECdata pool and a small cache partition (50G) for the cache.
>> >>
>> >> I manually did this:
>> >>
>> >> ceph osd pool create cache 1024 1024
>> >> ceph osd pool set cache size 2
>> >> ceph osd pool set cache min_size 1
>> >> ceph osd erasure-code-profile set profile11 k=8 m=3
>> >> ruleset-failure-domain=osd
>> >> ceph osd pool create ecdata 128 128 erasure profile11
>> >> ceph osd tier add ecdata cache
>> >> ceph osd tier cache-mode cache writeback
>> >> ceph osd tier set-overlay ecdata cache
>> >> ceph osd pool set cache hit_set_type bloom
>> >> ceph osd pool set cache hit_set_count 1
>> >> ceph osd pool set cache hit_set_period 3600
>> >> ceph osd pool set cache target_max_bytes $((280*1024*1024*1024))
>> >>
>> >> (But the previous time I already had the problem without the cache part.)
>> >>
>> >>
>> >>
>> >> Cluster live since 2014-08-29 15:34:16
>> >>
>> >> Config file on host ceph001:
>> >>
>> >> [global]
>> >> auth_client_required = cephx
>> >> auth_cluster_required = cephx
>> >> auth_service_required = cephx
>> >> cluster_network = 10.143.8.0/24
>> >> filestore_xattr_use_omap = 1
>> >> fsid = 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>> >> mon_cluster_log_to_syslog = 1
>> >> mon_host = ceph001.cubone.os, ceph002.cubone.os, ceph003.cubone.os
>> >> mon_initial_members = ceph001, ceph002, ceph003
>> >> osd_crush_update_on_start = 0
>> >> osd_journal_size = 10240
>> >> osd_pool_default_min_size = 2
>> >> osd_pool_default_pg_num = 512
>> >> osd_pool_default_pgp_num = 512
>> >> osd_pool_default_size = 3
>> >> public_network = 10.141.8.0/24
>> >>
>> >> [osd.11]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.13]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.15]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.17]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.19]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.21]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.23]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.25]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.3]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.5]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.7]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >> [osd.9]
>> >> osd_objectstore = keyvaluestore-dev
>> >>
>> >>
>> >> OSDs:
>> >> # id    weight  type name       up/down reweight
>> >> -12     140.6   root default-cache
>> >> -9      46.87           host ceph001-cache
>> >> 2       3.906                   osd.2   up      1
>> >> 4       3.906                   osd.4   up      1
>> >> 6       3.906                   osd.6   up      1
>> >> 8       3.906                   osd.8   up      1
>> >> 10      3.906                   osd.10  up      1
>> >> 12      3.906                   osd.12  up      1
>> >> 14      3.906                   osd.14  up      1
>> >> 16      3.906                   osd.16  up      1
>> >> 18      3.906                   osd.18  up      1
>> >> 20      3.906                   osd.20  up      1
>> >> 22      3.906                   osd.22  up      1
>> >> 24      3.906                   osd.24  up      1
>> >> -10     46.87           host ceph002-cache
>> >> 28      3.906                   osd.28  up      1
>> >> 30      3.906                   osd.30  up      1
>> >> 32      3.906                   osd.32  up      1
>> >> 34      3.906                   osd.34  up      1
>> >> 36      3.906                   osd.36  up      1
>> >> 38      3.906                   osd.38  up      1
>> >> 40      3.906                   osd.40  up      1
>> >> 42      3.906                   osd.42  up      1
>> >> 44      3.906                   osd.44  up      1
>> >> 46      3.906                   osd.46  up      1
>> >> 48      3.906                   osd.48  up      1
>> >> 50      3.906                   osd.50  up      1
>> >> -11     46.87           host ceph003-cache
>> >> 54      3.906                   osd.54  up      1
>> >> 56      3.906                   osd.56  up      1
>> >> 58      3.906                   osd.58  up      1
>> >> 60      3.906                   osd.60  up      1
>> >> 62      3.906                   osd.62  up      1
>> >> 64      3.906                   osd.64  up      1
>> >> 66      3.906                   osd.66  up      1
>> >> 68      3.906                   osd.68  up      1
>> >> 70      3.906                   osd.70  up      1
>> >> 72      3.906                   osd.72  up      1
>> >> 74      3.906                   osd.74  up      1
>> >> 76      3.906                   osd.76  up      1
>> >> -8      140.6   root default-ec
>> >> -5      46.87           host ceph001-ec
>> >> 3       3.906                   osd.3   up      1
>> >> 5       3.906                   osd.5   up      1
>> >> 7       3.906                   osd.7   up      1
>> >> 9       3.906                   osd.9   up      1
>> >> 11      3.906                   osd.11  up      1
>> >> 13      3.906                   osd.13  up      1
>> >> 15      3.906                   osd.15  up      1
>> >> 17      3.906                   osd.17  up      1
>> >> 19      3.906                   osd.19  up      1
>> >> 21      3.906                   osd.21  up      1
>> >> 23      3.906                   osd.23  up      1
>> >> 25      3.906                   osd.25  up      1
>> >> -6      46.87           host ceph002-ec
>> >> 29      3.906                   osd.29  up      1
>> >> 31      3.906                   osd.31  up      1
>> >> 33      3.906                   osd.33  up      1
>> >> 35      3.906                   osd.35  up      1
>> >> 37      3.906                   osd.37  up      1
>> >> 39      3.906                   osd.39  up      1
>> >> 41      3.906                   osd.41  up      1
>> >> 43      3.906                   osd.43  up      1
>> >> 45      3.906                   osd.45  up      1
>> >> 47      3.906                   osd.47  up      1
>> >> 49      3.906                   osd.49  up      1
>> >> 51      3.906                   osd.51  up      1
>> >> -7      46.87           host ceph003-ec
>> >> 55      3.906                   osd.55  up      1
>> >> 57      3.906                   osd.57  up      1
>> >> 59      3.906                   osd.59  up      1
>> >> 61      3.906                   osd.61  up      1
>> >> 63      3.906                   osd.63  up      1
>> >> 65      3.906                   osd.65  up      1
>> >> 67      3.906                   osd.67  up      1
>> >> 69      3.906                   osd.69  up      1
>> >> 71      3.906                   osd.71  up      1
>> >> 73      3.906                   osd.73  up      1
>> >> 75      3.906                   osd.75  up      1
>> >> 77      3.906                   osd.77  up      1
>> >> -4      23.44   root default-ssd
>> >> -1      7.812           host ceph001-ssd
>> >> 0       3.906                   osd.0   up      1
>> >> 1       3.906                   osd.1   up      1
>> >> -2      7.812           host ceph002-ssd
>> >> 26      3.906                   osd.26  up      1
>> >> 27      3.906                   osd.27  up      1
>> >> -3      7.812           host ceph003-ssd
>> >> 52      3.906                   osd.52  up      1
>> >> 53      3.906                   osd.53  up      1
>> >>
>> >> Cache OSDs are 50G each, the EC KV OSDs 3.6T (SSDs not used right now).
>> >>
>> >> Pools:
>> >> pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash
>> >> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool
>> stripe_width 0
>> >> pool 1 'cache' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>> >> rjenkins pg_num 1024 pgp_num 1024 last_change 174 flags
>> >> hashpspool,incomplete_clones tier_of 2 cache_mode writeback
>> target_bytes
>> >> 300647710720 hit_set bloom{false_positive_probability: 0.05,
>> target_size: 0,
>> >> seed: 0} 3600s x1 stripe_width 0
>> >> pool 2 'ecdata' erasure size 11 min_size 8 crush_ruleset 2 object_hash
>> >> rjenkins pg_num 128 pgp_num 128 last_change 170 lfor 170 flags
>> hashpspool
>> >> tiers 1 read_tier 1 write_tier 1 stripe_width 4096
>> >>
>> >>
>> >> Crushmap:
>> >> # begin crush map
>> >> tunable choose_local_fallback_tries 0
>> >> tunable choose_local_tries 0
>> >> tunable choose_total_tries 50
>> >> tunable chooseleaf_descend_once 1
>> >>
>> >> # devices
>> >> device 0 osd.0
>> >> device 1 osd.1
>> >> device 2 osd.2
>> >> device 3 osd.3
>> >> device 4 osd.4
>> >> device 5 osd.5
>> >> device 6 osd.6
>> >> device 7 osd.7
>> >> device 8 osd.8
>> >> device 9 osd.9
>> >> device 10 osd.10
>> >> device 11 osd.11
>> >> device 12 osd.12
>> >> device 13 osd.13
>> >> device 14 osd.14
>> >> device 15 osd.15
>> >> device 16 osd.16
>> >> device 17 osd.17
>> >> device 18 osd.18
>> >> device 19 osd.19
>> >> device 20 osd.20
>> >> device 21 osd.21
>> >> device 22 osd.22
>> >> device 23 osd.23
>> >> device 24 osd.24
>> >> device 25 osd.25
>> >> device 26 osd.26
>> >> device 27 osd.27
>> >> device 28 osd.28
>> >> device 29 osd.29
>> >> device 30 osd.30
>> >> device 31 osd.31
>> >> device 32 osd.32
>> >> device 33 osd.33
>> >> device 34 osd.34
>> >> device 35 osd.35
>> >> device 36 osd.36
>> >> device 37 osd.37
>> >> device 38 osd.38
>> >> device 39 osd.39
>> >> device 40 osd.40
>> >> device 41 osd.41
>> >> device 42 osd.42
>> >> device 43 osd.43
>> >> device 44 osd.44
>> >> device 45 osd.45
>> >> device 46 osd.46
>> >> device 47 osd.47
>> >> device 48 osd.48
>> >> device 49 osd.49
>> >> device 50 osd.50
>> >> device 51 osd.51
>> >> device 52 osd.52
>> >> device 53 osd.53
>> >> device 54 osd.54
>> >> device 55 osd.55
>> >> device 56 osd.56
>> >> device 57 osd.57
>> >> device 58 osd.58
>> >> device 59 osd.59
>> >> device 60 osd.60
>> >> device 61 osd.61
>> >> device 62 osd.62
>> >> device 63 osd.63
>> >> device 64 osd.64
>> >> device 65 osd.65
>> >> device 66 osd.66
>> >> device 67 osd.67
>> >> device 68 osd.68
>> >> device 69 osd.69
>> >> device 70 osd.70
>> >> device 71 osd.71
>> >> device 72 osd.72
>> >> device 73 osd.73
>> >> device 74 osd.74
>> >> device 75 osd.75
>> >> device 76 osd.76
>> >> device 77 osd.77
>> >>
>> >> # types
>> >> type 0 osd
>> >> type 1 host
>> >> type 2 root
>> >>
>> >> # buckets
>> >> host ceph001-ssd {
>> >>         id -1           # do not change unnecessarily
>> >>         # weight 7.812
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.0 weight 3.906
>> >>         item osd.1 weight 3.906
>> >> }
>> >> host ceph002-ssd {
>> >>         id -2           # do not change unnecessarily
>> >>         # weight 7.812
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.26 weight 3.906
>> >>         item osd.27 weight 3.906
>> >> }
>> >> host ceph003-ssd {
>> >>         id -3           # do not change unnecessarily
>> >>         # weight 7.812
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.52 weight 3.906
>> >>         item osd.53 weight 3.906
>> >> }
>> >> root default-ssd {
>> >>         id -4           # do not change unnecessarily
>> >>         # weight 23.436
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item ceph001-ssd weight 7.812
>> >>         item ceph002-ssd weight 7.812
>> >>         item ceph003-ssd weight 7.812
>> >> }
>> >> host ceph001-ec {
>> >>         id -5           # do not change unnecessarily
>> >>         # weight 46.872
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.3 weight 3.906
>> >>         item osd.5 weight 3.906
>> >>         item osd.7 weight 3.906
>> >>         item osd.9 weight 3.906
>> >>         item osd.11 weight 3.906
>> >>         item osd.13 weight 3.906
>> >>         item osd.15 weight 3.906
>> >>         item osd.17 weight 3.906
>> >>         item osd.19 weight 3.906
>> >>         item osd.21 weight 3.906
>> >>         item osd.23 weight 3.906
>> >>         item osd.25 weight 3.906
>> >> }
>> >> host ceph002-ec {
>> >>         id -6           # do not change unnecessarily
>> >>         # weight 46.872
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.29 weight 3.906
>> >>         item osd.31 weight 3.906
>> >>         item osd.33 weight 3.906
>> >>         item osd.35 weight 3.906
>> >>         item osd.37 weight 3.906
>> >>         item osd.39 weight 3.906
>> >>         item osd.41 weight 3.906
>> >>         item osd.43 weight 3.906
>> >>         item osd.45 weight 3.906
>> >>         item osd.47 weight 3.906
>> >>         item osd.49 weight 3.906
>> >>         item osd.51 weight 3.906
>> >> }
>> >> host ceph003-ec {
>> >>         id -7           # do not change unnecessarily
>> >>         # weight 46.872
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.55 weight 3.906
>> >>         item osd.57 weight 3.906
>> >>         item osd.59 weight 3.906
>> >>         item osd.61 weight 3.906
>> >>         item osd.63 weight 3.906
>> >>         item osd.65 weight 3.906
>> >>         item osd.67 weight 3.906
>> >>         item osd.69 weight 3.906
>> >>         item osd.71 weight 3.906
>> >>         item osd.73 weight 3.906
>> >>         item osd.75 weight 3.906
>> >>         item osd.77 weight 3.906
>> >> }
>> >> root default-ec {
>> >>         id -8           # do not change unnecessarily
>> >>         # weight 140.616
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item ceph001-ec weight 46.872
>> >>         item ceph002-ec weight 46.872
>> >>         item ceph003-ec weight 46.872
>> >> }
>> >> host ceph001-cache {
>> >>         id -9           # do not change unnecessarily
>> >>         # weight 46.872
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.2 weight 3.906
>> >>         item osd.4 weight 3.906
>> >>         item osd.6 weight 3.906
>> >>         item osd.8 weight 3.906
>> >>         item osd.10 weight 3.906
>> >>         item osd.12 weight 3.906
>> >>         item osd.14 weight 3.906
>> >>         item osd.16 weight 3.906
>> >>         item osd.18 weight 3.906
>> >>         item osd.20 weight 3.906
>> >>         item osd.22 weight 3.906
>> >>         item osd.24 weight 3.906
>> >> }
>> >> host ceph002-cache {
>> >>         id -10          # do not change unnecessarily
>> >>         # weight 46.872
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.28 weight 3.906
>> >>         item osd.30 weight 3.906
>> >>         item osd.32 weight 3.906
>> >>         item osd.34 weight 3.906
>> >>         item osd.36 weight 3.906
>> >>         item osd.38 weight 3.906
>> >>         item osd.40 weight 3.906
>> >>         item osd.42 weight 3.906
>> >>         item osd.44 weight 3.906
>> >>         item osd.46 weight 3.906
>> >>         item osd.48 weight 3.906
>> >>         item osd.50 weight 3.906
>> >> }
>> >> host ceph003-cache {
>> >>         id -11          # do not change unnecessarily
>> >>         # weight 46.872
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item osd.54 weight 3.906
>> >>         item osd.56 weight 3.906
>> >>         item osd.58 weight 3.906
>> >>         item osd.60 weight 3.906
>> >>         item osd.62 weight 3.906
>> >>         item osd.64 weight 3.906
>> >>         item osd.66 weight 3.906
>> >>         item osd.68 weight 3.906
>> >>         item osd.70 weight 3.906
>> >>         item osd.72 weight 3.906
>> >>         item osd.74 weight 3.906
>> >>         item osd.76 weight 3.906
>> >> }
>> >> root default-cache {
>> >>         id -12          # do not change unnecessarily
>> >>         # weight 140.616
>> >>         alg straw
>> >>         hash 0  # rjenkins1
>> >>         item ceph001-cache weight 46.872
>> >>         item ceph002-cache weight 46.872
>> >>         item ceph003-cache weight 46.872
>> >> }
>> >>
>> >> # rules
>> >> rule cache {
>> >>         ruleset 0
>> >>         type replicated
>> >>         min_size 1
>> >>         max_size 10
>> >>         step take default-cache
>> >>         step chooseleaf firstn 0 type host
>> >>         step emit
>> >> }
>> >> rule metadata {
>> >>         ruleset 1
>> >>         type replicated
>> >>         min_size 1
>> >>         max_size 10
>> >>         step take default-ssd
>> >>         step chooseleaf firstn 0 type host
>> >>         step emit
>> >> }
>> >> rule ecdata {
>> >>         ruleset 2
>> >>         type erasure
>> >>         min_size 3
>> >>         max_size 20
>> >>         step set_chooseleaf_tries 5
>> >>         step take default-ec
>> >>         step choose indep 0 type osd
>> >>         step emit
>> >> }
>> >>
>> >> # end crush map
>> >>
>> >> The benchmarks I then did:
>> >>
>> >> ./benchrw 50000
>> >>
>> >> benchrw:
>> >> /usr/bin/rados -p ecdata bench $1 write --no-cleanup
>> >> /usr/bin/rados -p ecdata bench $1 seq
>> >> /usr/bin/rados -p ecdata bench $1 seq &
>> >> /usr/bin/rados -p ecdata bench $1 write --no-cleanup
>> >>
>> >>
>> >> Scrubbing errors started soon after that: 2014-08-31 10:59:14
>> >>
>> >>
>> >> Please let me know if you need more information, and thanks !
>> >>
>> >> Kenneth
>> >>
>> >> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>> >>    Date: Mon, 1 Sep 2014 21:30:16 +0800
>> >>    From: Haomai Wang <haomaiwang at gmail.com>
>> >> Subject: Re: ceph cluster inconsistency keyvaluestore
>> >>      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>> >>      Cc: ceph-users at lists.ceph.com
>> >>
>> >>
>> >>> Hmm, could you please list your instructions, including how long the
>> >>> cluster has existed and all relevant ops? I want to reproduce it.
>> >>>
>> >>>
>> >>> On Mon, Sep 1, 2014 at 4:45 PM, Kenneth Waegeman
>> >>> <Kenneth.Waegeman at ugent.be>
>> >>> wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> I reinstalled the cluster with 0.84, and tried again running rados bench
>> >>>> on an EC-coded pool on keyvaluestore.
>> >>>> Nothing crashed this time, but when I check the status:
>> >>>>
>> >>>>     health HEALTH_ERR 128 pgs inconsistent; 128 scrub errors; too few
>> >>>> pgs
>> >>>> per osd (15 < min 20)
>> >>>>     monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0,
>> >>>> ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election
>> epoch
>> >>>> 8, quorum 0,1,2 ceph001,ceph002,ceph003
>> >>>>     osdmap e174: 78 osds: 78 up, 78 in
>> >>>>      pgmap v147680: 1216 pgs, 3 pools, 14758 GB data, 3690 kobjects
>> >>>>            1753 GB used, 129 TB / 131 TB avail
>> >>>>                1088 active+clean
>> >>>>                 128 active+clean+inconsistent
>> >>>>
>> >>>> The 128 inconsistent pgs are ALL the pgs of the EC KV store (the others
>> >>>> are on Filestore).
>> >>>>
>> >>>> The only thing I can see in the logs is that after the rados tests, it
>> >>>> starts scrubbing, and for each KV pg I get something like this:
>> >>>>
>> >>>> 2014-08-31 11:14:09.050747 osd.11 10.141.8.180:6833/61098 4 : [ERR]
>> >>>> 2.3s0
>> >>>> scrub stat mismatch, got 28164/29291 objects, 0/0 clones,
>> 28164/29291
>> >>>> dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts,
>> >>>> 118128377856/122855358464 bytes.
>> >>>>
>> >>>> What could here be the problem?
>> >>>> Thanks again!!
>> >>>>
>> >>>> Kenneth
>> >>>>
>> >>>>
>> >>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>> >>>>   Date: Tue, 26 Aug 2014 17:11:43 +0800
>> >>>>   From: Haomai Wang <haomaiwang at gmail.com>
>> >>>> Subject: Re: ceph cluster inconsistency?
>> >>>>     To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>> >>>>     Cc: ceph-users at lists.ceph.com
>> >>>>
>> >>>>
>> >>>>> Hmm, it looks like you hit this bug (http://tracker.ceph.com/issues/9223).
>> >>>>>
>> >>>>>
>> >>>>> Sorry for the late message, I forgot that this fix was merged into 0.84.
>> >>>>>
>> >>>>> Thanks for your patience :-)
>> >>>>>
>> >>>>> On Tue, Aug 26, 2014 at 4:39 PM, Kenneth Waegeman
>> >>>>> <Kenneth.Waegeman at ugent.be> wrote:
>> >>>>>
>> >>>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> In the meantime I already tried upgrading the cluster to 0.84, to see
>> >>>>>> if that made a difference, and it seems it does.
>> >>>>>> I can't reproduce the crashing osds by doing a 'rados -p ecdata ls'
>> >>>>>> anymore.
>> >>>>>>
>> >>>>>> But now the cluster detect it is inconsistent:
>> >>>>>>
>> >>>>>>      cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>> >>>>>>       health HEALTH_ERR 40 pgs inconsistent; 40 scrub errors; too
>> few
>> >>>>>> pgs
>> >>>>>> per osd (4 < min 20); mon.ceph002 low disk space
>> >>>>>>       monmap e3: 3 mons at
>> >>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,
>> >>>>>> ceph003=10.141.8.182:6789/0},
>> >>>>>> election epoch 30, quorum 0,1,2 ceph001,ceph002,ceph003
>> >>>>>>       mdsmap e78951: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3
>> >>>>>> up:standby
>> >>>>>>       osdmap e145384: 78 osds: 78 up, 78 in
>> >>>>>>        pgmap v247095: 320 pgs, 4 pools, 15366 GB data, 3841
>> kobjects
>> >>>>>>              1502 GB used, 129 TB / 131 TB avail
>> >>>>>>                   279 active+clean
>> >>>>>>                    40 active+clean+inconsistent
>> >>>>>>                     1 active+clean+scrubbing+deep
>> >>>>>>
>> >>>>>>
>> >>>>>> I tried to do ceph pg repair for all the inconsistent pgs:
>> >>>>>>
>> >>>>>>      cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>> >>>>>>       health HEALTH_ERR 40 pgs inconsistent; 1 pgs repair; 40 scrub
>> >>>>>> errors;
>> >>>>>> too few pgs per osd (4 < min 20); mon.ceph002 low disk space
>> >>>>>>       monmap e3: 3 mons at
>> >>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,
>> >>>>>> ceph003=10.141.8.182:6789/0},
>> >>>>>> election epoch 30, quorum 0,1,2 ceph001,ceph002,ceph003
>> >>>>>>       mdsmap e79486: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3
>> >>>>>> up:standby
>> >>>>>>       osdmap e146452: 78 osds: 78 up, 78 in
>> >>>>>>        pgmap v248520: 320 pgs, 4 pools, 15366 GB data, 3841
>> kobjects
>> >>>>>>              1503 GB used, 129 TB / 131 TB avail
>> >>>>>>                   279 active+clean
>> >>>>>>                    39 active+clean+inconsistent
>> >>>>>>                     1 active+clean+scrubbing+deep
>> >>>>>>                     1
>> active+clean+scrubbing+deep+inconsistent+repair
>> >>>>>>
>> >>>>>> I let it recover through the night, but this morning the mons were all
>> >>>>>> gone, with nothing to see in the log files. The osds were all still up!
>> >>>>>>
>> >>>>>>    cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>> >>>>>>     health HEALTH_ERR 36 pgs inconsistent; 1 pgs repair; 36 scrub
>> >>>>>> errors;
>> >>>>>> too few pgs per osd (4 < min 20)
>> >>>>>>     monmap e7: 3 mons at
>> >>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,
>> >>>>>> ceph003=10.141.8.182:6789/0},
>> >>>>>> election epoch 44, quorum 0,1,2 ceph001,ceph002,ceph003
>> >>>>>>     mdsmap e109481: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3
>> >>>>>> up:standby
>> >>>>>>     osdmap e203410: 78 osds: 78 up, 78 in
>> >>>>>>      pgmap v331747: 320 pgs, 4 pools, 15251 GB data, 3812 kobjects
>> >>>>>>            1547 GB used, 129 TB / 131 TB avail
>> >>>>>>                   1 active+clean+scrubbing+deep+inconsistent+repair
>> >>>>>>                 284 active+clean
>> >>>>>>                  35 active+clean+inconsistent
>> >>>>>>
>> >>>>>> I restarted the monitors now; I will let you know when I see something
>> >>>>>> more.
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>> >>>>>>     Date: Sun, 24 Aug 2014 12:51:41 +0800
>> >>>>>>
>> >>>>>>     From: Haomai Wang <haomaiwang at gmail.com>
>> >>>>>> Subject: Re: ceph cluster inconsistency?
>> >>>>>>       To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>,
>> >>>>>> ceph-users at lists.ceph.com
>> >>>>>>
>> >>>>>>
>> >>>>>>> It's really strange! I wrote a test program following the key ordering
>> >>>>>>> you provided and parsed the corresponding values. It checks out!
>> >>>>>>>
>> >>>>>>> I have no idea now. If you have time, could you add this debug code to
>> >>>>>>> "src/os/GenericObjectMap.cc", inserted *before* "assert(start <=
>> >>>>>>> header.oid);":
>> >>>>>>>
>> >>>>>>>  dout(0) << "start: " << start << " header.oid: " << header.oid << dendl;
>> >>>>>>>
>> >>>>>>> Then you need to recompile ceph-osd and run it again. The output log
>> >>>>>>> will help!
>> >>>>>>>
>> >>>>>>> On Tue, Aug 19, 2014 at 10:19 PM, Haomai Wang <
>> haomaiwang at gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>
>> >>>>>>>>
>> >>>>>>>> I feel a little embarrassed, 1024 rows still look fine to me.
>> >>>>>>>>
>> >>>>>>>> I was wondering if you could give me all your keys via
>> >>>>>>>> "ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list
>> >>>>>>>> _GHOBJTOSEQ_ > keys.log"?
>> >>>>>>>>
>> >>>>>>>> thanks!
>> >>>>>>>>
>> >>>>>>>> On Tue, Aug 19, 2014 at 4:58 PM, Kenneth Waegeman
>> >>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>> >>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>> >>>>>>>>> Date: Tue, 19 Aug 2014 12:28:27 +0800
>> >>>>>>>>>
>> >>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com>
>> >>>>>>>>> Subject: Re: ceph cluster inconsistency?
>> >>>>>>>>>   To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>> >>>>>>>>>   Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Mon, Aug 18, 2014 at 7:32 PM, Kenneth Waegeman
>> >>>>>>>>>>
>> >>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com>
>> ---------
>> >>>>>>>>>>> Date: Mon, 18 Aug 2014 18:34:11 +0800
>> >>>>>>>>>>>
>> >>>>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com>
>> >>>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>> >>>>>>>>>>>   To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>> >>>>>>>>>>>   Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Aug 18, 2014 at 5:38 PM, Kenneth Waegeman
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Hi,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I tried this after restarting the osd, but I guess that was not the aim (
>> >>>>>>>>>>>>> # ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_|
>> >>>>>>>>>>>>> grep 6adb1100 -A 100
>> >>>>>>>>>>>>> IO error: lock /var/lib/ceph/osd/ceph-67/current//LOCK: Resource
>> >>>>>>>>>>>>> temporarily unavailable
>> >>>>>>>>>>>>> tools/ceph_kvstore_tool.cc: In function 'StoreTool::StoreTool(const
>> >>>>>>>>>>>>> string&)' thread 7f8fecf7d780 time 2014-08-18 11:12:29.551780
>> >>>>>>>>>>>>> tools/ceph_kvstore_tool.cc: 38: FAILED assert(!db_ptr->open(std::cerr))
>> >>>>>>>>>>>>> ..
>> >>>>>>>>>>>>> )
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> When I run it after bringing the osd down, it takes a while, but it has no
>> >>>>>>>>>>>>> output.. (When running it without the grep, I'm getting a huge list)
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Oh, sorry about that! I made a mistake: the hash value (6adb1100) will
>> >>>>>>>>>>>> be reversed in leveldb.
>> >>>>>>>>>>>> So grep "benchmark_data_ceph001.cubone.os_5560_object789734" should
>> >>>>>>>>>>>> help.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> this gives:
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> [root at ceph003 ~]# ceph-kvstore-tool
>> /var/lib/ceph/osd/ceph-67/
>> >>>>>>>>>>> current/
>> >>>>>>>>>>> list
>> >>>>>>>>>>> _GHOBJTOSEQ_ | grep 5560_object789734 -A 100
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011BDA6!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object789734!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011C027!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1330170!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011C6FD!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object227366!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011CB03!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1363631!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011CDF0!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1573957!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011D02C!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1019282!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E2B5!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1283563!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E511!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object273736!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E547!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1170628!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011EAAB!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object256335!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011F446!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1484196!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011FC59!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object884178!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001203F3!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object853746!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001208E3!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object36633!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00120B37!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1235337!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001210B6!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1661351!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001210CB!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object238126!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012184C!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object339943!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00121916!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1047094!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001219C1!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object520642!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001222BB!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object639565!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001223AA!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object231080!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012243C!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object858050!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012289C!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object241796!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122D28!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object7462!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122DFE!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object243798!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122EFC!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_8961_object109512!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001232D7!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object653973!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001234A3!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1378169!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123714!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object512925!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001237D9!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object23289!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123854!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1108852!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123971!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object704026!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123F75!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_8961_object250441!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124083!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object706178!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001240FA!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object316952!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012447D!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object538734!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001244D9!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object789215!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001247CD!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_8961_object265993!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124897!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object610597!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124BE4!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object691723!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124C9B!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1306135!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124E1D!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object520580!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012534C!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object659767!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00125A81!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object184060!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00125E77!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1292867!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00126562!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1201410!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00126B34!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1657326!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127383!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1269787!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127396!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object500115!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001277F8!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object394932!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001279DD!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object252963!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127B40!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object936811!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127BAC!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1481773!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012894E!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object999885!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00128D05!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object943667!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012908A!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object212990!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129519!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object437596!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129716!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1585330!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129798!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object603505!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001299C9!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object808800!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129B7A!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object23193!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129B9A!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1158397!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012A932!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object542450!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012B77A!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_8961_object195480!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012BE8C!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object312911!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012BF74!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1563783!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012C65C!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1123980!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012C6FE!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_3411_object913!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012CCAD!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object400863!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012CDBB!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object789667!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012D14B!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1020723!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012D95B!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_8961_object106293!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012E3C8!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1355526!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012E5B3!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1491348!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012F2BB!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_8961_object338872!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012F374!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1337264!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FBE5!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1512395!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FCE3!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_8961_object298610!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FEB6!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object120824!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001301CA!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object816326!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00130263!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object777163!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00130529!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1413173!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001317D9!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object809510!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0013204F!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object471416!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132400!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object695087!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132A19!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object591945!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132BF8!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object302000!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132F5B!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1645443!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00133B8B!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object761911!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0013433E!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object1467727!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134446!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object791960!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134678!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object677078!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134A96!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object254923!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001355D0!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_31461_object321528!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135690!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4919_object36935!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135B62!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object1228272!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135C72!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_4812_object2180!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135DEE!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object425705!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00136366!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object141569!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00136371!!3!!benchmark_data_
>> >>>>>>>>>>> ceph001%ecubone%eos_5560_object564213!head
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>> 100 rows looked fine to me. I found the minimum number of listed
>> >>>>>>>>>> objects is 1024. Could you please run
>> >>>>>>>>>> "ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list
>> >>>>>>>>>> _GHOBJTOSEQ_| grep 6adb1100 -A 1024"
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> I've put the output in the attachment
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>>>> Or should I run this immediately after the osd has crashed (because it
>> >>>>>>>>>>>>> may have rebalanced? I already restarted the cluster)
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I don't know if it is related, but before I could do all that, I had to
>> >>>>>>>>>>>>> fix something else: a monitor ran out of disk space, using 8GB for its
>> >>>>>>>>>>>>> store.db folder (lots of sst files). Other monitors are also near that
>> >>>>>>>>>>>>> level. Never had that problem on previous setups before. I recreated a
>> >>>>>>>>>>>>> monitor and now it uses 3.8GB.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> There is some duplicate data which needs to be compacted.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>> Another idea: maybe you can make KeyValueStore's stripe size align
>> >>>>>>>>>>>> with the EC stripe size.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> How can I do that? Is there some documentation about that?
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> ceph --show-config | grep keyvaluestore
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> debug_keyvaluestore = 0/0
>> >>>>>>>>>> keyvaluestore_queue_max_ops = 50
>> >>>>>>>>>> keyvaluestore_queue_max_bytes = 104857600
>> >>>>>>>>>> keyvaluestore_debug_check_backend = false
>> >>>>>>>>>> keyvaluestore_op_threads = 2
>> >>>>>>>>>> keyvaluestore_op_thread_timeout = 60
>> >>>>>>>>>> keyvaluestore_op_thread_suicide_timeout = 180
>> >>>>>>>>>> keyvaluestore_default_strip_size = 4096
>> >>>>>>>>>> keyvaluestore_max_expected_write_size = 16777216
>> >>>>>>>>>> keyvaluestore_header_cache_size = 4096
>> >>>>>>>>>> keyvaluestore_backend = leveldb
>> >>>>>>>>>>
>> >>>>>>>>>> keyvaluestore_default_strip_size is the one you want.
>> >>>>>>>>>>
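>> >>>>>>>>>> For example, a minimal ceph.conf sketch (the section name is just an
>> >>>>>>>>>> illustration; 4096 matches the ecdata pool's stripe_width shown above):
>> >>>>>>>>>>
>> >>>>>>>>>> [osd.3]
>> >>>>>>>>>> osd_objectstore = keyvaluestore-dev
>> >>>>>>>>>> keyvaluestore_default_strip_size = 4096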
>> >>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> I haven't thought about it deeply; maybe I will try it later.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Thanks!
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Kenneth
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> ----- Message from Sage Weil <sweil at redhat.com> ---------
>> >>>>>>>>>>>>> Date: Fri, 15 Aug 2014 06:10:34 -0700 (PDT)
>> >>>>>>>>>>>>> From: Sage Weil <sweil at redhat.com>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>> >>>>>>>>>>>>>   To: Haomai Wang <haomaiwang at gmail.com>
>> >>>>>>>>>>>>>   Cc: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>,
>> >>>>>>>>>>>>> ceph-users at lists.ceph.com
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Fri, 15 Aug 2014, Haomai Wang wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Hi Kenneth,
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> I don't find valuable info in your logs; they lack the necessary
>> >>>>>>>>>>>>>>> debug output around the crashing code.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> But I scanned the encode/decode implementation in GenericObjectMap
>> >>>>>>>>>>>>>>> and found something bad.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> For example, two oids have the same hash and their names are:
>> >>>>>>>>>>>>>>> A: "rb.data.123"
>> >>>>>>>>>>>>>>> B: "rb-123"
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> At the ghobject_t compare level, A < B. But GenericObjectMap encodes
>> >>>>>>>>>>>>>>> "." to "%e", so the keys in the DB are:
>> >>>>>>>>>>>>>>> A: _GHOBJTOSEQ_:blah!51615000!!none!!rb%edata%e123!head
>> >>>>>>>>>>>>>>> B: _GHOBJTOSEQ_:blah!51615000!!none!!rb-123!head
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> A > B
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> And it seems that the escape function is useless and should be
>> >>>>>>>>>>>>>>> disabled.
>> >>>>>>>>>>>>>>>
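>> >>>>>>>>>>>>>>> A minimal standalone sketch (not Ceph code) of that inversion,
>> >>>>>>>>>>>>>>> assuming plain byte-wise string comparison like LevelDB's default
>> >>>>>>>>>>>>>>> comparator: the escaped names compare in the opposite direction
>> >>>>>>>>>>>>>>> from the raw names.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>   #include <iostream>
>> >>>>>>>>>>>>>>>   #include <string>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>   int main() {
>> >>>>>>>>>>>>>>>     std::string raw_a = "rb.data.123", raw_b = "rb-123";
>> >>>>>>>>>>>>>>>     std::string esc_a = "rb%edata%e123", esc_b = "rb-123";
>> >>>>>>>>>>>>>>>     // '.' (0x2e) sorts after '-' (0x2d), but its escape '%' (0x25)
>> >>>>>>>>>>>>>>>     // sorts before it, so the two orders disagree.
>> >>>>>>>>>>>>>>>     std::cout << std::boolalpha
>> >>>>>>>>>>>>>>>               << (raw_a < raw_b) << "\n"   // false
>> >>>>>>>>>>>>>>>               << (esc_a < esc_b) << "\n";  // true
>> >>>>>>>>>>>>>>>     return 0;
>> >>>>>>>>>>>>>>>   }
>> >>>>>>>>>>>>>>>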
>> >>>>>>>>>>>>>>> I'm not sure whether Kenneth's problem is hitting this bug, because
>> >>>>>>>>>>>>>>> this scenario only occurs when the object set is very large and two
>> >>>>>>>>>>>>>>> objects end up with the same hash value.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Kenneth, when you have time, could you run "ceph-kvstore-tool
>> >>>>>>>>>>>>>>> [path-to-osd] list _GHOBJTOSEQ_| grep 6adb1100 -A 100"?
>> >>>>>>>>>>>>>>> ceph-kvstore-tool is a debug tool which can be compiled from source.
>> >>>>>>>>>>>>>>> You can clone the ceph repo and run
>> >>>>>>>>>>>>>>> "./autogen.sh; ./configure; cd src; make ceph-kvstore-tool".
>> >>>>>>>>>>>>>>> "path-to-osd" should be "/var/lib/ceph/osd-[id]/current/". "6adb1100"
>> >>>>>>>>>>>>>>> is from your verbose log, and the next 100 rows should contain the
>> >>>>>>>>>>>>>>> necessary info.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> You can also get ceph-kvstore-tool from the 'ceph-tests'
>> >>>>>>>>>>>>>> package.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Hi sage, do you think we need to provide an upgrade function to
>> >>>>>>>>>>>>>>> fix it?
>> >>>>>>>>>>>>>>> it?
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Hmm, we might.  This only affects the key/value encoding, right?
>> >>>>>>>>>>>>>> The FileStore is using its own function to map these to file names?
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Can you open a ticket in the tracker for this?
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Thanks!
>> >>>>>>>>>>>>>> sage
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 7:36 PM, Kenneth Waegeman
>> >>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com>
>> >>>>>>>>>>>>>>>> ---------
>> >>>>>>>>>>>>>>>>  Date: Thu, 14 Aug 2014 19:11:55 +0800
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>  From: Haomai Wang <haomaiwang at gmail.com>
>> >>>>>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>> >>>>>>>>>>>>>>>>    To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Could you add the config "debug_keyvaluestore = 20/20" to the crashed
>> >>>>>>>>>>>>>>>>> osd and replay the command causing the crash?
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> I would like to get more debug info! Thanks.
>> >>>>>>>>>>>>>>>>>
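>> >>>>>>>>>>>>>>>>> For example, a sketch of what that could look like in ceph.conf (the
>> >>>>>>>>>>>>>>>>> osd section is illustrative; osd.11 is one of the keyvaluestore OSDs
>> >>>>>>>>>>>>>>>>> from the config shown earlier in the thread):
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> [osd.11]
>> >>>>>>>>>>>>>>>>> osd_objectstore = keyvaluestore-dev
>> >>>>>>>>>>>>>>>>> debug_keyvaluestore = 20/20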
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> I included the log in attachment!
>> >>>>>>>>>>>>>>>> Thanks!
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 4:41 PM, Kenneth Waegeman
>> >>>>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I have:
>> >>>>>>>>>>>>>>>>>> osd_objectstore = keyvaluestore-dev
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> in the global section of my ceph.conf
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> [root at ceph002 ~]# ceph osd erasure-code-profile get
>> >>>>>>>>>>>>>>>>>> profile11
>> >>>>>>>>>>>>>>>>>> directory=/usr/lib64/ceph/erasure-code
>> >>>>>>>>>>>>>>>>>> k=8
>> >>>>>>>>>>>>>>>>>> m=3
>> >>>>>>>>>>>>>>>>>> plugin=jerasure
>> >>>>>>>>>>>>>>>>>> ruleset-failure-domain=osd
>> >>>>>>>>>>>>>>>>>> technique=reed_sol_van
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> the ecdata pool has this as profile
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> pool 3 'ecdata' erasure size 11 min_size 8
>> crush_ruleset 2
>> >>>>>>>>>>>>>>>>>> object_hash
>> >>>>>>>>>>>>>>>>>> rjenkins pg_num 128 pgp_num 128 last_change 161 flags
>> >>>>>>>>>>>>>>>>>> hashpspool
>> >>>>>>>>>>>>>>>>>> stripe_width 4096
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> ECrule in crushmap
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> rule ecdata {
>> >>>>>>>>>>>>>>>>>>       ruleset 2
>> >>>>>>>>>>>>>>>>>>       type erasure
>> >>>>>>>>>>>>>>>>>>       min_size 3
>> >>>>>>>>>>>>>>>>>>       max_size 20
>> >>>>>>>>>>>>>>>>>>       step set_chooseleaf_tries 5
>> >>>>>>>>>>>>>>>>>>       step take default-ec
>> >>>>>>>>>>>>>>>>>>       step choose indep 0 type osd
>> >>>>>>>>>>>>>>>>>>       step emit
>> >>>>>>>>>>>>>>>>>> }
>> >>>>>>>>>>>>>>>>>> root default-ec {
>> >>>>>>>>>>>>>>>>>>       id -8           # do not change unnecessarily
>> >>>>>>>>>>>>>>>>>>       # weight 140.616
>> >>>>>>>>>>>>>>>>>>       alg straw
>> >>>>>>>>>>>>>>>>>>       hash 0  # rjenkins1
>> >>>>>>>>>>>>>>>>>>       item ceph001-ec weight 46.872
>> >>>>>>>>>>>>>>>>>>       item ceph002-ec weight 46.872
>> >>>>>>>>>>>>>>>>>>       item ceph003-ec weight 46.872
>> >>>>>>>>>>>>>>>>>> ...
>> >>>>>>>>>>>>>>>>>>
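For reference, a profile and pool matching the dumps above would normally be created along these lines. The pool and profile names, pg counts and ruleset number are taken from the output quoted above; the exact commands are a sketch, not a transcript of what was actually run:

    ceph osd erasure-code-profile set profile11 \
        k=8 m=3 plugin=jerasure technique=reed_sol_van \
        ruleset-failure-domain=osd
    ceph osd pool create ecdata 128 128 erasure profile11
    # bind the pool to the custom EC rule shown in the crushmap
    ceph osd pool set ecdata crush_ruleset 2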
>> >>>>>>>>>>>>>>>>>> Cheers!
>> >>>>>>>>>>>>>>>>>> Kenneth
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com>
>> >>>>>>>>>>>>>>>>>> ---------
>> >>>>>>>>>>>>>>>>>>  Date: Thu, 14 Aug 2014 10:07:50 +0800
>> >>>>>>>>>>>>>>>>>>  From: Haomai Wang <haomaiwang at gmail.com>
>> >>>>>>>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>> >>>>>>>>>>>>>>>>>>    To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>> >>>>>>>>>>>>>>>>>>    Cc: ceph-users <ceph-users at lists.ceph.com>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Hi Kenneth,
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> Could you give your configuration related to EC and
>> >>>>>>>>>>>>>>>>>>> KeyValueStore?
>> >>>>>>>>>>>>>>>>>>> Not sure whether it's bug on KeyValueStore
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 12:06 AM, Kenneth Waegeman
>> >>>>>>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> Hi,
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> I was doing some tests with rados bench on an Erasure
>> >>>>>>>>>>>>>>>>>>>> Coded pool (using the keyvaluestore-dev objectstore) on
>> >>>>>>>>>>>>>>>>>>>> 0.83, and I see some strange things:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> [root at ceph001 ~]# ceph status
>> >>>>>>>>>>>>>>>>>>>>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>> >>>>>>>>>>>>>>>>>>>>      health HEALTH_WARN too few pgs per osd (4 < min 20)
>> >>>>>>>>>>>>>>>>>>>>      monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0},
>> >>>>>>>>>>>>>>>>>>>>              election epoch 6, quorum 0,1,2 ceph001,ceph002,ceph003
>> >>>>>>>>>>>>>>>>>>>>      mdsmap e116: 1/1/1 up {0=ceph001.cubone.os=up:active}, 2 up:standby
>> >>>>>>>>>>>>>>>>>>>>      osdmap e292: 78 osds: 78 up, 78 in
>> >>>>>>>>>>>>>>>>>>>>       pgmap v48873: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>> >>>>>>>>>>>>>>>>>>>>             1381 GB used, 129 TB / 131 TB avail
>> >>>>>>>>>>>>>>>>>>>>                  320 active+clean
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> There is around 15T of data, but only 1.3 T usage.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> This is also visible in rados:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> [root at ceph001 ~]# rados df
>> >>>>>>>>>>>>>>>>>>>> pool name       category                 KB      objects       clones     degraded      unfound           rd        rd KB           wr        wr KB
>> >>>>>>>>>>>>>>>>>>>> data            -                          0            0            0            0            0            0            0
>> >>>>>>>>>>>>>>>>>>>> ecdata          -                16113451009      3933959            0            0            1            1      3935632  16116850711
>> >>>>>>>>>>>>>>>>>>>> metadata        -                          2           20            0            0           33           36           21            8
>> >>>>>>>>>>>>>>>>>>>> rbd             -                          0            0            0            0            0            0            0
>> >>>>>>>>>>>>>>>>>>>>   total used      1448266016      3933979
>> >>>>>>>>>>>>>>>>>>>>   total avail   139400181016
>> >>>>>>>>>>>>>>>>>>>>   total space   140848447032
>> >>>>>>>>>>>>>>>>>>>>
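A quick way to cross-check those figures from the monitor side (assuming ceph df detail is available on this release) is:

    ceph df detail

It reports per-pool object counts and used space, so a large gap between the data size reported by ceph status and the raw usage above should show up there as well.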
>> >>>>>>>>>>>>>>>>>>>> Another (related?) thing: if I do rados -p ecdata ls, I
>> >>>>>>>>>>>>>>>>>>>> trigger osd shutdowns (each time):
>> >>>>>>>>>>>>>>>>>>>> I get a list followed by an error:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> ...
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_8961_object243839
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object801983
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object856489
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_8961_object202232
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_4919_object33199
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object807797
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_4919_object74729
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object1264121
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object1318513
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object1202111
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object939107
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object729682
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object122915
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object76521
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object113261
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object575079
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object671042
>> >>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object381146
>> >>>>>>>>>>>>>>>>>>>> 2014-08-13 17:57:48.736150 7f65047b5700  0 -- 10.141.8.180:0/1023295 >> 10.141.8.182:6839/4471 pipe(0x7f64fc019b20 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f64fc019db0).fault
>> >>>>>>>>>>>>>>>>>>>>
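The names in that listing follow the benchmark_data_<hostname>_<pid>_object<N> pattern that rados bench leaves behind when run with --no-cleanup; a run roughly like the following would produce them (the duration and thread count here are assumptions):

    # write benchmark against the EC pool, keeping the objects afterwards
    rados bench -p ecdata 600 write -t 16 --no-cleanup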
>> >>>>>>>>>>>>>>>>>>>> And I can see this in the log files:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>  -25> 2014-08-13 17:52:56.323908 7f8a97fa4700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.143.8.182:6827/64670 <== osd.57
>> 10.141.8.182:0/15796
>> >>>>>>>>>>>>>>>>>>>> 51
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092)
>> v2
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> 47+0+0
>> >>>>>>>>>>>>>>>>>>>> (3227325175 0 0) 0xf475940 con 0xee89fa0
>> >>>>>>>>>>>>>>>>>>>>  -24> 2014-08-13 17:52:56.323938 7f8a97fa4700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.143.8.182:6827/64670 --> 10.141.8.182:0/15796 --
>> >>>>>>>>>>>>>>>>>>>> osd_ping(ping_reply
>> >>>>>>>>>>>>>>>>>>>> e220
>> >>>>>>>>>>>>>>>>>>>> stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf815b00
>> >>>>>>>>>>>>>>>>>>>> con
>> >>>>>>>>>>>>>>>>>>>> 0xee89fa0
>> >>>>>>>>>>>>>>>>>>>>  -23> 2014-08-13 17:52:56.324078 7f8a997a7700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.182:6840/64670 <== osd.57
>> 10.141.8.182:0/15796
>> >>>>>>>>>>>>>>>>>>>> 51
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092)
>> v2
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> 47+0+0
>> >>>>>>>>>>>>>>>>>>>> (3227325175 0 0) 0xf132bc0 con 0xee8a680
>> >>>>>>>>>>>>>>>>>>>>  -22> 2014-08-13 17:52:56.324111 7f8a997a7700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.182:6840/64670 --> 10.141.8.182:0/15796 --
>> >>>>>>>>>>>>>>>>>>>> osd_ping(ping_reply
>> >>>>>>>>>>>>>>>>>>>> e220
>> >>>>>>>>>>>>>>>>>>>> stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf811a40
>> >>>>>>>>>>>>>>>>>>>> con
>> >>>>>>>>>>>>>>>>>>>> 0xee8a680
>> >>>>>>>>>>>>>>>>>>>>  -21> 2014-08-13 17:52:56.584461 7f8a997a7700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.182:6840/64670 <== osd.29
>> 10.143.8.181:0/12142
>> >>>>>>>>>>>>>>>>>>>> 47
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010)
>> v2
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> 47+0+0
>> >>>>>>>>>>>>>>>>>>>> (3355887204 0 0) 0xf655940 con 0xee88b00
>> >>>>>>>>>>>>>>>>>>>>  -20> 2014-08-13 17:52:56.584486 7f8a997a7700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.182:6840/64670 --> 10.143.8.181:0/12142 --
>> >>>>>>>>>>>>>>>>>>>> osd_ping(ping_reply
>> >>>>>>>>>>>>>>>>>>>> e220
>> >>>>>>>>>>>>>>>>>>>> stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf132bc0
>> >>>>>>>>>>>>>>>>>>>> con
>> >>>>>>>>>>>>>>>>>>>> 0xee88b00
>> >>>>>>>>>>>>>>>>>>>>  -19> 2014-08-13 17:52:56.584498 7f8a97fa4700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.143.8.182:6827/64670 <== osd.29
>> 10.143.8.181:0/12142
>> >>>>>>>>>>>>>>>>>>>> 47
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010)
>> v2
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> 47+0+0
>> >>>>>>>>>>>>>>>>>>>> (3355887204 0 0) 0xf20e040 con 0xee886e0
>> >>>>>>>>>>>>>>>>>>>>  -18> 2014-08-13 17:52:56.584526 7f8a97fa4700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.143.8.182:6827/64670 --> 10.143.8.181:0/12142 --
>> >>>>>>>>>>>>>>>>>>>> osd_ping(ping_reply
>> >>>>>>>>>>>>>>>>>>>> e220
>> >>>>>>>>>>>>>>>>>>>> stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf475940
>> >>>>>>>>>>>>>>>>>>>> con
>> >>>>>>>>>>>>>>>>>>>> 0xee886e0
>> >>>>>>>>>>>>>>>>>>>>  -17> 2014-08-13 17:52:56.594448 7f8a798c7700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/64670 >> :/0 pipe(0xec15f00 sd=74
>> >>>>>>>>>>>>>>>>>>>> :6839
>> >>>>>>>>>>>>>>>>>>>> s=0
>> >>>>>>>>>>>>>>>>>>>> pgs=0
>> >>>>>>>>>>>>>>>>>>>> cs=0
>> >>>>>>>>>>>>>>>>>>>> l=0
>> >>>>>>>>>>>>>>>>>>>> c=0xee856a0).accept sd=74 10.141.8.180:47641/0
>> >>>>>>>>>>>>>>>>>>>>  -16> 2014-08-13 17:52:56.594921 7f8a798c7700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/64670 <== client.7512
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.180:0/1018433
>> >>>>>>>>>>>>>>>>>>>> 1
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220) v4 ==== 151+0+39
>> >>>>>>>>>>>>>>>>>>>> (1972163119
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> 4174233976) 0xf3bca40 con 0xee856a0
>> >>>>>>>>>>>>>>>>>>>>  -15> 2014-08-13 17:52:56.594957 7f8a798c7700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.594874, event:
>> >>>>>>>>>>>>>>>>>>>> header_read,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>  -14> 2014-08-13 17:52:56.594970 7f8a798c7700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.594880, event:
>> throttled,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>  -13> 2014-08-13 17:52:56.594978 7f8a798c7700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.594917, event:
>> all_read,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>  -12> 2014-08-13 17:52:56.594986 7f8a798c7700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 299, time: 0.000000, event: dispatched, op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1
>> >>>>>>>>>>>>>>>>>>>> [pgls
>> >>>>>>>>>>>>>>>>>>>> start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>  -11> 2014-08-13 17:52:56.595127 7f8a90795700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.595104, event:
>> >>>>>>>>>>>>>>>>>>>> reached_pg,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>  -10> 2014-08-13 17:52:56.595159 7f8a90795700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.595153, event:
>> started,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>   -9> 2014-08-13 17:52:56.602179 7f8a90795700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/64670 --> 10.141.8.180:0/1018433
>> --
>> >>>>>>>>>>>>>>>>>>>> osd_op_reply(1
>> >>>>>>>>>>>>>>>>>>>> [pgls
>> >>>>>>>>>>>>>>>>>>>> start_epoch 0] v164'30654 uv30654 ondisk = 0) v6 --
>> ?+0
>> >>>>>>>>>>>>>>>>>>>> 0xec16180
>> >>>>>>>>>>>>>>>>>>>> con
>> >>>>>>>>>>>>>>>>>>>> 0xee856a0
>> >>>>>>>>>>>>>>>>>>>>   -8> 2014-08-13 17:52:56.602211 7f8a90795700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.602205, event: done,
>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>   -7> 2014-08-13 17:52:56.614839 7f8a798c7700  1 --
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/64670 <== client.7512
>> >>>>>>>>>>>>>>>>>>>> 10.141.8.180:0/1018433
>> >>>>>>>>>>>>>>>>>>>> 2
>> >>>>>>>>>>>>>>>>>>>> ====
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220) v4 ==== 151+0+89
>> >>>>>>>>>>>>>>>>>>>> (3460833343
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> 2600845095) 0xf3bcec0 con 0xee856a0
>> >>>>>>>>>>>>>>>>>>>>   -6> 2014-08-13 17:52:56.614864 7f8a798c7700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614789, event:
>> >>>>>>>>>>>>>>>>>>>> header_read,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>   -5> 2014-08-13 17:52:56.614874 7f8a798c7700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614792, event:
>> throttled,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>   -4> 2014-08-13 17:52:56.614884 7f8a798c7700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614835, event:
>> all_read,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>   -3> 2014-08-13 17:52:56.614891 7f8a798c7700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 300, time: 0.000000, event: dispatched, op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2
>> >>>>>>>>>>>>>>>>>>>> [pgls
>> >>>>>>>>>>>>>>>>>>>> start_epoch 220] 3.0 ack+read+known_if_redirected
>> e220)
>> >>>>>>>>>>>>>>>>>>>>   -2> 2014-08-13 17:52:56.614972 7f8a92f9a700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614958, event:
>> >>>>>>>>>>>>>>>>>>>> reached_pg,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>   -1> 2014-08-13 17:52:56.614993 7f8a92f9a700  5 --
>> op
>> >>>>>>>>>>>>>>>>>>>> tracker
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> ,
>> >>>>>>>>>>>>>>>>>>>> seq:
>> >>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614986, event:
>> started,
>> >>>>>>>>>>>>>>>>>>>> op:
>> >>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>> >>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>> >>>>>>>>>>>>>>>>>>>>    0> 2014-08-13 17:52:56.617087 7f8a92f9a700 -1 os/GenericObjectMap.cc: In function 'int GenericObjectMap::list_objects(const coll_t&, ghobject_t, int, std::vector<ghobject_t>*, ghobject_t*)' thread 7f8a92f9a700 time 2014-08-13 17:52:56.615073
>> >>>>>>>>>>>>>>>>>>>> os/GenericObjectMap.cc: 1118: FAILED assert(start <= header.oid)
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>> >>>>>>>>>>>>>>>>>>>> 1: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x474) [0x98f774]
>> >>>>>>>>>>>>>>>>>>>> 2: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int, int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x274) [0x8c5b54]
>> >>>>>>>>>>>>>>>>>>>> 3: (PGBackend::objects_list_partial(hobject_t const&, int, int, snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*, hobject_t*)+0x1c9) [0x862de9]
>> >>>>>>>>>>>>>>>>>>>> 4: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) [0x7f67f5]
>> >>>>>>>>>>>>>>>>>>>> 5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
>> >>>>>>>>>>>>>>>>>>>> 6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>> >>>>>>>>>>>>>>>>>>>> 7: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) [0x62bf8d]
>> >>>>>>>>>>>>>>>>>>>> 8: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>> >>>>>>>>>>>>>>>>>>>> 9: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8cd) [0xa776fd]
>> >>>>>>>>>>>>>>>>>>>> 10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
>> >>>>>>>>>>>>>>>>>>>> 11: (()+0x7df3) [0x7f8aac71fdf3]
>> >>>>>>>>>>>>>>>>>>>> 12: (clone()+0x6d) [0x7f8aab1963dd]
>> >>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b40217385
>> >>>>>>>>>>>>>>>>>>>> 64c36c92b8)
>> >>>>>>>>>>>>>>>>>>>> 1: /usr/bin/ceph-osd() [0x99b466]
>> >>>>>>>>>>>>>>>>>>>> 2: (()+0xf130) [0x7f8aac727130]
>> >>>>>>>>>>>>>>>>>>>> 3: (gsignal()+0x39) [0x7f8aab0d5989]
>> >>>>>>>>>>>>>>>>>>>> 4: (abort()+0x148) [0x7f8aab0d7098]
>> >>>>>>>>>>>>>>>>>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165)
>> >>>>>>>>>>>>>>>>>>>> [0x7f8aab9e89d5]
>> >>>>>>>>>>>>>>>>>>>> 6: (()+0x5e946) [0x7f8aab9e6946]
>> >>>>>>>>>>>>>>>>>>>> 7: (()+0x5e973) [0x7f8aab9e6973]
>> >>>>>>>>>>>>>>>>>>>> 8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>> >>>>>>>>>>>>>>>>>>>> 9: (ceph::__ceph_assert_fail(char const*, char
>> const*,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> char
>> >>>>>>>>>>>>>>>>>>>> const*)+0x1ef) [0xa8805f]
>> >>>>>>>>>>>>>>>>>>>> 10: (GenericObjectMap::list_objects(coll_t const&,
>> >>>>>>>>>>>>>>>>>>>> ghobject_t,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> std::vector<ghobject_t, std::allocator<ghobject_t>
>> >*,
>> >>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x474)
>> >>>>>>>>>>>>>>>>>>>> [0x98f774]
>> >>>>>>>>>>>>>>>>>>>> 11: (KeyValueStore::collection_list_partial(coll_t,
>> >>>>>>>>>>>>>>>>>>>> ghobject_t,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> snapid_t, std::vector<ghobject_t,
>> >>>>>>>>>>>>>>>>>>>> std::allocator<ghobject_t>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> *,
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x274) [0x8c5b54]
>> >>>>>>>>>>>>>>>>>>>> 12: (PGBackend::objects_list_partial(hobject_t
>> const&,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> snapid_t,
>> >>>>>>>>>>>>>>>>>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>> >>>>>>>>>>>>>>>>>>>> hobject_t*)+0x1c9)
>> >>>>>>>>>>>>>>>>>>>> [0x862de9]
>> >>>>>>>>>>>>>>>>>>>> 13:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+
>> >>>>>>>>>>>>>>>>>>>> 0xea5)
>> >>>>>>>>>>>>>>>>>>>> [0x7f67f5]
>> >>>>>>>>>>>>>>>>>>>> 14:
>> >>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_op(std::tr1:
>> >>>>>>>>>>>>>>>>>>>> :shared_ptr<OpRequest>)+0x1f3)
>> >>>>>>>>>>>>>>>>>>>> [0x8177b3]
>> >>>>>>>>>>>>>>>>>>>> 15:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>> >>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>> >>>>>>>>>>>>>>>>>>>> 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>> >>>>>>>>>>>>>>>>>>>> std::tr1::shared_ptr<OpRequest>,
>> >>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x47d)
>> >>>>>>>>>>>>>>>>>>>> [0x62bf8d]
>> >>>>>>>>>>>>>>>>>>>> 17: (OSD::ShardedOpWQ::_process(unsigned int,
>> >>>>>>>>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>> >>>>>>>>>>>>>>>>>>>> 18:
>> >>>>>>>>>>>>>>>>>>>> (ShardedThreadPool::shardedthreadpool_worker(unsigned
>> >>>>>>>>>>>>>>>>>>>> int)+0x8cd)
>> >>>>>>>>>>>>>>>>>>>> [0xa776fd]
>> >>>>>>>>>>>>>>>>>>>> 19:
>> (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>> >>>>>>>>>>>>>>>>>>>> [0xa79980]
>> >>>>>>>>>>>>>>>>>>>> 20: (()+0x7df3) [0x7f8aac71fdf3]
>> >>>>>>>>>>>>>>>>>>>> 21: (clone()+0x6d) [0x7f8aab1963dd]
>> >>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS
>> >>>>>>>>>>>>>>>>>>>> <executable>`
>> >>>>>>>>>>>>>>>>>>>> is
>> >>>>>>>>>>>>>>>>>>>> needed
>> >>>>>>>>>>>>>>>>>>>> to
>> >>>>>>>>>>>>>>>>>>>> interpret this.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> --- begin dump of recent events ---
>> >>>>>>>>>>>>>>>>>>>>    0> 2014-08-13 17:52:56.714214 7f8a92f9a700 -1 ***
>> >>>>>>>>>>>>>>>>>>>> Caught
>> >>>>>>>>>>>>>>>>>>>> signal
>> >>>>>>>>>>>>>>>>>>>> (Aborted) **
>> >>>>>>>>>>>>>>>>>>>> in thread 7f8a92f9a700
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b40217385
>> >>>>>>>>>>>>>>>>>>>> 64c36c92b8)
>> >>>>>>>>>>>>>>>>>>>> 1: /usr/bin/ceph-osd() [0x99b466]
>> >>>>>>>>>>>>>>>>>>>> 2: (()+0xf130) [0x7f8aac727130]
>> >>>>>>>>>>>>>>>>>>>> 3: (gsignal()+0x39) [0x7f8aab0d5989]
>> >>>>>>>>>>>>>>>>>>>> 4: (abort()+0x148) [0x7f8aab0d7098]
>> >>>>>>>>>>>>>>>>>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165)
>> >>>>>>>>>>>>>>>>>>>> [0x7f8aab9e89d5]
>> >>>>>>>>>>>>>>>>>>>> 6: (()+0x5e946) [0x7f8aab9e6946]
>> >>>>>>>>>>>>>>>>>>>> 7: (()+0x5e973) [0x7f8aab9e6973]
>> >>>>>>>>>>>>>>>>>>>> 8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>> >>>>>>>>>>>>>>>>>>>> 9: (ceph::__ceph_assert_fail(char const*, char
>> const*,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> char
>> >>>>>>>>>>>>>>>>>>>> const*)+0x1ef) [0xa8805f]
>> >>>>>>>>>>>>>>>>>>>> 10: (GenericObjectMap::list_objects(coll_t const&,
>> >>>>>>>>>>>>>>>>>>>> ghobject_t,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> std::vector<ghobject_t, std::allocator<ghobject_t>
>> >*,
>> >>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x474)
>> >>>>>>>>>>>>>>>>>>>> [0x98f774]
>> >>>>>>>>>>>>>>>>>>>> 11: (KeyValueStore::collection_list_partial(coll_t,
>> >>>>>>>>>>>>>>>>>>>> ghobject_t,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> snapid_t, std::vector<ghobject_t,
>> >>>>>>>>>>>>>>>>>>>> std::allocator<ghobject_t>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> *,
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x274) [0x8c5b54]
>> >>>>>>>>>>>>>>>>>>>> 12: (PGBackend::objects_list_partial(hobject_t
>> const&,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> int,
>> >>>>>>>>>>>>>>>>>>>> snapid_t,
>> >>>>>>>>>>>>>>>>>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>> >>>>>>>>>>>>>>>>>>>> hobject_t*)+0x1c9)
>> >>>>>>>>>>>>>>>>>>>> [0x862de9]
>> >>>>>>>>>>>>>>>>>>>> 13:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+
>> >>>>>>>>>>>>>>>>>>>> 0xea5)
>> >>>>>>>>>>>>>>>>>>>> [0x7f67f5]
>> >>>>>>>>>>>>>>>>>>>> 14:
>> >>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_op(std::tr1:
>> >>>>>>>>>>>>>>>>>>>> :shared_ptr<OpRequest>)+0x1f3)
>> >>>>>>>>>>>>>>>>>>>> [0x8177b3]
>> >>>>>>>>>>>>>>>>>>>> 15:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>> >>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>> >>>>>>>>>>>>>>>>>>>> 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>> >>>>>>>>>>>>>>>>>>>> std::tr1::shared_ptr<OpRequest>,
>> >>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x47d)
>> >>>>>>>>>>>>>>>>>>>> [0x62bf8d]
>> >>>>>>>>>>>>>>>>>>>> 17: (OSD::ShardedOpWQ::_process(unsigned int,
>> >>>>>>>>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>> >>>>>>>>>>>>>>>>>>>> 18:
>> >>>>>>>>>>>>>>>>>>>> (ShardedThreadPool::shardedthreadpool_worker(unsigned
>> >>>>>>>>>>>>>>>>>>>> int)+0x8cd)
>> >>>>>>>>>>>>>>>>>>>> [0xa776fd]
>> >>>>>>>>>>>>>>>>>>>> 19:
>> (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>> >>>>>>>>>>>>>>>>>>>> [0xa79980]
>> >>>>>>>>>>>>>>>>>>>> 20: (()+0x7df3) [0x7f8aac71fdf3]
>> >>>>>>>>>>>>>>>>>>>> 21: (clone()+0x6d) [0x7f8aab1963dd]
>> >>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS
>> >>>>>>>>>>>>>>>>>>>> <executable>`
>> >>>>>>>>>>>>>>>>>>>> is
>> >>>>>>>>>>>>>>>>>>>> needed
>> >>>>>>>>>>>>>>>>>>>> to
>> >>>>>>>>>>>>>>>>>>>> interpret this.
>> >>>>>>>>>>>>>>>>>>>>
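As the NOTE lines in the dump say, the raw frame addresses only become readable against the binary; a sketch of how that can be done (the path assumes the stock package layout, and addr2line needs the matching debuginfo installed):

    # disassemble with source interleaved, as the log suggests
    objdump -rdS /usr/bin/ceph-osd > ceph-osd.asm

    # or resolve a single frame, e.g. the top of the first trace
    addr2line -Cfe /usr/bin/ceph-osd 0x98f774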
>> >>>>>>>>>>>>>>>>>>>> I guess this has something to do with using the dev
>> >>>>>>>>>>>>>>>>>>>> Keyvaluestore?
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> Thanks!
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> Kenneth
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> _______________________________________________
>> >>>>>>>>>>>>>>>>>>>> ceph-users mailing list
>> >>>>>>>>>>>>>>>>>>>> ceph-users at lists.ceph.com
>> >>>>>>>>>>>>>>>>>>>>
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>> Best Regards,
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> Wheat
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> ----- End message from Haomai Wang <
>> haomaiwang at gmail.com>
>> >>>>>>>>>>>>>>>>>> -----
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Met vriendelijke groeten,
>> >>>>>>>>>>>>>>>>>> Kenneth Waegeman
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>> Best Regards,
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Wheat
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com
>> >
>> >>>>>>>>>>>>>>>> -----
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Met vriendelijke groeten,
>> >>>>>>>>>>>>>>>> Kenneth Waegeman
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>> Best Regards,
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Wheat
>> >>>>>>>>>>>>>>> _______________________________________________
>> >>>>>>>>>>>>>>> ceph-users mailing list
>> >>>>>>>>>>>>>>> ceph-users at lists.ceph.com
>> >>>>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> ----- End message from Sage Weil <sweil at redhat.com> -----
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> --
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Met vriendelijke groeten,
>> >
>> >
>> >>>>>>>>>>>>> Kenneth Waegeman
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> --
>> >>>>>>>>>>>> Best Regards,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Wheat
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com>
>> -----
>> >>>>>>>>>>>
>> >>>>>>>>>>> --
>> >>>>>>>>>>>
>> >>>>>>>>>>> Met vriendelijke groeten,
>> >>>>>>>>>>> Kenneth Waegeman
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>> Best Regards,
>> >>>>>>>>>>
>> >>>>>>>>>> Wheat
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>>
>> >>>>>>>>> Met vriendelijke groeten,
>> >>>>>>>>> Kenneth Waegeman
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Best Regards,
>> >>>>>>>>
>> >>>>>>>> Wheat
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Best Regards,
>> >>>>>>>
>> >>>>>>> Wheat
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>> >>>>>>
>> >>>>>> --
>> >>>>>>
>> >>>>>> Met vriendelijke groeten,
>> >>>>>> Kenneth Waegeman
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Best Regards,
>> >>>>>
>> >>>>> Wheat
>> >>>>>
>> >>>>
>> >>>>
>> >>>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>> >>>>
>> >>>> --
>> >>>>
>> >>>> Met vriendelijke groeten,
>> >>>> Kenneth Waegeman
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>>
>> >>> Best Regards,
>> >>>
>> >>> Wheat
>> >>
>> >>
>> >>
>> >> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>> >>
>> >> --
>> >>
>> >> Met vriendelijke groeten,
>> >> Kenneth Waegeman
>> >
>> >
>> >
>> > ----- End message from Kenneth Waegeman <Kenneth.Waegeman at UGent.be>
>> -----
>> >
>> >
>> > --
>> >
>> > Met vriendelijke groeten,
>> > Kenneth Waegeman
>> >
>>
>>
>>
>> --
>> Best Regards,
>>
>> Wheat
>>
>
>
>
> --
>
> Best Regards,
>
> Wheat
>



-- 

Best Regards,

Wheat

