I have found the root cause; it's a bug. When a chunky scrub happens, it iterates over the whole PG's objects, and each iteration only scans a few of them:

osd/PG.cc:3758

    ret = get_pgbackend()->objects_list_partial(
      start,
      cct->_conf->osd_scrub_chunk_min,
      cct->_conf->osd_scrub_chunk_max,
      0,
      &objects,
      &candidate_end);

candidate_end is the end of that object set, and it is used as the start position of the next scrub chunk. But it gets truncated:

osd/PG.cc:3777

    while (!boundary_found && objects.size() > 1) {
      hobject_t end = objects.back().get_boundary();
      objects.pop_back();
      if (objects.back().get_filestore_key() != end.get_filestore_key()) {
        candidate_end = end;
        boundary_found = true;
      }
    }

"end", an hobject_t that contains only the "hash" field, is assigned to candidate_end. So for the next scrub chunk, an hobject_t containing only the "hash" field is passed into get_pgbackend()->objects_list_partial(). That produces incorrect results for the KeyValueStore backend, because it uses strict key ordering in its "collection_list_partial" method. An hobject_t that contains only the "hash" field encodes to:

    1%e79s0_head!972F1B5D!!none!!!00000000000000000000!0!0

while the actual object's key is:

    1%e79s0_head!972F1B5D!!1!!!object-name!head

In other words, an hobject_t that contains only the "hash" field cannot be used to look up the actual object that has the same "hash" field. (A small standalone sketch comparing these two keys is appended at the very end of this mail, after the quoted thread.)

@sage, I scanned the usages of "get_boundary" and can't find a reason for it here. Could we simply remove it, so the code becomes:

    while (!boundary_found && objects.size() > 1) {
      hobject_t end = objects.back();
      objects.pop_back();
      if (objects.back().get_filestore_key() != end.get_filestore_key()) {
        candidate_end = end;
        boundary_found = true;
      }
    }

On Sat, Sep 6, 2014 at 10:44 PM, Haomai Wang <haomaiwang at gmail.com> wrote: > Sorry for the late message, I'm back from a short vacation. I would > like to try it this weekends. Thanks for your patient :-) > > On Wed, Sep 3, 2014 at 9:16 PM, Kenneth Waegeman > <Kenneth.Waegeman at ugent.be> wrote: > > I also can reproduce it on a new slightly different set up (also EC on KV > > and Cache) by running ceph pg scrub on a KV pg: this pg will then get the > > 'inconsistent' status > > > > > > > > ----- Message from Kenneth Waegeman <Kenneth.Waegeman at UGent.be> > --------- > > Date: Mon, 01 Sep 2014 16:28:31 +0200 > > From: Kenneth Waegeman <Kenneth.Waegeman at UGent.be> > > Subject: Re: ceph cluster inconsistency keyvaluestore > > To: Haomai Wang <haomaiwang at gmail.com> > > Cc: ceph-users at lists.ceph.com > > > > > >> Hi, > >> > >> > >> The cluster got installed with quattor, which uses ceph-deploy for > >> installation of daemons, writes the config file and installs the crushmap.
> >> I have 3 hosts, each 12 disks, having a large KV partition (3.6T) for > the > >> ECdata pool and a small cache partition (50G) for the cache > >> > >> I manually did this: > >> > >> ceph osd pool create cache 1024 1024 > >> ceph osd pool set cache size 2 > >> ceph osd pool set cache min_size 1 > >> ceph osd erasure-code-profile set profile11 k=8 m=3 > >> ruleset-failure-domain=osd > >> ceph osd pool create ecdata 128 128 erasure profile11 > >> ceph osd tier add ecdata cache > >> ceph osd tier cache-mode cache writeback > >> ceph osd tier set-overlay ecdata cache > >> ceph osd pool set cache hit_set_type bloom > >> ceph osd pool set cache hit_set_count 1 > >> ceph osd pool set cache hit_set_period 3600 > >> ceph osd pool set cache target_max_bytes $((280*1024*1024*1024)) > >> > >> (But the previous time I had the problem already without the cache part) > >> > >> > >> > >> Cluster live since 2014-08-29 15:34:16 > >> > >> Config file on host ceph001: > >> > >> [global] > >> auth_client_required = cephx > >> auth_cluster_required = cephx > >> auth_service_required = cephx > >> cluster_network = 10.143.8.0/24 > >> filestore_xattr_use_omap = 1 > >> fsid = 82766e04-585b-49a6-a0ac-c13d9ffd0a7d > >> mon_cluster_log_to_syslog = 1 > >> mon_host = ceph001.cubone.os, ceph002.cubone.os, ceph003.cubone.os > >> mon_initial_members = ceph001, ceph002, ceph003 > >> osd_crush_update_on_start = 0 > >> osd_journal_size = 10240 > >> osd_pool_default_min_size = 2 > >> osd_pool_default_pg_num = 512 > >> osd_pool_default_pgp_num = 512 > >> osd_pool_default_size = 3 > >> public_network = 10.141.8.0/24 > >> > >> [osd.11] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.13] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.15] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.17] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.19] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.21] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.23] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.25] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.3] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.5] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.7] > >> osd_objectstore = keyvaluestore-dev > >> > >> [osd.9] > >> osd_objectstore = keyvaluestore-dev > >> > >> > >> OSDs: > >> # id weight type name up/down reweight > >> -12 140.6 root default-cache > >> -9 46.87 host ceph001-cache > >> 2 3.906 osd.2 up 1 > >> 4 3.906 osd.4 up 1 > >> 6 3.906 osd.6 up 1 > >> 8 3.906 osd.8 up 1 > >> 10 3.906 osd.10 up 1 > >> 12 3.906 osd.12 up 1 > >> 14 3.906 osd.14 up 1 > >> 16 3.906 osd.16 up 1 > >> 18 3.906 osd.18 up 1 > >> 20 3.906 osd.20 up 1 > >> 22 3.906 osd.22 up 1 > >> 24 3.906 osd.24 up 1 > >> -10 46.87 host ceph002-cache > >> 28 3.906 osd.28 up 1 > >> 30 3.906 osd.30 up 1 > >> 32 3.906 osd.32 up 1 > >> 34 3.906 osd.34 up 1 > >> 36 3.906 osd.36 up 1 > >> 38 3.906 osd.38 up 1 > >> 40 3.906 osd.40 up 1 > >> 42 3.906 osd.42 up 1 > >> 44 3.906 osd.44 up 1 > >> 46 3.906 osd.46 up 1 > >> 48 3.906 osd.48 up 1 > >> 50 3.906 osd.50 up 1 > >> -11 46.87 host ceph003-cache > >> 54 3.906 osd.54 up 1 > >> 56 3.906 osd.56 up 1 > >> 58 3.906 osd.58 up 1 > >> 60 3.906 osd.60 up 1 > >> 62 3.906 osd.62 up 1 > >> 64 3.906 osd.64 up 1 > >> 66 3.906 osd.66 up 1 > >> 68 3.906 osd.68 up 1 > >> 70 3.906 osd.70 up 1 > >> 72 3.906 osd.72 up 1 > >> 74 3.906 osd.74 up 1 > >> 76 3.906 osd.76 up 1 > >> -8 140.6 root default-ec > >> -5 46.87 host ceph001-ec > >> 3 3.906 osd.3 up 1 > >> 5 3.906 osd.5 
up 1 > >> 7 3.906 osd.7 up 1 > >> 9 3.906 osd.9 up 1 > >> 11 3.906 osd.11 up 1 > >> 13 3.906 osd.13 up 1 > >> 15 3.906 osd.15 up 1 > >> 17 3.906 osd.17 up 1 > >> 19 3.906 osd.19 up 1 > >> 21 3.906 osd.21 up 1 > >> 23 3.906 osd.23 up 1 > >> 25 3.906 osd.25 up 1 > >> -6 46.87 host ceph002-ec > >> 29 3.906 osd.29 up 1 > >> 31 3.906 osd.31 up 1 > >> 33 3.906 osd.33 up 1 > >> 35 3.906 osd.35 up 1 > >> 37 3.906 osd.37 up 1 > >> 39 3.906 osd.39 up 1 > >> 41 3.906 osd.41 up 1 > >> 43 3.906 osd.43 up 1 > >> 45 3.906 osd.45 up 1 > >> 47 3.906 osd.47 up 1 > >> 49 3.906 osd.49 up 1 > >> 51 3.906 osd.51 up 1 > >> -7 46.87 host ceph003-ec > >> 55 3.906 osd.55 up 1 > >> 57 3.906 osd.57 up 1 > >> 59 3.906 osd.59 up 1 > >> 61 3.906 osd.61 up 1 > >> 63 3.906 osd.63 up 1 > >> 65 3.906 osd.65 up 1 > >> 67 3.906 osd.67 up 1 > >> 69 3.906 osd.69 up 1 > >> 71 3.906 osd.71 up 1 > >> 73 3.906 osd.73 up 1 > >> 75 3.906 osd.75 up 1 > >> 77 3.906 osd.77 up 1 > >> -4 23.44 root default-ssd > >> -1 7.812 host ceph001-ssd > >> 0 3.906 osd.0 up 1 > >> 1 3.906 osd.1 up 1 > >> -2 7.812 host ceph002-ssd > >> 26 3.906 osd.26 up 1 > >> 27 3.906 osd.27 up 1 > >> -3 7.812 host ceph003-ssd > >> 52 3.906 osd.52 up 1 > >> 53 3.906 osd.53 up 1 > >> > >> Cache OSDs are each 50G, the EC KV OSDS 3.6T, (ssds not used right now) > >> > >> Pools: > >> pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash > >> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool > stripe_width 0 > >> pool 1 'cache' replicated size 2 min_size 1 crush_ruleset 0 object_hash > >> rjenkins pg_num 1024 pgp_num 1024 last_change 174 flags > >> hashpspool,incomplete_clones tier_of 2 cache_mode writeback target_bytes > >> 300647710720 hit_set bloom{false_positive_probability: 0.05, > target_size: 0, > >> seed: 0} 3600s x1 stripe_width 0 > >> pool 2 'ecdata' erasure size 11 min_size 8 crush_ruleset 2 object_hash > >> rjenkins pg_num 128 pgp_num 128 last_change 170 lfor 170 flags > hashpspool > >> tiers 1 read_tier 1 write_tier 1 stripe_width 4096 > >> > >> > >> Crushmap: > >> # begin crush map > >> tunable choose_local_fallback_tries 0 > >> tunable choose_local_tries 0 > >> tunable choose_total_tries 50 > >> tunable chooseleaf_descend_once 1 > >> > >> # devices > >> device 0 osd.0 > >> device 1 osd.1 > >> device 2 osd.2 > >> device 3 osd.3 > >> device 4 osd.4 > >> device 5 osd.5 > >> device 6 osd.6 > >> device 7 osd.7 > >> device 8 osd.8 > >> device 9 osd.9 > >> device 10 osd.10 > >> device 11 osd.11 > >> device 12 osd.12 > >> device 13 osd.13 > >> device 14 osd.14 > >> device 15 osd.15 > >> device 16 osd.16 > >> device 17 osd.17 > >> device 18 osd.18 > >> device 19 osd.19 > >> device 20 osd.20 > >> device 21 osd.21 > >> device 22 osd.22 > >> device 23 osd.23 > >> device 24 osd.24 > >> device 25 osd.25 > >> device 26 osd.26 > >> device 27 osd.27 > >> device 28 osd.28 > >> device 29 osd.29 > >> device 30 osd.30 > >> device 31 osd.31 > >> device 32 osd.32 > >> device 33 osd.33 > >> device 34 osd.34 > >> device 35 osd.35 > >> device 36 osd.36 > >> device 37 osd.37 > >> device 38 osd.38 > >> device 39 osd.39 > >> device 40 osd.40 > >> device 41 osd.41 > >> device 42 osd.42 > >> device 43 osd.43 > >> device 44 osd.44 > >> device 45 osd.45 > >> device 46 osd.46 > >> device 47 osd.47 > >> device 48 osd.48 > >> device 49 osd.49 > >> device 50 osd.50 > >> device 51 osd.51 > >> device 52 osd.52 > >> device 53 osd.53 > >> device 54 osd.54 > >> device 55 osd.55 > >> device 56 osd.56 > >> device 57 osd.57 > >> device 58 osd.58 > >> device 59 osd.59 > >> 
device 60 osd.60 > >> device 61 osd.61 > >> device 62 osd.62 > >> device 63 osd.63 > >> device 64 osd.64 > >> device 65 osd.65 > >> device 66 osd.66 > >> device 67 osd.67 > >> device 68 osd.68 > >> device 69 osd.69 > >> device 70 osd.70 > >> device 71 osd.71 > >> device 72 osd.72 > >> device 73 osd.73 > >> device 74 osd.74 > >> device 75 osd.75 > >> device 76 osd.76 > >> device 77 osd.77 > >> > >> # types > >> type 0 osd > >> type 1 host > >> type 2 root > >> > >> # buckets > >> host ceph001-ssd { > >> id -1 # do not change unnecessarily > >> # weight 7.812 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.0 weight 3.906 > >> item osd.1 weight 3.906 > >> } > >> host ceph002-ssd { > >> id -2 # do not change unnecessarily > >> # weight 7.812 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.26 weight 3.906 > >> item osd.27 weight 3.906 > >> } > >> host ceph003-ssd { > >> id -3 # do not change unnecessarily > >> # weight 7.812 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.52 weight 3.906 > >> item osd.53 weight 3.906 > >> } > >> root default-ssd { > >> id -4 # do not change unnecessarily > >> # weight 23.436 > >> alg straw > >> hash 0 # rjenkins1 > >> item ceph001-ssd weight 7.812 > >> item ceph002-ssd weight 7.812 > >> item ceph003-ssd weight 7.812 > >> } > >> host ceph001-ec { > >> id -5 # do not change unnecessarily > >> # weight 46.872 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.3 weight 3.906 > >> item osd.5 weight 3.906 > >> item osd.7 weight 3.906 > >> item osd.9 weight 3.906 > >> item osd.11 weight 3.906 > >> item osd.13 weight 3.906 > >> item osd.15 weight 3.906 > >> item osd.17 weight 3.906 > >> item osd.19 weight 3.906 > >> item osd.21 weight 3.906 > >> item osd.23 weight 3.906 > >> item osd.25 weight 3.906 > >> } > >> host ceph002-ec { > >> id -6 # do not change unnecessarily > >> # weight 46.872 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.29 weight 3.906 > >> item osd.31 weight 3.906 > >> item osd.33 weight 3.906 > >> item osd.35 weight 3.906 > >> item osd.37 weight 3.906 > >> item osd.39 weight 3.906 > >> item osd.41 weight 3.906 > >> item osd.43 weight 3.906 > >> item osd.45 weight 3.906 > >> item osd.47 weight 3.906 > >> item osd.49 weight 3.906 > >> item osd.51 weight 3.906 > >> } > >> host ceph003-ec { > >> id -7 # do not change unnecessarily > >> # weight 46.872 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.55 weight 3.906 > >> item osd.57 weight 3.906 > >> item osd.59 weight 3.906 > >> item osd.61 weight 3.906 > >> item osd.63 weight 3.906 > >> item osd.65 weight 3.906 > >> item osd.67 weight 3.906 > >> item osd.69 weight 3.906 > >> item osd.71 weight 3.906 > >> item osd.73 weight 3.906 > >> item osd.75 weight 3.906 > >> item osd.77 weight 3.906 > >> } > >> root default-ec { > >> id -8 # do not change unnecessarily > >> # weight 140.616 > >> alg straw > >> hash 0 # rjenkins1 > >> item ceph001-ec weight 46.872 > >> item ceph002-ec weight 46.872 > >> item ceph003-ec weight 46.872 > >> } > >> host ceph001-cache { > >> id -9 # do not change unnecessarily > >> # weight 46.872 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.2 weight 3.906 > >> item osd.4 weight 3.906 > >> item osd.6 weight 3.906 > >> item osd.8 weight 3.906 > >> item osd.10 weight 3.906 > >> item osd.12 weight 3.906 > >> item osd.14 weight 3.906 > >> item osd.16 weight 3.906 > >> item osd.18 weight 3.906 > >> item osd.20 weight 3.906 > >> item osd.22 weight 3.906 > >> item osd.24 weight 3.906 > >> } > >> host ceph002-cache { > >> id -10 # do not change unnecessarily > 
>> # weight 46.872 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.28 weight 3.906 > >> item osd.30 weight 3.906 > >> item osd.32 weight 3.906 > >> item osd.34 weight 3.906 > >> item osd.36 weight 3.906 > >> item osd.38 weight 3.906 > >> item osd.40 weight 3.906 > >> item osd.42 weight 3.906 > >> item osd.44 weight 3.906 > >> item osd.46 weight 3.906 > >> item osd.48 weight 3.906 > >> item osd.50 weight 3.906 > >> } > >> host ceph003-cache { > >> id -11 # do not change unnecessarily > >> # weight 46.872 > >> alg straw > >> hash 0 # rjenkins1 > >> item osd.54 weight 3.906 > >> item osd.56 weight 3.906 > >> item osd.58 weight 3.906 > >> item osd.60 weight 3.906 > >> item osd.62 weight 3.906 > >> item osd.64 weight 3.906 > >> item osd.66 weight 3.906 > >> item osd.68 weight 3.906 > >> item osd.70 weight 3.906 > >> item osd.72 weight 3.906 > >> item osd.74 weight 3.906 > >> item osd.76 weight 3.906 > >> } > >> root default-cache { > >> id -12 # do not change unnecessarily > >> # weight 140.616 > >> alg straw > >> hash 0 # rjenkins1 > >> item ceph001-cache weight 46.872 > >> item ceph002-cache weight 46.872 > >> item ceph003-cache weight 46.872 > >> } > >> > >> # rules > >> rule cache { > >> ruleset 0 > >> type replicated > >> min_size 1 > >> max_size 10 > >> step take default-cache > >> step chooseleaf firstn 0 type host > >> step emit > >> } > >> rule metadata { > >> ruleset 1 > >> type replicated > >> min_size 1 > >> max_size 10 > >> step take default-ssd > >> step chooseleaf firstn 0 type host > >> step emit > >> } > >> rule ecdata { > >> ruleset 2 > >> type erasure > >> min_size 3 > >> max_size 20 > >> step set_chooseleaf_tries 5 > >> step take default-ec > >> step choose indep 0 type osd > >> step emit > >> } > >> > >> # end crush map > >> > >> The benchmarks I then did: > >> > >> ./benchrw 50000 > >> > >> benchrw: > >> /usr/bin/rados -p ecdata bench $1 write --no-cleanup > >> /usr/bin/rados -p ecdata bench $1 seq > >> /usr/bin/rados -p ecdata bench $1 seq & > >> /usr/bin/rados -p ecdata bench $1 write --no-cleanup > >> > >> > >> Srubbing errors started soon after that: 2014-08-31 10:59:14 > >> > >> > >> Please let me know if you need more information, and thanks ! > >> > >> Kenneth > >> > >> ----- Message from Haomai Wang <haomaiwang at gmail.com> --------- > >> Date: Mon, 1 Sep 2014 21:30:16 +0800 > >> From: Haomai Wang <haomaiwang at gmail.com> > >> Subject: Re: ceph cluster inconsistency keyvaluestore > >> To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be> > >> Cc: ceph-users at lists.ceph.com > >> > >> > >>> Hmm, could you please list your instructions including cluster existing > >>> time and all relevant ops? I want to reproduce it. > >>> > >>> > >>> On Mon, Sep 1, 2014 at 4:45 PM, Kenneth Waegeman > >>> <Kenneth.Waegeman at ugent.be> > >>> wrote: > >>> > >>>> Hi, > >>>> > >>>> I reinstalled the cluster with 0.84, and tried again running rados > bench > >>>> on a EC coded pool on keyvaluestore. 
> >>>> Nothing crashed this time, but when I check the status: > >>>> > >>>> health HEALTH_ERR 128 pgs inconsistent; 128 scrub errors; too few > >>>> pgs > >>>> per osd (15 < min 20) > >>>> monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0, > >>>> ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election > epoch > >>>> 8, quorum 0,1,2 ceph001,ceph002,ceph003 > >>>> osdmap e174: 78 osds: 78 up, 78 in > >>>> pgmap v147680: 1216 pgs, 3 pools, 14758 GB data, 3690 kobjects > >>>> 1753 GB used, 129 TB / 131 TB avail > >>>> 1088 active+clean > >>>> 128 active+clean+inconsistent > >>>> > >>>> the 128 inconsistent pgs are ALL the pgs of the EC KV store ( the > others > >>>> are on Filestore) > >>>> > >>>> The only thing I can see in the logs is that after the rados tests, it > >>>> start scrubbing, and for each KV pg I get something like this: > >>>> > >>>> 2014-08-31 11:14:09.050747 osd.11 10.141.8.180:6833/61098 4 : [ERR] > >>>> 2.3s0 > >>>> scrub stat mismatch, got 28164/29291 objects, 0/0 clones, 28164/29291 > >>>> dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, > >>>> 118128377856/122855358464 bytes. > >>>> > >>>> What could here be the problem? > >>>> Thanks again!! > >>>> > >>>> Kenneth > >>>> > >>>> > >>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> --------- > >>>> Date: Tue, 26 Aug 2014 17:11:43 +0800 > >>>> From: Haomai Wang <haomaiwang at gmail.com> > >>>> Subject: Re: ceph cluster inconsistency? > >>>> To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be> > >>>> Cc: ceph-users at lists.ceph.com > >>>> > >>>> > >>>> Hmm, it looks like you hit this > >>>> bug(http://tracker.ceph.com/issues/9223). > >>>>> > >>>>> > >>>>> Sorry for the late message, I forget that this fix is merged into > 0.84. > >>>>> > >>>>> Thanks for your patient :-) > >>>>> > >>>>> On Tue, Aug 26, 2014 at 4:39 PM, Kenneth Waegeman > >>>>> <Kenneth.Waegeman at ugent.be> wrote: > >>>>> > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> In the meantime I already tried with upgrading the cluster to 0.84, > to > >>>>>> see > >>>>>> if that made a difference, and it seems it does. > >>>>>> I can't reproduce the crashing osds by doing a 'rados -p ecdata ls' > >>>>>> anymore. 
> >>>>>> > >>>>>> But now the cluster detect it is inconsistent: > >>>>>> > >>>>>> cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d > >>>>>> health HEALTH_ERR 40 pgs inconsistent; 40 scrub errors; too > few > >>>>>> pgs > >>>>>> per osd (4 < min 20); mon.ceph002 low disk space > >>>>>> monmap e3: 3 mons at > >>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0, > >>>>>> ceph003=10.141.8.182:6789/0}, > >>>>>> election epoch 30, quorum 0,1,2 ceph001,ceph002,ceph003 > >>>>>> mdsmap e78951: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3 > >>>>>> up:standby > >>>>>> osdmap e145384: 78 osds: 78 up, 78 in > >>>>>> pgmap v247095: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects > >>>>>> 1502 GB used, 129 TB / 131 TB avail > >>>>>> 279 active+clean > >>>>>> 40 active+clean+inconsistent > >>>>>> 1 active+clean+scrubbing+deep > >>>>>> > >>>>>> > >>>>>> I tried to do ceph pg repair for all the inconsistent pgs: > >>>>>> > >>>>>> cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d > >>>>>> health HEALTH_ERR 40 pgs inconsistent; 1 pgs repair; 40 scrub > >>>>>> errors; > >>>>>> too few pgs per osd (4 < min 20); mon.ceph002 low disk space > >>>>>> monmap e3: 3 mons at > >>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0, > >>>>>> ceph003=10.141.8.182:6789/0}, > >>>>>> election epoch 30, quorum 0,1,2 ceph001,ceph002,ceph003 > >>>>>> mdsmap e79486: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3 > >>>>>> up:standby > >>>>>> osdmap e146452: 78 osds: 78 up, 78 in > >>>>>> pgmap v248520: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects > >>>>>> 1503 GB used, 129 TB / 131 TB avail > >>>>>> 279 active+clean > >>>>>> 39 active+clean+inconsistent > >>>>>> 1 active+clean+scrubbing+deep > >>>>>> 1 > active+clean+scrubbing+deep+inconsistent+repair > >>>>>> > >>>>>> I let it recovering through the night, but this morning the mons > were > >>>>>> all > >>>>>> gone, nothing to see in the log files.. The osds were all still up! > >>>>>> > >>>>>> cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d > >>>>>> health HEALTH_ERR 36 pgs inconsistent; 1 pgs repair; 36 scrub > >>>>>> errors; > >>>>>> too few pgs per osd (4 < min 20) > >>>>>> monmap e7: 3 mons at > >>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0, > >>>>>> ceph003=10.141.8.182:6789/0}, > >>>>>> election epoch 44, quorum 0,1,2 ceph001,ceph002,ceph003 > >>>>>> mdsmap e109481: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3 > >>>>>> up:standby > >>>>>> osdmap e203410: 78 osds: 78 up, 78 in > >>>>>> pgmap v331747: 320 pgs, 4 pools, 15251 GB data, 3812 kobjects > >>>>>> 1547 GB used, 129 TB / 131 TB avail > >>>>>> 1 active+clean+scrubbing+deep+inconsistent+repair > >>>>>> 284 active+clean > >>>>>> 35 active+clean+inconsistent > >>>>>> > >>>>>> I restarted the monitors now, I will let you know when I see > something > >>>>>> more.. > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> --------- > >>>>>> Date: Sun, 24 Aug 2014 12:51:41 +0800 > >>>>>> > >>>>>> From: Haomai Wang <haomaiwang at gmail.com> > >>>>>> Subject: Re: ceph cluster inconsistency? > >>>>>> To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>, > >>>>>> ceph-users at lists.ceph.com > >>>>>> > >>>>>> > >>>>>> It's really strange! I write a test program according the key > ordering > >>>>>>> > >>>>>>> you provided and parse the corresponding value. It's true! > >>>>>>> > >>>>>>> I have no idea now. 
If free, could you add this debug code to > >>>>>>> "src/os/GenericObjectMap.cc" and insert *before* "assert(start <= > >>>>>>> header.oid);": > >>>>>>> > >>>>>>> dout(0) << "start: " << start << "header.oid: " << header.oid << > >>>>>>> dendl; > >>>>>>> > >>>>>>> Then you need to recompile ceph-osd and run it again. The output > log > >>>>>>> can help it! > >>>>>>> > >>>>>>> On Tue, Aug 19, 2014 at 10:19 PM, Haomai Wang < > haomaiwang at gmail.com> > >>>>>>> wrote: > >>>>>>> > >>>>>>>> > >>>>>>>> I feel a little embarrassed, 1024 rows still true for me. > >>>>>>>> > >>>>>>>> I was wondering if you could give your all keys via > >>>>>>>> ""ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list > >>>>>>>> _GHOBJTOSEQ_ > keys.log?. > >>>>>>>> > >>>>>>>> thanks! > >>>>>>>> > >>>>>>>> On Tue, Aug 19, 2014 at 4:58 PM, Kenneth Waegeman > >>>>>>>> <Kenneth.Waegeman at ugent.be> wrote: > >>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> --------- > >>>>>>>>> Date: Tue, 19 Aug 2014 12:28:27 +0800 > >>>>>>>>> > >>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com> > >>>>>>>>> Subject: Re: ceph cluster inconsistency? > >>>>>>>>> To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be> > >>>>>>>>> Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On Mon, Aug 18, 2014 at 7:32 PM, Kenneth Waegeman > >>>>>>>>>> > >>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote: > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> > --------- > >>>>>>>>>>> Date: Mon, 18 Aug 2014 18:34:11 +0800 > >>>>>>>>>>> > >>>>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com> > >>>>>>>>>>> Subject: Re: ceph cluster inconsistency? > >>>>>>>>>>> To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be> > >>>>>>>>>>> Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On Mon, Aug 18, 2014 at 5:38 PM, Kenneth Waegeman > >>>>>>>>>>>> > >>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> Hi, > >>>>>>>>>>>>> > >>>>>>>>>>>>> I tried this after restarting the osd, but I guess that was > not > >>>>>>>>>>>>> the > >>>>>>>>>>>>> aim > >>>>>>>>>>>>> ( > >>>>>>>>>>>>> # ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list > >>>>>>>>>>>>> _GHOBJTOSEQ_| > >>>>>>>>>>>>> grep 6adb1100 -A 100 > >>>>>>>>>>>>> IO error: lock /var/lib/ceph/osd/ceph-67/current//LOCK: > >>>>>>>>>>>>> Resource > >>>>>>>>>>>>> temporarily > >>>>>>>>>>>>> unavailable > >>>>>>>>>>>>> tools/ceph_kvstore_tool.cc: In function > >>>>>>>>>>>>> 'StoreTool::StoreTool(const > >>>>>>>>>>>>> string&)' thread 7f8fecf7d780 time 2014-08-18 11:12:29.551780 > >>>>>>>>>>>>> tools/ceph_kvstore_tool.cc: 38: FAILED > >>>>>>>>>>>>> assert(!db_ptr->open(std::cerr)) > >>>>>>>>>>>>> .. > >>>>>>>>>>>>> ) > >>>>>>>>>>>>> > >>>>>>>>>>>>> When I run it after bringing the osd down, it takes a while, > >>>>>>>>>>>>> but > >>>>>>>>>>>>> it > >>>>>>>>>>>>> has > >>>>>>>>>>>>> no > >>>>>>>>>>>>> output.. (When running it without the grep, I'm getting a > huge > >>>>>>>>>>>>> list > >>>>>>>>>>>>> ) > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Oh, sorry for it! I made a mistake, the hash value(6adb1100) > >>>>>>>>>>>> will > >>>>>>>>>>>> be > >>>>>>>>>>>> reversed into leveldb. 
> >>>>>>>>>>>> So grep "benchmark_data_ceph001.cubone.os_5560_object789734" > >>>>>>>>>>>> should > >>>>>>>>>>>> be > >>>>>>>>>>>> help it. > >>>>>>>>>>>> > >>>>>>>>>>>> this gives: > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> [root at ceph003 ~]# ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/ > >>>>>>>>>>> current/ > >>>>>>>>>>> list > >>>>>>>>>>> _GHOBJTOSEQ_ | grep 5560_object789734 -A 100 > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011BDA6!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object789734!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011C027!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1330170!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011C6FD!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object227366!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011CB03!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1363631!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011CDF0!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1573957!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011D02C!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1019282!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E2B5!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1283563!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E511!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object273736!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E547!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1170628!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011EAAB!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object256335!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011F446!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1484196!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011FC59!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object884178!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001203F3!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object853746!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001208E3!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object36633!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00120B37!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1235337!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001210B6!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1661351!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001210CB!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object238126!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012184C!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object339943!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00121916!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1047094!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001219C1!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object520642!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> 
_GHOBJTOSEQ_:3%e0s0_head!001222BB!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object639565!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001223AA!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object231080!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012243C!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object858050!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012289C!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object241796!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122D28!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object7462!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122DFE!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object243798!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122EFC!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_8961_object109512!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001232D7!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object653973!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001234A3!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1378169!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123714!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object512925!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001237D9!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object23289!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123854!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1108852!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123971!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object704026!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123F75!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_8961_object250441!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124083!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object706178!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001240FA!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object316952!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012447D!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object538734!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001244D9!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object789215!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001247CD!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_8961_object265993!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124897!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object610597!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124BE4!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object691723!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124C9B!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1306135!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124E1D!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object520580!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> 
_GHOBJTOSEQ_:3%e0s0_head!0012534C!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object659767!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00125A81!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object184060!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00125E77!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1292867!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00126562!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1201410!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00126B34!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1657326!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127383!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1269787!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127396!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object500115!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001277F8!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object394932!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001279DD!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object252963!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127B40!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object936811!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127BAC!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1481773!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012894E!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object999885!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00128D05!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object943667!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012908A!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object212990!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129519!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object437596!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129716!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1585330!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129798!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object603505!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001299C9!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object808800!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129B7A!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object23193!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129B9A!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1158397!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012A932!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object542450!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012B77A!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_8961_object195480!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012BE8C!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object312911!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> 
_GHOBJTOSEQ_:3%e0s0_head!0012BF74!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1563783!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012C65C!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1123980!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012C6FE!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_3411_object913!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012CCAD!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object400863!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012CDBB!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object789667!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012D14B!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1020723!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012D95B!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_8961_object106293!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012E3C8!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1355526!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012E5B3!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1491348!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012F2BB!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_8961_object338872!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012F374!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1337264!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FBE5!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1512395!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FCE3!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_8961_object298610!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FEB6!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object120824!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001301CA!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object816326!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00130263!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object777163!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00130529!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1413173!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001317D9!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object809510!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0013204F!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object471416!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132400!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object695087!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132A19!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object591945!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132BF8!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object302000!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132F5B!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1645443!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> 
_GHOBJTOSEQ_:3%e0s0_head!00133B8B!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object761911!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0013433E!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object1467727!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134446!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object791960!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134678!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object677078!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134A96!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object254923!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001355D0!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_31461_object321528!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135690!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4919_object36935!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135B62!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object1228272!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135C72!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_4812_object2180!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135DEE!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object425705!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00136366!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object141569!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00136371!!3!!benchmark_data_ > >>>>>>>>>>> ceph001%ecubone%eos_5560_object564213!head > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> 100 rows seemed true for me. I found the min list objects is > 1024. > >>>>>>>>>> Please could you run > >>>>>>>>>> "ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list > >>>>>>>>>> _GHOBJTOSEQ_| grep 6adb1100 -A 1024" > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I got the output in attachment > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>>>> Or should I run this immediately after the osd is crashed, > >>>>>>>>>>>>> (because > >>>>>>>>>>>>> it > >>>>>>>>>>>>> maybe > >>>>>>>>>>>>> rebalanced? I did already restarted the cluster) > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> I don't know if it is related, but before I could all do > that, > >>>>>>>>>>>>> I > >>>>>>>>>>>>> had > >>>>>>>>>>>>> to > >>>>>>>>>>>>> fix > >>>>>>>>>>>>> something else: A monitor did run out if disk space, using > 8GB > >>>>>>>>>>>>> for > >>>>>>>>>>>>> his > >>>>>>>>>>>>> store.db folder (lot of sst files). Other monitors are also > >>>>>>>>>>>>> near > >>>>>>>>>>>>> that > >>>>>>>>>>>>> level. > >>>>>>>>>>>>> Never had that problem on previous setups before. I > recreated a > >>>>>>>>>>>>> monitor > >>>>>>>>>>>>> and > >>>>>>>>>>>>> now it uses 3.8GB. > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> It exists some duplicate data which needed to be compacted. > >>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> Another idea, maybe you can make KeyValueStore's stripe size > >>>>>>>>>>>> align > >>>>>>>>>>>> with EC stripe size. > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> How can I do that? 
Is there some documentation about that? > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> ceph --show-config | grep keyvaluestore > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> debug_keyvaluestore = 0/0 > >>>>>>>>>> keyvaluestore_queue_max_ops = 50 > >>>>>>>>>> keyvaluestore_queue_max_bytes = 104857600 > >>>>>>>>>> keyvaluestore_debug_check_backend = false > >>>>>>>>>> keyvaluestore_op_threads = 2 > >>>>>>>>>> keyvaluestore_op_thread_timeout = 60 > >>>>>>>>>> keyvaluestore_op_thread_suicide_timeout = 180 > >>>>>>>>>> keyvaluestore_default_strip_size = 4096 > >>>>>>>>>> keyvaluestore_max_expected_write_size = 16777216 > >>>>>>>>>> keyvaluestore_header_cache_size = 4096 > >>>>>>>>>> keyvaluestore_backend = leveldb > >>>>>>>>>> > >>>>>>>>>> keyvaluestore_default_strip_size is the wanted > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> I haven't think deeply and maybe I will try it later. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks! > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> Kenneth > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> ----- Message from Sage Weil <sweil at redhat.com> --------- > >>>>>>>>>>>>> Date: Fri, 15 Aug 2014 06:10:34 -0700 (PDT) > >>>>>>>>>>>>> From: Sage Weil <sweil at redhat.com> > >>>>>>>>>>>>> > >>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency? > >>>>>>>>>>>>> To: Haomai Wang <haomaiwang at gmail.com> > >>>>>>>>>>>>> Cc: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>, > >>>>>>>>>>>>> ceph-users at lists.ceph.com > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Fri, 15 Aug 2014, Haomai Wang wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Hi Kenneth, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I don't find valuable info in your logs, it lack of the > >>>>>>>>>>>>>>> necessary > >>>>>>>>>>>>>>> debug output when accessing crash code. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> But I scan the encode/decode implementation in > >>>>>>>>>>>>>>> GenericObjectMap > >>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>> find something bad. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> For example, two oid has same hash and their name is: > >>>>>>>>>>>>>>> A: "rb.data.123" > >>>>>>>>>>>>>>> B: "rb-123" > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> In ghobject_t compare level, A < B. But GenericObjectMap > >>>>>>>>>>>>>>> encode > >>>>>>>>>>>>>>> "." > >>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>> "%e", so the key in DB is: > >>>>>>>>>>>>>>> A: _GHOBJTOSEQ_:blah!51615000!!none!!rb%edata%e123!head > >>>>>>>>>>>>>>> B: _GHOBJTOSEQ_:blah!51615000!!none!!rb-123!head > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> A > B > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> And it seemed that the escape function is useless and > should > >>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>> disabled. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I'm not sure whether Kenneth's problem is touching this > bug. > >>>>>>>>>>>>>>> Because > >>>>>>>>>>>>>>> this scene only occur when the object set is very large and > >>>>>>>>>>>>>>> make > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>> two object has same hash value. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Kenneth, could you have time to run "ceph-kv-store > >>>>>>>>>>>>>>> [path-to-osd] > >>>>>>>>>>>>>>> list > >>>>>>>>>>>>>>> _GHOBJTOSEQ_| grep 6adb1100 -A 100". ceph-kv-store is a > debug > >>>>>>>>>>>>>>> tool > >>>>>>>>>>>>>>> which can be compiled from source. 
You can clone ceph repo > >>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>> run > >>>>>>>>>>>>>>> "./authongen.sh; ./configure; cd src; make > >>>>>>>>>>>>>>> ceph-kvstore-tool". > >>>>>>>>>>>>>>> "path-to-osd" should be "/var/lib/ceph/osd-[id]/current/". > >>>>>>>>>>>>>>> "6adb1100" > >>>>>>>>>>>>>>> is from your verbose log and the next 100 rows should know > >>>>>>>>>>>>>>> necessary > >>>>>>>>>>>>>>> infos. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> You can also get ceph-kvstore-tool from the 'ceph-tests' > >>>>>>>>>>>>>> package. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Hi sage, do you think we need to provided with upgrade > >>>>>>>>>>>>>> function > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>> fix > >>>>>>>>>>>>>>> it? > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Hmm, we might. This only affects the key/value encoding > >>>>>>>>>>>>>> right? > >>>>>>>>>>>>>> The > >>>>>>>>>>>>>> FileStore is using its own function to map these to file > >>>>>>>>>>>>>> names? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Can you open a ticket in the tracker for this? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks! > >>>>>>>>>>>>>> sage > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 7:36 PM, Kenneth Waegeman > >>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> > >>>>>>>>>>>>>>>> --------- > >>>>>>>>>>>>>>>> Date: Thu, 14 Aug 2014 19:11:55 +0800 > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com> > >>>>>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency? > >>>>>>>>>>>>>>>> To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Could you add config "debug_keyvaluestore = 20/20" to the > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> crashed > >>>>>>>>>>>>>>>>> osd > >>>>>>>>>>>>>>>>> and replay the command causing crash? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I would like to get more debug infos! Thanks. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I included the log in attachment! > >>>>>>>>>>>>>>>> Thanks! 
> >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 4:41 PM, Kenneth Waegeman > >>>>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I have: > >>>>>>>>>>>>>>>>>> osd_objectstore = keyvaluestore-dev > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> in the global section of my ceph.conf > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> [root at ceph002 ~]# ceph osd erasure-code-profile get > >>>>>>>>>>>>>>>>>> profile11 > >>>>>>>>>>>>>>>>>> directory=/usr/lib64/ceph/erasure-code > >>>>>>>>>>>>>>>>>> k=8 > >>>>>>>>>>>>>>>>>> m=3 > >>>>>>>>>>>>>>>>>> plugin=jerasure > >>>>>>>>>>>>>>>>>> ruleset-failure-domain=osd > >>>>>>>>>>>>>>>>>> technique=reed_sol_van > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> the ecdata pool has this as profile > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> pool 3 'ecdata' erasure size 11 min_size 8 > crush_ruleset 2 > >>>>>>>>>>>>>>>>>> object_hash > >>>>>>>>>>>>>>>>>> rjenkins pg_num 128 pgp_num 128 last_change 161 flags > >>>>>>>>>>>>>>>>>> hashpspool > >>>>>>>>>>>>>>>>>> stripe_width 4096 > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> ECrule in crushmap > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> rule ecdata { > >>>>>>>>>>>>>>>>>> ruleset 2 > >>>>>>>>>>>>>>>>>> type erasure > >>>>>>>>>>>>>>>>>> min_size 3 > >>>>>>>>>>>>>>>>>> max_size 20 > >>>>>>>>>>>>>>>>>> step set_chooseleaf_tries 5 > >>>>>>>>>>>>>>>>>> step take default-ec > >>>>>>>>>>>>>>>>>> step choose indep 0 type osd > >>>>>>>>>>>>>>>>>> step emit > >>>>>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>>>>> root default-ec { > >>>>>>>>>>>>>>>>>> id -8 # do not change unnecessarily > >>>>>>>>>>>>>>>>>> # weight 140.616 > >>>>>>>>>>>>>>>>>> alg straw > >>>>>>>>>>>>>>>>>> hash 0 # rjenkins1 > >>>>>>>>>>>>>>>>>> item ceph001-ec weight 46.872 > >>>>>>>>>>>>>>>>>> item ceph002-ec weight 46.872 > >>>>>>>>>>>>>>>>>> item ceph003-ec weight 46.872 > >>>>>>>>>>>>>>>>>> ... > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Cheers! > >>>>>>>>>>>>>>>>>> Kenneth > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> > >>>>>>>>>>>>>>>>>> --------- > >>>>>>>>>>>>>>>>>> Date: Thu, 14 Aug 2014 10:07:50 +0800 > >>>>>>>>>>>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com> > >>>>>>>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency? > >>>>>>>>>>>>>>>>>> To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be> > >>>>>>>>>>>>>>>>>> Cc: ceph-users <ceph-users at lists.ceph.com> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hi Kenneth, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Could you give your configuration related to EC and > >>>>>>>>>>>>>>>>>>> KeyValueStore? 
> >>>>>>>>>>>>>>>>>>> Not sure whether it's bug on KeyValueStore > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 12:06 AM, Kenneth Waegeman > >>>>>>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> I was doing some tests with rados bench on a Erasure > >>>>>>>>>>>>>>>>>>>> Coded > >>>>>>>>>>>>>>>>>>>> pool > >>>>>>>>>>>>>>>>>>>> (using > >>>>>>>>>>>>>>>>>>>> keyvaluestore-dev objectstore) on 0.83, and I see some > >>>>>>>>>>>>>>>>>>>> strangs > >>>>>>>>>>>>>>>>>>>> things: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> [root at ceph001 ~]# ceph status > >>>>>>>>>>>>>>>>>>>> cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d > >>>>>>>>>>>>>>>>>>>> health HEALTH_WARN too few pgs per osd (4 < min 20) > >>>>>>>>>>>>>>>>>>>> monmap e1: 3 mons at > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> {ceph001= > 10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0, > >>>>>>>>>>>>>>>>>>>> ceph003=10.141.8.182:6789/0}, > >>>>>>>>>>>>>>>>>>>> election epoch 6, quorum 0,1,2 ceph001,ceph002,ceph003 > >>>>>>>>>>>>>>>>>>>> mdsmap e116: 1/1/1 up > >>>>>>>>>>>>>>>>>>>> {0=ceph001.cubone.os=up:active}, > >>>>>>>>>>>>>>>>>>>> 2 > >>>>>>>>>>>>>>>>>>>> up:standby > >>>>>>>>>>>>>>>>>>>> osdmap e292: 78 osds: 78 up, 78 in > >>>>>>>>>>>>>>>>>>>> pgmap v48873: 320 pgs, 4 pools, 15366 GB data, > 3841 > >>>>>>>>>>>>>>>>>>>> kobjects > >>>>>>>>>>>>>>>>>>>> 1381 GB used, 129 TB / 131 TB avail > >>>>>>>>>>>>>>>>>>>> 320 active+clean > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> There is around 15T of data, but only 1.3 T usage. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> This is also visible in rados: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> [root at ceph001 ~]# rados df > >>>>>>>>>>>>>>>>>>>> pool name category KB > objects > >>>>>>>>>>>>>>>>>>>> clones > >>>>>>>>>>>>>>>>>>>> degraded unfound rd rd KB > >>>>>>>>>>>>>>>>>>>> wr > >>>>>>>>>>>>>>>>>>>> wr > >>>>>>>>>>>>>>>>>>>> KB > >>>>>>>>>>>>>>>>>>>> data - 0 > >>>>>>>>>>>>>>>>>>>> 0 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> 0 0 0 0 0 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> ecdata - 16113451009 > >>>>>>>>>>>>>>>>>>>> 3933959 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> 0 0 1 1 3935632 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> 16116850711 > >>>>>>>>>>>>>>>>>>>> metadata - 2 > >>>>>>>>>>>>>>>>>>>> 20 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> 0 0 33 36 21 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> 8 > >>>>>>>>>>>>>>>>>>>> rbd - 0 > >>>>>>>>>>>>>>>>>>>> 0 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> 0 0 0 0 0 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> total used 1448266016 3933979 > >>>>>>>>>>>>>>>>>>>> total avail 139400181016 > >>>>>>>>>>>>>>>>>>>> total space 140848447032 > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Another (related?) thing: if I do rados -p ecdata ls, > I > >>>>>>>>>>>>>>>>>>>> trigger > >>>>>>>>>>>>>>>>>>>> osd > >>>>>>>>>>>>>>>>>>>> shutdowns (each time): > >>>>>>>>>>>>>>>>>>>> I get a list followed by an error: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> ... 
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_8961_object243839
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object801983
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object856489
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_8961_object202232
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_4919_object33199
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object807797
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_4919_object74729
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object1264121
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object1318513
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object1202111
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object939107
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object729682
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object122915
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object76521
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object113261
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object575079
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object671042
>>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object381146
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2014-08-13 17:57:48.736150 7f65047b5700 0 -- 10.141.8.180:0/1023295 >> 10.141.8.182:6839/4471 pipe(0x7f64fc019b20 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f64fc019db0).fault
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> And I can see this in the log files:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -25> 2014-08-13 17:52:56.323908 7f8a97fa4700 1 -- 10.143.8.182:6827/64670 <== osd.57 10.141.8.182:0/15796 51 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0 (3227325175 0 0) 0xf475940 con 0xee89fa0
>>>>>>>>>>>>>>>>>>>> -24> 2014-08-13 17:52:56.323938 7f8a97fa4700 1 -- 10.143.8.182:6827/64670 --> 10.141.8.182:0/15796 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf815b00 con 0xee89fa0
>>>>>>>>>>>>>>>>>>>> -23> 2014-08-13 17:52:56.324078 7f8a997a7700 1 -- 10.141.8.182:6840/64670 <== osd.57 10.141.8.182:0/15796 51 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0 (3227325175 0 0) 0xf132bc0 con 0xee8a680
>>>>>>>>>>>>>>>>>>>> -22> 2014-08-13 17:52:56.324111 7f8a997a7700 1 -- 10.141.8.182:6840/64670 --> 10.141.8.182:0/15796 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf811a40 con 0xee8a680
>>>>>>>>>>>>>>>>>>>> -21> 2014-08-13 17:52:56.584461 7f8a997a7700 1 -- 10.141.8.182:6840/64670 <== osd.29 10.143.8.181:0/12142 47 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0 (3355887204 0 0) 0xf655940 con 0xee88b00
>>>>>>>>>>>>>>>>>>>> -20> 2014-08-13 17:52:56.584486 7f8a997a7700 1 -- 10.141.8.182:6840/64670 --> 10.143.8.181:0/12142 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf132bc0 con 0xee88b00
>>>>>>>>>>>>>>>>>>>> -19> 2014-08-13 17:52:56.584498 7f8a97fa4700 1 -- 10.143.8.182:6827/64670 <== osd.29 10.143.8.181:0/12142 47 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0 (3355887204 0 0) 0xf20e040 con 0xee886e0
>>>>>>>>>>>>>>>>>>>> -18> 2014-08-13 17:52:56.584526 7f8a97fa4700 1 -- 10.143.8.182:6827/64670 --> 10.143.8.181:0/12142 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf475940 con 0xee886e0
>>>>>>>>>>>>>>>>>>>> -17> 2014-08-13 17:52:56.594448 7f8a798c7700 1 -- 10.141.8.182:6839/64670 >> :/0 pipe(0xec15f00 sd=74 :6839 s=0 pgs=0 cs=0 l=0 c=0xee856a0).accept sd=74 10.141.8.180:47641/0
>>>>>>>>>>>>>>>>>>>> -16> 2014-08-13 17:52:56.594921 7f8a798c7700 1 -- 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 1 ==== osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220) v4 ==== 151+0+39 (1972163119 4174233976) 0xf3bca40 con 0xee856a0
>>>>>>>>>>>>>>>>>>>> -15> 2014-08-13 17:52:56.594957 7f8a798c7700 5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594874, event: header_read, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -14> 2014-08-13 17:52:56.594970 7f8a798c7700 5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594880, event: throttled, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -13> 2014-08-13 17:52:56.594978 7f8a798c7700 5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594917, event: all_read, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -12> 2014-08-13 17:52:56.594986 7f8a798c7700 5 -- op tracker -- , seq: 299, time: 0.000000, event: dispatched, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -11> 2014-08-13 17:52:56.595127 7f8a90795700 5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.595104, event: reached_pg, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -10> 2014-08-13 17:52:56.595159 7f8a90795700 5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.595153, event: started, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -9> 2014-08-13 17:52:56.602179 7f8a90795700 1 -- 10.141.8.182:6839/64670 --> 10.141.8.180:0/1018433 -- osd_op_reply(1 [pgls start_epoch 0] v164'30654 uv30654 ondisk = 0) v6 -- ?+0 0xec16180 con 0xee856a0
>>>>>>>>>>>>>>>>>>>> -8> 2014-08-13 17:52:56.602211 7f8a90795700 5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.602205, event: done, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -7> 2014-08-13 17:52:56.614839 7f8a798c7700 1 -- 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 2 ==== osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220) v4 ==== 151+0+89 (3460833343 2600845095) 0xf3bcec0 con 0xee856a0
>>>>>>>>>>>>>>>>>>>> -6> 2014-08-13 17:52:56.614864 7f8a798c7700 5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614789, event: header_read, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -5> 2014-08-13 17:52:56.614874 7f8a798c7700 5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614792, event: throttled, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -4> 2014-08-13 17:52:56.614884 7f8a798c7700 5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614835, event: all_read, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -3> 2014-08-13 17:52:56.614891 7f8a798c7700 5 -- op tracker -- , seq: 300, time: 0.000000, event: dispatched, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -2> 2014-08-13 17:52:56.614972 7f8a92f9a700 5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614958, event: reached_pg, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> -1> 2014-08-13 17:52:56.614993 7f8a92f9a700 5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614986, event: started, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>> 0> 2014-08-13 17:52:56.617087 7f8a92f9a700 -1 os/GenericObjectMap.cc: In function 'int GenericObjectMap::list_objects(const coll_t&, ghobject_t, int, std::vector<ghobject_t>*, ghobject_t*)' thread 7f8a92f9a700 time 2014-08-13 17:52:56.615073
>>>>>>>>>>>>>>>>>>>> os/GenericObjectMap.cc: 1118: FAILED assert(start <= header.oid)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>>>>>>>>>>>>>>>> 1: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x474) [0x98f774]
>>>>>>>>>>>>>>>>>>>> 2: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int, int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>>>>> 3: (PGBackend::objects_list_partial(hobject_t const&, int, int, snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*, hobject_t*)+0x1c9) [0x862de9]
>>>>>>>>>>>>>>>>>>>> 4: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) [0x7f67f5]
>>>>>>>>>>>>>>>>>>>> 5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
>>>>>>>>>>>>>>>>>>>> 6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>>>>> 7: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) [0x62bf8d]
>>>>>>>>>>>>>>>>>>>> 8: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>>>>> 9: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8cd) [0xa776fd]
>>>>>>>>>>>>>>>>>>>> 10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
>>>>>>>>>>>>>>>>>>>> 11: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>>>>> 12: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>>>>>>>>>>>>>>>> 1: /usr/bin/ceph-osd() [0x99b466]
>>>>>>>>>>>>>>>>>>>> 2: (()+0xf130) [0x7f8aac727130]
>>>>>>>>>>>>>>>>>>>> 3: (gsignal()+0x39) [0x7f8aab0d5989]
>>>>>>>>>>>>>>>>>>>> 4: (abort()+0x148) [0x7f8aab0d7098]
>>>>>>>>>>>>>>>>>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f8aab9e89d5]
>>>>>>>>>>>>>>>>>>>> 6: (()+0x5e946) [0x7f8aab9e6946]
>>>>>>>>>>>>>>>>>>>> 7: (()+0x5e973) [0x7f8aab9e6973]
>>>>>>>>>>>>>>>>>>>> 8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>>>>>>>>>>>>>>>>>>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1ef) [0xa8805f]
>>>>>>>>>>>>>>>>>>>> 10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x474) [0x98f774]
>>>>>>>>>>>>>>>>>>>> 11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int, int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>>>>> 12: (PGBackend::objects_list_partial(hobject_t const&, int, int, snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*, hobject_t*)+0x1c9) [0x862de9]
>>>>>>>>>>>>>>>>>>>> 13: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) [0x7f67f5]
>>>>>>>>>>>>>>>>>>>> 14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
>>>>>>>>>>>>>>>>>>>> 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>>>>> 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) [0x62bf8d]
>>>>>>>>>>>>>>>>>>>> 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>>>>> 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8cd) [0xa776fd]
>>>>>>>>>>>>>>>>>>>> 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
>>>>>>>>>>>>>>>>>>>> 20: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>>>>> 21: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --- begin dump of recent events ---
>>>>>>>>>>>>>>>>>>>> 0> 2014-08-13 17:52:56.714214 7f8a92f9a700 -1 *** Caught signal (Aborted) **
>>>>>>>>>>>>>>>>>>>> in thread 7f8a92f9a700
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>>>>>>>>>>>>>>>> 1: /usr/bin/ceph-osd() [0x99b466]
>>>>>>>>>>>>>>>>>>>> 2: (()+0xf130) [0x7f8aac727130]
>>>>>>>>>>>>>>>>>>>> 3: (gsignal()+0x39) [0x7f8aab0d5989]
>>>>>>>>>>>>>>>>>>>> 4: (abort()+0x148) [0x7f8aab0d7098]
>>>>>>>>>>>>>>>>>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f8aab9e89d5]
>>>>>>>>>>>>>>>>>>>> 6: (()+0x5e946) [0x7f8aab9e6946]
>>>>>>>>>>>>>>>>>>>> 7: (()+0x5e973) [0x7f8aab9e6973]
>>>>>>>>>>>>>>>>>>>> 8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>>>>>>>>>>>>>>>>>>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1ef) [0xa8805f]
>>>>>>>>>>>>>>>>>>>> 10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x474) [0x98f774]
>>>>>>>>>>>>>>>>>>>> 11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int, int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>>>>> 12: (PGBackend::objects_list_partial(hobject_t const&, int, int, snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*, hobject_t*)+0x1c9) [0x862de9]
>>>>>>>>>>>>>>>>>>>> 13: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) [0x7f67f5]
>>>>>>>>>>>>>>>>>>>> 14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
>>>>>>>>>>>>>>>>>>>> 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>>>>> 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) [0x62bf8d]
>>>>>>>>>>>>>>>>>>>> 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>>>>> 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8cd) [0xa776fd]
>>>>>>>>>>>>>>>>>>>> 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
>>>>>>>>>>>>>>>>>>>> 20: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>>>>> 21: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I guess this has something to do with using the dev Keyvaluestore?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kenneth
> > ----- End message from Kenneth Waegeman <Kenneth.Waegeman at UGent.be> -----
> >
> > --
> > Met vriendelijke groeten,
> > Kenneth Waegeman
>
> --
> Best Regards,
>
> Wheat

--
Best Regards,

Wheat