ceph cluster inconsistency keyvaluestore

Kenneth.Waegeman@xxxxxxxx (Kenneth Waegeman) · Wed, 03 Sep 2014 15:16:35 +0200



I also can reproduce it on a new slightly different set up (also EC on  
KV and Cache) by running ceph pg scrub on a KV pg: this pg will then  
get the 'inconsistent' status


----- Message from Kenneth Waegeman <Kenneth.Waegeman at UGent.be> ---------
    Date: Mon, 01 Sep 2014 16:28:31 +0200
    From: Kenneth Waegeman <Kenneth.Waegeman at UGent.be>
Subject: Re: ceph cluster inconsistency keyvaluestore
      To: Haomai Wang <haomaiwang at gmail.com>
      Cc: ceph-users at lists.ceph.com


> Hi,
>
>
> The cluster got installed with quattor, which uses ceph-deploy for  
> installation of daemons, writes the config file and installs the  
> crushmap.
> I have 3 hosts, each 12 disks, having a large KV partition (3.6T)  
> for the ECdata pool and a small cache partition (50G) for the cache
>
> I manually did this:
>
> ceph osd pool create cache 1024 1024
> ceph osd pool set cache size 2
> ceph osd pool set cache min_size 1
> ceph osd erasure-code-profile set profile11 k=8 m=3  
> ruleset-failure-domain=osd
> ceph osd pool create ecdata 128 128 erasure profile11
> ceph osd tier add ecdata cache
> ceph osd tier cache-mode cache writeback
> ceph osd tier set-overlay ecdata cache
> ceph osd pool set cache hit_set_type bloom
> ceph osd pool set cache hit_set_count 1
> ceph osd pool set cache hit_set_period 3600
> ceph osd pool set cache target_max_bytes $((280*1024*1024*1024))
>
> (But the previous time I had the problem already without the cache part)
>
>
>
> Cluster live since 2014-08-29 15:34:16
>
> Config file on host ceph001:
>
> [global]
> auth_client_required = cephx
> auth_cluster_required = cephx
> auth_service_required = cephx
> cluster_network = 10.143.8.0/24
> filestore_xattr_use_omap = 1
> fsid = 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
> mon_cluster_log_to_syslog = 1
> mon_host = ceph001.cubone.os, ceph002.cubone.os, ceph003.cubone.os
> mon_initial_members = ceph001, ceph002, ceph003
> osd_crush_update_on_start = 0
> osd_journal_size = 10240
> osd_pool_default_min_size = 2
> osd_pool_default_pg_num = 512
> osd_pool_default_pgp_num = 512
> osd_pool_default_size = 3
> public_network = 10.141.8.0/24
>
> [osd.11]
> osd_objectstore = keyvaluestore-dev
>
> [osd.13]
> osd_objectstore = keyvaluestore-dev
>
> [osd.15]
> osd_objectstore = keyvaluestore-dev
>
> [osd.17]
> osd_objectstore = keyvaluestore-dev
>
> [osd.19]
> osd_objectstore = keyvaluestore-dev
>
> [osd.21]
> osd_objectstore = keyvaluestore-dev
>
> [osd.23]
> osd_objectstore = keyvaluestore-dev
>
> [osd.25]
> osd_objectstore = keyvaluestore-dev
>
> [osd.3]
> osd_objectstore = keyvaluestore-dev
>
> [osd.5]
> osd_objectstore = keyvaluestore-dev
>
> [osd.7]
> osd_objectstore = keyvaluestore-dev
>
> [osd.9]
> osd_objectstore = keyvaluestore-dev
>
>
> OSDs:
> # id	weight	type name	up/down	reweight
> -12	140.6	root default-cache
> -9	46.87		host ceph001-cache
> 2	3.906			osd.2	up	1
> 4	3.906			osd.4	up	1
> 6	3.906			osd.6	up	1
> 8	3.906			osd.8	up	1
> 10	3.906			osd.10	up	1
> 12	3.906			osd.12	up	1
> 14	3.906			osd.14	up	1
> 16	3.906			osd.16	up	1
> 18	3.906			osd.18	up	1
> 20	3.906			osd.20	up	1
> 22	3.906			osd.22	up	1
> 24	3.906			osd.24	up	1
> -10	46.87		host ceph002-cache
> 28	3.906			osd.28	up	1
> 30	3.906			osd.30	up	1
> 32	3.906			osd.32	up	1
> 34	3.906			osd.34	up	1
> 36	3.906			osd.36	up	1
> 38	3.906			osd.38	up	1
> 40	3.906			osd.40	up	1
> 42	3.906			osd.42	up	1
> 44	3.906			osd.44	up	1
> 46	3.906			osd.46	up	1
> 48	3.906			osd.48	up	1
> 50	3.906			osd.50	up	1
> -11	46.87		host ceph003-cache
> 54	3.906			osd.54	up	1
> 56	3.906			osd.56	up	1
> 58	3.906			osd.58	up	1
> 60	3.906			osd.60	up	1
> 62	3.906			osd.62	up	1
> 64	3.906			osd.64	up	1
> 66	3.906			osd.66	up	1
> 68	3.906			osd.68	up	1
> 70	3.906			osd.70	up	1
> 72	3.906			osd.72	up	1
> 74	3.906			osd.74	up	1
> 76	3.906			osd.76	up	1
> -8	140.6	root default-ec
> -5	46.87		host ceph001-ec
> 3	3.906			osd.3	up	1
> 5	3.906			osd.5	up	1
> 7	3.906			osd.7	up	1
> 9	3.906			osd.9	up	1
> 11	3.906			osd.11	up	1
> 13	3.906			osd.13	up	1
> 15	3.906			osd.15	up	1
> 17	3.906			osd.17	up	1
> 19	3.906			osd.19	up	1
> 21	3.906			osd.21	up	1
> 23	3.906			osd.23	up	1
> 25	3.906			osd.25	up	1
> -6	46.87		host ceph002-ec
> 29	3.906			osd.29	up	1
> 31	3.906			osd.31	up	1
> 33	3.906			osd.33	up	1
> 35	3.906			osd.35	up	1
> 37	3.906			osd.37	up	1
> 39	3.906			osd.39	up	1
> 41	3.906			osd.41	up	1
> 43	3.906			osd.43	up	1
> 45	3.906			osd.45	up	1
> 47	3.906			osd.47	up	1
> 49	3.906			osd.49	up	1
> 51	3.906			osd.51	up	1
> -7	46.87		host ceph003-ec
> 55	3.906			osd.55	up	1
> 57	3.906			osd.57	up	1
> 59	3.906			osd.59	up	1
> 61	3.906			osd.61	up	1
> 63	3.906			osd.63	up	1
> 65	3.906			osd.65	up	1
> 67	3.906			osd.67	up	1
> 69	3.906			osd.69	up	1
> 71	3.906			osd.71	up	1
> 73	3.906			osd.73	up	1
> 75	3.906			osd.75	up	1
> 77	3.906			osd.77	up	1
> -4	23.44	root default-ssd
> -1	7.812		host ceph001-ssd
> 0	3.906			osd.0	up	1
> 1	3.906			osd.1	up	1
> -2	7.812		host ceph002-ssd
> 26	3.906			osd.26	up	1
> 27	3.906			osd.27	up	1
> -3	7.812		host ceph003-ssd
> 52	3.906			osd.52	up	1
> 53	3.906			osd.53	up	1
>
> Cache OSDs are each 50G, the EC KV OSDS 3.6T, (ssds not used right now)
>
> Pools:
> pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0  
> object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags  
> hashpspool stripe_width 0
> pool 1 'cache' replicated size 2 min_size 1 crush_ruleset 0  
> object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 174 flags  
> hashpspool,incomplete_clones tier_of 2 cache_mode writeback  
> target_bytes 300647710720 hit_set bloom{false_positive_probability:  
> 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0
> pool 2 'ecdata' erasure size 11 min_size 8 crush_ruleset 2  
> object_hash rjenkins pg_num 128 pgp_num 128 last_change 170 lfor 170  
> flags hashpspool tiers 1 read_tier 1 write_tier 1 stripe_width 4096
>
>
> Crushmap:
> # begin crush map
> tunable choose_local_fallback_tries 0
> tunable choose_local_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 osd.4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
> device 9 osd.9
> device 10 osd.10
> device 11 osd.11
> device 12 osd.12
> device 13 osd.13
> device 14 osd.14
> device 15 osd.15
> device 16 osd.16
> device 17 osd.17
> device 18 osd.18
> device 19 osd.19
> device 20 osd.20
> device 21 osd.21
> device 22 osd.22
> device 23 osd.23
> device 24 osd.24
> device 25 osd.25
> device 26 osd.26
> device 27 osd.27
> device 28 osd.28
> device 29 osd.29
> device 30 osd.30
> device 31 osd.31
> device 32 osd.32
> device 33 osd.33
> device 34 osd.34
> device 35 osd.35
> device 36 osd.36
> device 37 osd.37
> device 38 osd.38
> device 39 osd.39
> device 40 osd.40
> device 41 osd.41
> device 42 osd.42
> device 43 osd.43
> device 44 osd.44
> device 45 osd.45
> device 46 osd.46
> device 47 osd.47
> device 48 osd.48
> device 49 osd.49
> device 50 osd.50
> device 51 osd.51
> device 52 osd.52
> device 53 osd.53
> device 54 osd.54
> device 55 osd.55
> device 56 osd.56
> device 57 osd.57
> device 58 osd.58
> device 59 osd.59
> device 60 osd.60
> device 61 osd.61
> device 62 osd.62
> device 63 osd.63
> device 64 osd.64
> device 65 osd.65
> device 66 osd.66
> device 67 osd.67
> device 68 osd.68
> device 69 osd.69
> device 70 osd.70
> device 71 osd.71
> device 72 osd.72
> device 73 osd.73
> device 74 osd.74
> device 75 osd.75
> device 76 osd.76
> device 77 osd.77
>
> # types
> type 0 osd
> type 1 host
> type 2 root
>
> # buckets
> host ceph001-ssd {
> 	id -1		# do not change unnecessarily
> 	# weight 7.812
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.0 weight 3.906
> 	item osd.1 weight 3.906
> }
> host ceph002-ssd {
> 	id -2		# do not change unnecessarily
> 	# weight 7.812
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.26 weight 3.906
> 	item osd.27 weight 3.906
> }
> host ceph003-ssd {
> 	id -3		# do not change unnecessarily
> 	# weight 7.812
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.52 weight 3.906
> 	item osd.53 weight 3.906
> }
> root default-ssd {
> 	id -4		# do not change unnecessarily
> 	# weight 23.436
> 	alg straw
> 	hash 0	# rjenkins1
> 	item ceph001-ssd weight 7.812
> 	item ceph002-ssd weight 7.812
> 	item ceph003-ssd weight 7.812
> }
> host ceph001-ec {
> 	id -5		# do not change unnecessarily
> 	# weight 46.872
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.3 weight 3.906
> 	item osd.5 weight 3.906
> 	item osd.7 weight 3.906
> 	item osd.9 weight 3.906
> 	item osd.11 weight 3.906
> 	item osd.13 weight 3.906
> 	item osd.15 weight 3.906
> 	item osd.17 weight 3.906
> 	item osd.19 weight 3.906
> 	item osd.21 weight 3.906
> 	item osd.23 weight 3.906
> 	item osd.25 weight 3.906
> }
> host ceph002-ec {
> 	id -6		# do not change unnecessarily
> 	# weight 46.872
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.29 weight 3.906
> 	item osd.31 weight 3.906
> 	item osd.33 weight 3.906
> 	item osd.35 weight 3.906
> 	item osd.37 weight 3.906
> 	item osd.39 weight 3.906
> 	item osd.41 weight 3.906
> 	item osd.43 weight 3.906
> 	item osd.45 weight 3.906
> 	item osd.47 weight 3.906
> 	item osd.49 weight 3.906
> 	item osd.51 weight 3.906
> }
> host ceph003-ec {
> 	id -7		# do not change unnecessarily
> 	# weight 46.872
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.55 weight 3.906
> 	item osd.57 weight 3.906
> 	item osd.59 weight 3.906
> 	item osd.61 weight 3.906
> 	item osd.63 weight 3.906
> 	item osd.65 weight 3.906
> 	item osd.67 weight 3.906
> 	item osd.69 weight 3.906
> 	item osd.71 weight 3.906
> 	item osd.73 weight 3.906
> 	item osd.75 weight 3.906
> 	item osd.77 weight 3.906
> }
> root default-ec {
> 	id -8		# do not change unnecessarily
> 	# weight 140.616
> 	alg straw
> 	hash 0	# rjenkins1
> 	item ceph001-ec weight 46.872
> 	item ceph002-ec weight 46.872
> 	item ceph003-ec weight 46.872
> }
> host ceph001-cache {
> 	id -9		# do not change unnecessarily
> 	# weight 46.872
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.2 weight 3.906
> 	item osd.4 weight 3.906
> 	item osd.6 weight 3.906
> 	item osd.8 weight 3.906
> 	item osd.10 weight 3.906
> 	item osd.12 weight 3.906
> 	item osd.14 weight 3.906
> 	item osd.16 weight 3.906
> 	item osd.18 weight 3.906
> 	item osd.20 weight 3.906
> 	item osd.22 weight 3.906
> 	item osd.24 weight 3.906
> }
> host ceph002-cache {
> 	id -10		# do not change unnecessarily
> 	# weight 46.872
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.28 weight 3.906
> 	item osd.30 weight 3.906
> 	item osd.32 weight 3.906
> 	item osd.34 weight 3.906
> 	item osd.36 weight 3.906
> 	item osd.38 weight 3.906
> 	item osd.40 weight 3.906
> 	item osd.42 weight 3.906
> 	item osd.44 weight 3.906
> 	item osd.46 weight 3.906
> 	item osd.48 weight 3.906
> 	item osd.50 weight 3.906
> }
> host ceph003-cache {
> 	id -11		# do not change unnecessarily
> 	# weight 46.872
> 	alg straw
> 	hash 0	# rjenkins1
> 	item osd.54 weight 3.906
> 	item osd.56 weight 3.906
> 	item osd.58 weight 3.906
> 	item osd.60 weight 3.906
> 	item osd.62 weight 3.906
> 	item osd.64 weight 3.906
> 	item osd.66 weight 3.906
> 	item osd.68 weight 3.906
> 	item osd.70 weight 3.906
> 	item osd.72 weight 3.906
> 	item osd.74 weight 3.906
> 	item osd.76 weight 3.906
> }
> root default-cache {
> 	id -12		# do not change unnecessarily
> 	# weight 140.616
> 	alg straw
> 	hash 0	# rjenkins1
> 	item ceph001-cache weight 46.872
> 	item ceph002-cache weight 46.872
> 	item ceph003-cache weight 46.872
> }
>
> # rules
> rule cache {
> 	ruleset 0
> 	type replicated
> 	min_size 1
> 	max_size 10
> 	step take default-cache
> 	step chooseleaf firstn 0 type host
> 	step emit
> }
> rule metadata {
> 	ruleset 1
> 	type replicated
> 	min_size 1
> 	max_size 10
> 	step take default-ssd
> 	step chooseleaf firstn 0 type host
> 	step emit
> }
> rule ecdata {
> 	ruleset 2
> 	type erasure
> 	min_size 3
> 	max_size 20
> 	step set_chooseleaf_tries 5
> 	step take default-ec
> 	step choose indep 0 type osd
> 	step emit
> }
>
> # end crush map
>
> The benchmarks I then did:
>
> ./benchrw 50000
>
> benchrw:
> /usr/bin/rados -p ecdata bench $1 write --no-cleanup
> /usr/bin/rados -p ecdata bench $1 seq
> /usr/bin/rados -p ecdata bench $1 seq &
> /usr/bin/rados -p ecdata bench $1 write --no-cleanup
>
>
> Srubbing errors started soon after that: 2014-08-31 10:59:14
>
>
> Please let me know if you need more information, and thanks !
>
> Kenneth
>
> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>    Date: Mon, 1 Sep 2014 21:30:16 +0800
>    From: Haomai Wang <haomaiwang at gmail.com>
> Subject: Re: ceph cluster inconsistency keyvaluestore
>      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>      Cc: ceph-users at lists.ceph.com
>
>
>> Hmm, could you please list your instructions including cluster existing
>> time and all relevant ops? I want to reproduce it.
>>
>>
>> On Mon, Sep 1, 2014 at 4:45 PM, Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>> wrote:
>>
>>> Hi,
>>>
>>> I reinstalled the cluster with 0.84, and tried again running rados bench
>>> on a EC coded pool on keyvaluestore.
>>> Nothing crashed this time, but when I check the status:
>>>
>>>     health HEALTH_ERR 128 pgs inconsistent; 128 scrub errors; too few pgs
>>> per osd (15 < min 20)
>>>     monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0,
>>> ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election epoch
>>> 8, quorum 0,1,2 ceph001,ceph002,ceph003
>>>     osdmap e174: 78 osds: 78 up, 78 in
>>>      pgmap v147680: 1216 pgs, 3 pools, 14758 GB data, 3690 kobjects
>>>            1753 GB used, 129 TB / 131 TB avail
>>>                1088 active+clean
>>>                 128 active+clean+inconsistent
>>>
>>> the 128 inconsistent pgs are ALL the pgs of the EC KV store ( the others
>>> are on Filestore)
>>>
>>> The only thing I can see in the logs is that after the rados tests, it
>>> start scrubbing, and for each KV pg I get something like this:
>>>
>>> 2014-08-31 11:14:09.050747 osd.11 10.141.8.180:6833/61098 4 : [ERR] 2.3s0
>>> scrub stat mismatch, got 28164/29291 objects, 0/0 clones, 28164/29291
>>> dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts,
>>> 118128377856/122855358464 bytes.
>>>
>>> What could here be the problem?
>>> Thanks again!!
>>>
>>> Kenneth
>>>
>>>
>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>>   Date: Tue, 26 Aug 2014 17:11:43 +0800
>>>   From: Haomai Wang <haomaiwang at gmail.com>
>>> Subject: Re: ceph cluster inconsistency?
>>>     To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>     Cc: ceph-users at lists.ceph.com
>>>
>>>
>>> Hmm, it looks like you hit this bug(http://tracker.ceph.com/issues/9223).
>>>>
>>>> Sorry for the late message, I forget that this fix is merged into 0.84.
>>>>
>>>> Thanks for your patient :-)
>>>>
>>>> On Tue, Aug 26, 2014 at 4:39 PM, Kenneth Waegeman
>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> In the meantime I already tried with upgrading the cluster to 0.84, to
>>>>> see
>>>>> if that made a difference, and it seems it does.
>>>>> I can't reproduce the crashing osds by doing a 'rados -p ecdata ls'
>>>>> anymore.
>>>>>
>>>>> But now the cluster detect it is inconsistent:
>>>>>
>>>>>      cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>>>>       health HEALTH_ERR 40 pgs inconsistent; 40 scrub errors; too few
>>>>> pgs
>>>>> per osd (4 < min 20); mon.ceph002 low disk space
>>>>>       monmap e3: 3 mons at
>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,
>>>>> ceph003=10.141.8.182:6789/0},
>>>>> election epoch 30, quorum 0,1,2 ceph001,ceph002,ceph003
>>>>>       mdsmap e78951: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3
>>>>> up:standby
>>>>>       osdmap e145384: 78 osds: 78 up, 78 in
>>>>>        pgmap v247095: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>>>>>              1502 GB used, 129 TB / 131 TB avail
>>>>>                   279 active+clean
>>>>>                    40 active+clean+inconsistent
>>>>>                     1 active+clean+scrubbing+deep
>>>>>
>>>>>
>>>>> I tried to do ceph pg repair for all the inconsistent pgs:
>>>>>
>>>>>      cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>>>>       health HEALTH_ERR 40 pgs inconsistent; 1 pgs repair; 40 scrub
>>>>> errors;
>>>>> too few pgs per osd (4 < min 20); mon.ceph002 low disk space
>>>>>       monmap e3: 3 mons at
>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,
>>>>> ceph003=10.141.8.182:6789/0},
>>>>> election epoch 30, quorum 0,1,2 ceph001,ceph002,ceph003
>>>>>       mdsmap e79486: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3
>>>>> up:standby
>>>>>       osdmap e146452: 78 osds: 78 up, 78 in
>>>>>        pgmap v248520: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>>>>>              1503 GB used, 129 TB / 131 TB avail
>>>>>                   279 active+clean
>>>>>                    39 active+clean+inconsistent
>>>>>                     1 active+clean+scrubbing+deep
>>>>>                     1 active+clean+scrubbing+deep+inconsistent+repair
>>>>>
>>>>> I let it recovering through the night, but this morning the mons were all
>>>>> gone, nothing to see in the log files.. The osds were all still up!
>>>>>
>>>>>    cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>>>>     health HEALTH_ERR 36 pgs inconsistent; 1 pgs repair; 36 scrub
>>>>> errors;
>>>>> too few pgs per osd (4 < min 20)
>>>>>     monmap e7: 3 mons at
>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,
>>>>> ceph003=10.141.8.182:6789/0},
>>>>> election epoch 44, quorum 0,1,2 ceph001,ceph002,ceph003
>>>>>     mdsmap e109481: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3
>>>>> up:standby
>>>>>     osdmap e203410: 78 osds: 78 up, 78 in
>>>>>      pgmap v331747: 320 pgs, 4 pools, 15251 GB data, 3812 kobjects
>>>>>            1547 GB used, 129 TB / 131 TB avail
>>>>>                   1 active+clean+scrubbing+deep+inconsistent+repair
>>>>>                 284 active+clean
>>>>>                  35 active+clean+inconsistent
>>>>>
>>>>> I restarted the monitors now, I will let you know when I see something
>>>>> more..
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>>>>     Date: Sun, 24 Aug 2014 12:51:41 +0800
>>>>>
>>>>>     From: Haomai Wang <haomaiwang at gmail.com>
>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>       To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>,
>>>>> ceph-users at lists.ceph.com
>>>>>
>>>>>
>>>>> It's really strange! I write a test program according the key ordering
>>>>>> you provided and parse the corresponding value. It's true!
>>>>>>
>>>>>> I have no idea now. If free, could you add this debug code to
>>>>>> "src/os/GenericObjectMap.cc" and insert *before* "assert(start <=
>>>>>> header.oid);":
>>>>>>
>>>>>>  dout(0) << "start: " << start << "header.oid: " << header.oid <<
>>>>>> dendl;
>>>>>>
>>>>>> Then you need to recompile ceph-osd and run it again. The output log
>>>>>> can help it!
>>>>>>
>>>>>> On Tue, Aug 19, 2014 at 10:19 PM, Haomai Wang <haomaiwang at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> I feel a little embarrassed, 1024 rows still true for me.
>>>>>>>
>>>>>>> I was wondering if you could give your all keys via
>>>>>>> ""ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list
>>>>>>> _GHOBJTOSEQ_ > keys.log?.
>>>>>>>
>>>>>>> thanks!
>>>>>>>
>>>>>>> On Tue, Aug 19, 2014 at 4:58 PM, Kenneth Waegeman
>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>>>>>>> Date: Tue, 19 Aug 2014 12:28:27 +0800
>>>>>>>>
>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com>
>>>>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>>>>   To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>>>>>>   Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Aug 18, 2014 at 7:32 PM, Kenneth Waegeman
>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>>>>>>>>> Date: Mon, 18 Aug 2014 18:34:11 +0800
>>>>>>>>>>
>>>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>>>>>>   To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>>>>>>>>   Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 18, 2014 at 5:38 PM, Kenneth Waegeman
>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I tried this after restarting the osd, but I guess that was not
>>>>>>>>>>>> the
>>>>>>>>>>>> aim
>>>>>>>>>>>> (
>>>>>>>>>>>> # ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list
>>>>>>>>>>>> _GHOBJTOSEQ_|
>>>>>>>>>>>> grep 6adb1100 -A 100
>>>>>>>>>>>> IO error: lock /var/lib/ceph/osd/ceph-67/current//LOCK: Resource
>>>>>>>>>>>> temporarily
>>>>>>>>>>>> unavailable
>>>>>>>>>>>> tools/ceph_kvstore_tool.cc: In function
>>>>>>>>>>>> 'StoreTool::StoreTool(const
>>>>>>>>>>>> string&)' thread 7f8fecf7d780 time 2014-08-18 11:12:29.551780
>>>>>>>>>>>> tools/ceph_kvstore_tool.cc: 38: FAILED
>>>>>>>>>>>> assert(!db_ptr->open(std::cerr))
>>>>>>>>>>>> ..
>>>>>>>>>>>> )
>>>>>>>>>>>>
>>>>>>>>>>>> When I run it after bringing the osd down, it takes a while, but
>>>>>>>>>>>> it
>>>>>>>>>>>> has
>>>>>>>>>>>> no
>>>>>>>>>>>> output.. (When running it without the grep, I'm getting a huge
>>>>>>>>>>>> list
>>>>>>>>>>>> )
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Oh, sorry for it! I made a mistake, the hash value(6adb1100) will
>>>>>>>>>>> be
>>>>>>>>>>> reversed into leveldb.
>>>>>>>>>>> So grep "benchmark_data_ceph001.cubone.os_5560_object789734"
>>>>>>>>>>> should
>>>>>>>>>>> be
>>>>>>>>>>> help it.
>>>>>>>>>>>
>>>>>>>>>>> this gives:
>>>>>>>>>>
>>>>>>>>>> [root at ceph003 ~]# ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/
>>>>>>>>>> current/
>>>>>>>>>> list
>>>>>>>>>> _GHOBJTOSEQ_ | grep 5560_object789734 -A 100
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011BDA6!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object789734!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011C027!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1330170!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011C6FD!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object227366!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011CB03!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1363631!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011CDF0!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1573957!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011D02C!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1019282!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E2B5!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1283563!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E511!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object273736!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E547!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1170628!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011EAAB!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object256335!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011F446!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1484196!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011FC59!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object884178!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001203F3!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object853746!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001208E3!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object36633!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00120B37!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1235337!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001210B6!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1661351!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001210CB!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object238126!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012184C!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object339943!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00121916!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1047094!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001219C1!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object520642!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001222BB!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object639565!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001223AA!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object231080!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012243C!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object858050!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012289C!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object241796!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122D28!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object7462!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122DFE!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object243798!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122EFC!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_8961_object109512!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001232D7!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object653973!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001234A3!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1378169!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123714!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object512925!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001237D9!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object23289!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123854!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1108852!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123971!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object704026!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123F75!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_8961_object250441!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124083!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object706178!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001240FA!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object316952!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012447D!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object538734!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001244D9!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object789215!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001247CD!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_8961_object265993!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124897!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object610597!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124BE4!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object691723!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124C9B!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1306135!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124E1D!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object520580!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012534C!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object659767!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00125A81!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object184060!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00125E77!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1292867!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00126562!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1201410!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00126B34!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1657326!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127383!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1269787!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127396!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object500115!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001277F8!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object394932!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001279DD!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object252963!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127B40!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object936811!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127BAC!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1481773!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012894E!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object999885!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00128D05!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object943667!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012908A!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object212990!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129519!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object437596!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129716!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1585330!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129798!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object603505!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001299C9!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object808800!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129B7A!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object23193!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129B9A!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1158397!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012A932!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object542450!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012B77A!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_8961_object195480!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012BE8C!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object312911!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012BF74!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1563783!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012C65C!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1123980!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012C6FE!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_3411_object913!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012CCAD!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object400863!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012CDBB!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object789667!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012D14B!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1020723!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012D95B!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_8961_object106293!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012E3C8!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1355526!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012E5B3!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1491348!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012F2BB!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_8961_object338872!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012F374!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1337264!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FBE5!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1512395!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FCE3!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_8961_object298610!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FEB6!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object120824!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001301CA!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object816326!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00130263!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object777163!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00130529!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1413173!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001317D9!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object809510!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0013204F!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object471416!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132400!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object695087!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132A19!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object591945!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132BF8!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object302000!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132F5B!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1645443!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00133B8B!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object761911!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0013433E!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object1467727!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134446!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object791960!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134678!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object677078!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134A96!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object254923!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001355D0!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_31461_object321528!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135690!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4919_object36935!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135B62!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object1228272!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135C72!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_4812_object2180!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135DEE!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object425705!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00136366!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object141569!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00136371!!3!!benchmark_data_
>>>>>>>>>> ceph001%ecubone%eos_5560_object564213!head
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> 100 rows seemed true for me. I found the min list objects is 1024.
>>>>>>>>> Please could you run
>>>>>>>>> "ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list
>>>>>>>>> _GHOBJTOSEQ_| grep 6adb1100 -A 1024"
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I got the output in attachment
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>> Or should I run this immediately after the osd is crashed,
>>>>>>>>>>>> (because
>>>>>>>>>>>> it
>>>>>>>>>>>> maybe
>>>>>>>>>>>> rebalanced?  I did already restarted the cluster)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I don't know if it is related, but before I could all do that, I
>>>>>>>>>>>> had
>>>>>>>>>>>> to
>>>>>>>>>>>> fix
>>>>>>>>>>>> something else: A monitor did run out if disk space, using 8GB for
>>>>>>>>>>>> his
>>>>>>>>>>>> store.db folder (lot of sst files). Other monitors are also near
>>>>>>>>>>>> that
>>>>>>>>>>>> level.
>>>>>>>>>>>> Never had that problem on previous setups before. I recreated a
>>>>>>>>>>>> monitor
>>>>>>>>>>>> and
>>>>>>>>>>>> now it uses 3.8GB.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It exists some duplicate data which needed to be compacted.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Another idea, maybe you can make KeyValueStore's stripe size align
>>>>>>>>>>> with EC stripe size.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> How can I do that? Is there some documentation about that?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ceph --show-config | grep keyvaluestore
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> debug_keyvaluestore = 0/0
>>>>>>>>> keyvaluestore_queue_max_ops = 50
>>>>>>>>> keyvaluestore_queue_max_bytes = 104857600
>>>>>>>>> keyvaluestore_debug_check_backend = false
>>>>>>>>> keyvaluestore_op_threads = 2
>>>>>>>>> keyvaluestore_op_thread_timeout = 60
>>>>>>>>> keyvaluestore_op_thread_suicide_timeout = 180
>>>>>>>>> keyvaluestore_default_strip_size = 4096
>>>>>>>>> keyvaluestore_max_expected_write_size = 16777216
>>>>>>>>> keyvaluestore_header_cache_size = 4096
>>>>>>>>> keyvaluestore_backend = leveldb
>>>>>>>>>
>>>>>>>>> keyvaluestore_default_strip_size is the wanted
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I haven't think deeply and maybe I will try it later.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Kenneth
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ----- Message from Sage Weil <sweil at redhat.com> ---------
>>>>>>>>>>>> Date: Fri, 15 Aug 2014 06:10:34 -0700 (PDT)
>>>>>>>>>>>> From: Sage Weil <sweil at redhat.com>
>>>>>>>>>>>>
>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>>>>>>>>   To: Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>>>>   Cc: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>,
>>>>>>>>>>>> ceph-users at lists.ceph.com
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 15 Aug 2014, Haomai Wang wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Kenneth,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't find valuable info in your logs, it lack of the
>>>>>>>>>>>>>> necessary
>>>>>>>>>>>>>> debug output when accessing crash code.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I scan the encode/decode implementation in GenericObjectMap
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> find something bad.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For example, two oid has same hash and their name is:
>>>>>>>>>>>>>> A: "rb.data.123"
>>>>>>>>>>>>>> B: "rb-123"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In ghobject_t compare level, A < B. But GenericObjectMap encode
>>>>>>>>>>>>>> "."
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> "%e", so the key in DB is:
>>>>>>>>>>>>>> A: _GHOBJTOSEQ_:blah!51615000!!none!!rb%edata%e123!head
>>>>>>>>>>>>>> B: _GHOBJTOSEQ_:blah!51615000!!none!!rb-123!head
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A > B
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And it seemed that the escape function is useless and should be
>>>>>>>>>>>>>> disabled.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm not sure whether Kenneth's problem is touching this bug.
>>>>>>>>>>>>>> Because
>>>>>>>>>>>>>> this scene only occur when the object set is very large and make
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> two object has same hash value.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenneth, could you have time to run "ceph-kv-store [path-to-osd]
>>>>>>>>>>>>>> list
>>>>>>>>>>>>>> _GHOBJTOSEQ_| grep 6adb1100 -A 100". ceph-kv-store is a debug
>>>>>>>>>>>>>> tool
>>>>>>>>>>>>>> which can be compiled from source. You can clone ceph repo and
>>>>>>>>>>>>>> run
>>>>>>>>>>>>>> "./authongen.sh; ./configure; cd src; make ceph-kvstore-tool".
>>>>>>>>>>>>>> "path-to-osd" should be "/var/lib/ceph/osd-[id]/current/".
>>>>>>>>>>>>>> "6adb1100"
>>>>>>>>>>>>>> is from your verbose log and the next 100 rows should know
>>>>>>>>>>>>>> necessary
>>>>>>>>>>>>>> infos.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> You can also get ceph-kvstore-tool from the 'ceph-tests' package.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi sage, do you think we need to provided with upgrade function
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> fix
>>>>>>>>>>>>>> it?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hmm, we might.  This only affects the key/value encoding right?
>>>>>>>>>>>>> The
>>>>>>>>>>>>> FileStore is using its own function to map these to file names?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you open a ticket in the tracker for this?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>> sage
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 7:36 PM, Kenneth Waegeman
>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>>>>>>> ---------
>>>>>>>>>>>>>>>  Date: Thu, 14 Aug 2014 19:11:55 +0800
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  From: Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>>>>>>>>>>>    To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Could you add config "debug_keyvaluestore = 20/20" to the
>>>>>>>>>>>>>>>> crashed
>>>>>>>>>>>>>>>> osd
>>>>>>>>>>>>>>>> and replay the command causing crash?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would like to get more debug infos! Thanks.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I included the log in attachment!
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 4:41 PM, Kenneth Waegeman
>>>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have:
>>>>>>>>>>>>>>>>> osd_objectstore = keyvaluestore-dev
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> in the global section of my ceph.conf
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [root at ceph002 ~]# ceph osd erasure-code-profile get
>>>>>>>>>>>>>>>>> profile11
>>>>>>>>>>>>>>>>> directory=/usr/lib64/ceph/erasure-code
>>>>>>>>>>>>>>>>> k=8
>>>>>>>>>>>>>>>>> m=3
>>>>>>>>>>>>>>>>> plugin=jerasure
>>>>>>>>>>>>>>>>> ruleset-failure-domain=osd
>>>>>>>>>>>>>>>>> technique=reed_sol_van
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the ecdata pool has this as profile
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> pool 3 'ecdata' erasure size 11 min_size 8 crush_ruleset 2
>>>>>>>>>>>>>>>>> object_hash
>>>>>>>>>>>>>>>>> rjenkins pg_num 128 pgp_num 128 last_change 161 flags
>>>>>>>>>>>>>>>>> hashpspool
>>>>>>>>>>>>>>>>> stripe_width 4096
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ECrule in crushmap
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> rule ecdata {
>>>>>>>>>>>>>>>>>       ruleset 2
>>>>>>>>>>>>>>>>>       type erasure
>>>>>>>>>>>>>>>>>       min_size 3
>>>>>>>>>>>>>>>>>       max_size 20
>>>>>>>>>>>>>>>>>       step set_chooseleaf_tries 5
>>>>>>>>>>>>>>>>>       step take default-ec
>>>>>>>>>>>>>>>>>       step choose indep 0 type osd
>>>>>>>>>>>>>>>>>       step emit
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>> root default-ec {
>>>>>>>>>>>>>>>>>       id -8           # do not change unnecessarily
>>>>>>>>>>>>>>>>>       # weight 140.616
>>>>>>>>>>>>>>>>>       alg straw
>>>>>>>>>>>>>>>>>       hash 0  # rjenkins1
>>>>>>>>>>>>>>>>>       item ceph001-ec weight 46.872
>>>>>>>>>>>>>>>>>       item ceph002-ec weight 46.872
>>>>>>>>>>>>>>>>>       item ceph003-ec weight 46.872
>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>> Kenneth
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>>>>>>>>> ---------
>>>>>>>>>>>>>>>>>  Date: Thu, 14 Aug 2014 10:07:50 +0800
>>>>>>>>>>>>>>>>>  From: Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>>>>>>>>>>>>>    To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>>>>>>>>>>>>>>>    Cc: ceph-users <ceph-users at lists.ceph.com>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Kenneth,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Could you give your configuration related to EC and
>>>>>>>>>>>>>>>>>> KeyValueStore?
>>>>>>>>>>>>>>>>>> Not sure whether it's bug on KeyValueStore
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 12:06 AM, Kenneth Waegeman
>>>>>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I was doing some tests with rados bench on a Erasure Coded
>>>>>>>>>>>>>>>>>>> pool
>>>>>>>>>>>>>>>>>>> (using
>>>>>>>>>>>>>>>>>>> keyvaluestore-dev objectstore) on 0.83, and I see some
>>>>>>>>>>>>>>>>>>> strangs
>>>>>>>>>>>>>>>>>>> things:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [root at ceph001 ~]# ceph status
>>>>>>>>>>>>>>>>>>>   cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>>>>>>>>>>>>>>>>>>    health HEALTH_WARN too few pgs per osd (4 < min 20)
>>>>>>>>>>>>>>>>>>>    monmap e1: 3 mons at
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,
>>>>>>>>>>>>>>>>>>> ceph003=10.141.8.182:6789/0},
>>>>>>>>>>>>>>>>>>> election epoch 6, quorum 0,1,2 ceph001,ceph002,ceph003
>>>>>>>>>>>>>>>>>>>    mdsmap e116: 1/1/1 up {0=ceph001.cubone.os=up:active},
>>>>>>>>>>>>>>>>>>> 2
>>>>>>>>>>>>>>>>>>> up:standby
>>>>>>>>>>>>>>>>>>>    osdmap e292: 78 osds: 78 up, 78 in
>>>>>>>>>>>>>>>>>>>     pgmap v48873: 320 pgs, 4 pools, 15366 GB data, 3841
>>>>>>>>>>>>>>>>>>> kobjects
>>>>>>>>>>>>>>>>>>>           1381 GB used, 129 TB / 131 TB avail
>>>>>>>>>>>>>>>>>>>                320 active+clean
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> There is around 15T of data, but only 1.3 T usage.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This is also visible in rados:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [root at ceph001 ~]# rados df
>>>>>>>>>>>>>>>>>>> pool name       category                 KB      objects
>>>>>>>>>>>>>>>>>>> clones
>>>>>>>>>>>>>>>>>>> degraded      unfound           rd        rd KB
>>>>>>>>>>>>>>>>>>> wr
>>>>>>>>>>>>>>>>>>> wr
>>>>>>>>>>>>>>>>>>> KB
>>>>>>>>>>>>>>>>>>> data            -                          0            0
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 0           0            0            0            0
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ecdata          -                16113451009      3933959
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 0           0            1            1      3935632
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 16116850711
>>>>>>>>>>>>>>>>>>> metadata        -                          2           20
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 0           0           33           36           21
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 8
>>>>>>>>>>>>>>>>>>> rbd             -                          0            0
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 0           0            0            0            0
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> total used      1448266016      3933979
>>>>>>>>>>>>>>>>>>> total avail   139400181016
>>>>>>>>>>>>>>>>>>> total space   140848447032
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Another (related?) thing: if I do rados -p ecdata ls, I
>>>>>>>>>>>>>>>>>>> trigger
>>>>>>>>>>>>>>>>>>> osd
>>>>>>>>>>>>>>>>>>> shutdowns (each time):
>>>>>>>>>>>>>>>>>>> I get a list followed by an error:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_8961_object243839
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object801983
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object856489
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_8961_object202232
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_4919_object33199
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object807797
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_4919_object74729
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object1264121
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object1318513
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object1202111
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object939107
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object729682
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object122915
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object76521
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object113261
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object575079
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object671042
>>>>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object381146
>>>>>>>>>>>>>>>>>>> 2014-08-13 17:57:48.736150 7f65047b5700  0 --
>>>>>>>>>>>>>>>>>>> 10.141.8.180:0/1023295 >>
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/4471 pipe(0x7f64fc019b20 sd=5 :0 s=1
>>>>>>>>>>>>>>>>>>> pgs=0
>>>>>>>>>>>>>>>>>>> cs=0
>>>>>>>>>>>>>>>>>>> l=1
>>>>>>>>>>>>>>>>>>> c=0x7f64fc019db0).fault
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> And I can see this in the log files:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>  -25> 2014-08-13 17:52:56.323908 7f8a97fa4700  1 --
>>>>>>>>>>>>>>>>>>> 10.143.8.182:6827/64670 <== osd.57 10.141.8.182:0/15796 51
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> 47+0+0
>>>>>>>>>>>>>>>>>>> (3227325175 0 0) 0xf475940 con 0xee89fa0
>>>>>>>>>>>>>>>>>>>  -24> 2014-08-13 17:52:56.323938 7f8a97fa4700  1 --
>>>>>>>>>>>>>>>>>>> 10.143.8.182:6827/64670 --> 10.141.8.182:0/15796 --
>>>>>>>>>>>>>>>>>>> osd_ping(ping_reply
>>>>>>>>>>>>>>>>>>> e220
>>>>>>>>>>>>>>>>>>> stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf815b00 con
>>>>>>>>>>>>>>>>>>> 0xee89fa0
>>>>>>>>>>>>>>>>>>>  -23> 2014-08-13 17:52:56.324078 7f8a997a7700  1 --
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6840/64670 <== osd.57 10.141.8.182:0/15796 51
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> 47+0+0
>>>>>>>>>>>>>>>>>>> (3227325175 0 0) 0xf132bc0 con 0xee8a680
>>>>>>>>>>>>>>>>>>>  -22> 2014-08-13 17:52:56.324111 7f8a997a7700  1 --
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6840/64670 --> 10.141.8.182:0/15796 --
>>>>>>>>>>>>>>>>>>> osd_ping(ping_reply
>>>>>>>>>>>>>>>>>>> e220
>>>>>>>>>>>>>>>>>>> stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf811a40 con
>>>>>>>>>>>>>>>>>>> 0xee8a680
>>>>>>>>>>>>>>>>>>>  -21> 2014-08-13 17:52:56.584461 7f8a997a7700  1 --
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6840/64670 <== osd.29 10.143.8.181:0/12142 47
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> 47+0+0
>>>>>>>>>>>>>>>>>>> (3355887204 0 0) 0xf655940 con 0xee88b00
>>>>>>>>>>>>>>>>>>>  -20> 2014-08-13 17:52:56.584486 7f8a997a7700  1 --
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6840/64670 --> 10.143.8.181:0/12142 --
>>>>>>>>>>>>>>>>>>> osd_ping(ping_reply
>>>>>>>>>>>>>>>>>>> e220
>>>>>>>>>>>>>>>>>>> stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf132bc0 con
>>>>>>>>>>>>>>>>>>> 0xee88b00
>>>>>>>>>>>>>>>>>>>  -19> 2014-08-13 17:52:56.584498 7f8a97fa4700  1 --
>>>>>>>>>>>>>>>>>>> 10.143.8.182:6827/64670 <== osd.29 10.143.8.181:0/12142 47
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> 47+0+0
>>>>>>>>>>>>>>>>>>> (3355887204 0 0) 0xf20e040 con 0xee886e0
>>>>>>>>>>>>>>>>>>>  -18> 2014-08-13 17:52:56.584526 7f8a97fa4700  1 --
>>>>>>>>>>>>>>>>>>> 10.143.8.182:6827/64670 --> 10.143.8.181:0/12142 --
>>>>>>>>>>>>>>>>>>> osd_ping(ping_reply
>>>>>>>>>>>>>>>>>>> e220
>>>>>>>>>>>>>>>>>>> stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf475940 con
>>>>>>>>>>>>>>>>>>> 0xee886e0
>>>>>>>>>>>>>>>>>>>  -17> 2014-08-13 17:52:56.594448 7f8a798c7700  1 --
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/64670 >> :/0 pipe(0xec15f00 sd=74 :6839
>>>>>>>>>>>>>>>>>>> s=0
>>>>>>>>>>>>>>>>>>> pgs=0
>>>>>>>>>>>>>>>>>>> cs=0
>>>>>>>>>>>>>>>>>>> l=0
>>>>>>>>>>>>>>>>>>> c=0xee856a0).accept sd=74 10.141.8.180:47641/0
>>>>>>>>>>>>>>>>>>>  -16> 2014-08-13 17:52:56.594921 7f8a798c7700  1 --
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/64670 <== client.7512
>>>>>>>>>>>>>>>>>>> 10.141.8.180:0/1018433
>>>>>>>>>>>>>>>>>>> 1
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220) v4 ==== 151+0+39
>>>>>>>>>>>>>>>>>>> (1972163119
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 4174233976) 0xf3bca40 con 0xee856a0
>>>>>>>>>>>>>>>>>>>  -15> 2014-08-13 17:52:56.594957 7f8a798c7700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.594874, event: header_read,
>>>>>>>>>>>>>>>>>>> op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>  -14> 2014-08-13 17:52:56.594970 7f8a798c7700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.594880, event: throttled,
>>>>>>>>>>>>>>>>>>> op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>  -13> 2014-08-13 17:52:56.594978 7f8a798c7700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.594917, event: all_read, op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>  -12> 2014-08-13 17:52:56.594986 7f8a798c7700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 299, time: 0.000000, event: dispatched, op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1
>>>>>>>>>>>>>>>>>>> [pgls
>>>>>>>>>>>>>>>>>>> start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>  -11> 2014-08-13 17:52:56.595127 7f8a90795700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.595104, event: reached_pg,
>>>>>>>>>>>>>>>>>>> op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>  -10> 2014-08-13 17:52:56.595159 7f8a90795700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.595153, event: started, op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>   -9> 2014-08-13 17:52:56.602179 7f8a90795700  1 --
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/64670 --> 10.141.8.180:0/1018433 --
>>>>>>>>>>>>>>>>>>> osd_op_reply(1
>>>>>>>>>>>>>>>>>>> [pgls
>>>>>>>>>>>>>>>>>>> start_epoch 0] v164'30654 uv30654 ondisk = 0) v6 -- ?+0
>>>>>>>>>>>>>>>>>>> 0xec16180
>>>>>>>>>>>>>>>>>>> con
>>>>>>>>>>>>>>>>>>> 0xee856a0
>>>>>>>>>>>>>>>>>>>   -8> 2014-08-13 17:52:56.602211 7f8a90795700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 299, time: 2014-08-13 17:52:56.602205, event: done, op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>   -7> 2014-08-13 17:52:56.614839 7f8a798c7700  1 --
>>>>>>>>>>>>>>>>>>> 10.141.8.182:6839/64670 <== client.7512
>>>>>>>>>>>>>>>>>>> 10.141.8.180:0/1018433
>>>>>>>>>>>>>>>>>>> 2
>>>>>>>>>>>>>>>>>>> ====
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220) v4 ==== 151+0+89
>>>>>>>>>>>>>>>>>>> (3460833343
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2600845095) 0xf3bcec0 con 0xee856a0
>>>>>>>>>>>>>>>>>>>   -6> 2014-08-13 17:52:56.614864 7f8a798c7700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614789, event: header_read,
>>>>>>>>>>>>>>>>>>> op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>   -5> 2014-08-13 17:52:56.614874 7f8a798c7700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614792, event: throttled,
>>>>>>>>>>>>>>>>>>> op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>   -4> 2014-08-13 17:52:56.614884 7f8a798c7700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614835, event: all_read, op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>   -3> 2014-08-13 17:52:56.614891 7f8a798c7700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 300, time: 0.000000, event: dispatched, op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2
>>>>>>>>>>>>>>>>>>> [pgls
>>>>>>>>>>>>>>>>>>> start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>   -2> 2014-08-13 17:52:56.614972 7f8a92f9a700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614958, event: reached_pg,
>>>>>>>>>>>>>>>>>>> op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>   -1> 2014-08-13 17:52:56.614993 7f8a92f9a700  5 -- op
>>>>>>>>>>>>>>>>>>> tracker
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> ,
>>>>>>>>>>>>>>>>>>> seq:
>>>>>>>>>>>>>>>>>>> 300, time: 2014-08-13 17:52:56.614986, event: started, op:
>>>>>>>>>>>>>>>>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>>>>>>>>>>>>>>>>>> ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>>>>    0> 2014-08-13 17:52:56.617087 7f8a92f9a700 -1
>>>>>>>>>>>>>>>>>>> os/GenericObjectMap.cc:
>>>>>>>>>>>>>>>>>>> In function 'int GenericObjectMap::list_objects(const
>>>>>>>>>>>>>>>>>>> coll_t&,
>>>>>>>>>>>>>>>>>>> ghobject_t,
>>>>>>>>>>>>>>>>>>> int, std::vector<ghobject_t>*, ghobject_t*)' thread
>>>>>>>>>>>>>>>>>>> 7f8a92f9a700
>>>>>>>>>>>>>>>>>>> time
>>>>>>>>>>>>>>>>>>> 2014-08-13 17:52:56.615073
>>>>>>>>>>>>>>>>>>> os/GenericObjectMap.cc: 1118: FAILED assert(start <=
>>>>>>>>>>>>>>>>>>> header.oid)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b40217385
>>>>>>>>>>>>>>>>>>> 64c36c92b8)
>>>>>>>>>>>>>>>>>>> 1: (GenericObjectMap::list_objects(coll_t const&,
>>>>>>>>>>>>>>>>>>> ghobject_t,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x474)
>>>>>>>>>>>>>>>>>>> [0x98f774]
>>>>>>>>>>>>>>>>>>> 2: (KeyValueStore::collection_list_partial(coll_t,
>>>>>>>>>>>>>>>>>>> ghobject_t,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> snapid_t, std::vector<ghobject_t,
>>>>>>>>>>>>>>>>>>> std::allocator<ghobject_t>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> *,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>>>> 3: (PGBackend::objects_list_partial(hobject_t const&, int,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> snapid_t,
>>>>>>>>>>>>>>>>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>>>>>>>>>>>>>>>>>> hobject_t*)+0x1c9)
>>>>>>>>>>>>>>>>>>> [0x862de9]
>>>>>>>>>>>>>>>>>>> 4:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+
>>>>>>>>>>>>>>>>>>> 0xea5)
>>>>>>>>>>>>>>>>>>> [0x7f67f5]
>>>>>>>>>>>>>>>>>>> 5:
>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_op(std::tr1:
>>>>>>>>>>>>>>>>>>> :shared_ptr<OpRequest>)+0x1f3)
>>>>>>>>>>>>>>>>>>> [0x8177b3]
>>>>>>>>>>>>>>>>>>> 6: (ReplicatedPG::do_request(std:
>>>>>>>>>>>>>>>>>>> :tr1::shared_ptr<OpRequest>,
>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>>>> 7: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>>>>>>>>>>>>>>> std::tr1::shared_ptr<OpRequest>,
>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x47d)
>>>>>>>>>>>>>>>>>>> [0x62bf8d]
>>>>>>>>>>>>>>>>>>> 8: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>>>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>>>> 9: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>>>>>>>>>>>>>>>>>> int)+0x8cd)
>>>>>>>>>>>>>>>>>>> [0xa776fd]
>>>>>>>>>>>>>>>>>>> 10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>>>>>>>>>>>>>>>>>> [0xa79980]
>>>>>>>>>>>>>>>>>>> 11: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>>>> 12: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS
>>>>>>>>>>>>>>>>>>> <executable>`
>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>> needed
>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> interpret this.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b40217385
>>>>>>>>>>>>>>>>>>> 64c36c92b8)
>>>>>>>>>>>>>>>>>>> 1: /usr/bin/ceph-osd() [0x99b466]
>>>>>>>>>>>>>>>>>>> 2: (()+0xf130) [0x7f8aac727130]
>>>>>>>>>>>>>>>>>>> 3: (gsignal()+0x39) [0x7f8aab0d5989]
>>>>>>>>>>>>>>>>>>> 4: (abort()+0x148) [0x7f8aab0d7098]
>>>>>>>>>>>>>>>>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165)
>>>>>>>>>>>>>>>>>>> [0x7f8aab9e89d5]
>>>>>>>>>>>>>>>>>>> 6: (()+0x5e946) [0x7f8aab9e6946]
>>>>>>>>>>>>>>>>>>> 7: (()+0x5e973) [0x7f8aab9e6973]
>>>>>>>>>>>>>>>>>>> 8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>>>>>>>>>>>>>>>>>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int,
>>>>>>>>>>>>>>>>>>> char
>>>>>>>>>>>>>>>>>>> const*)+0x1ef) [0xa8805f]
>>>>>>>>>>>>>>>>>>> 10: (GenericObjectMap::list_objects(coll_t const&,
>>>>>>>>>>>>>>>>>>> ghobject_t,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x474)
>>>>>>>>>>>>>>>>>>> [0x98f774]
>>>>>>>>>>>>>>>>>>> 11: (KeyValueStore::collection_list_partial(coll_t,
>>>>>>>>>>>>>>>>>>> ghobject_t,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> snapid_t, std::vector<ghobject_t,
>>>>>>>>>>>>>>>>>>> std::allocator<ghobject_t>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> *,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>>>> 12: (PGBackend::objects_list_partial(hobject_t const&,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> snapid_t,
>>>>>>>>>>>>>>>>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>>>>>>>>>>>>>>>>>> hobject_t*)+0x1c9)
>>>>>>>>>>>>>>>>>>> [0x862de9]
>>>>>>>>>>>>>>>>>>> 13:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+
>>>>>>>>>>>>>>>>>>> 0xea5)
>>>>>>>>>>>>>>>>>>> [0x7f67f5]
>>>>>>>>>>>>>>>>>>> 14:
>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_op(std::tr1:
>>>>>>>>>>>>>>>>>>> :shared_ptr<OpRequest>)+0x1f3)
>>>>>>>>>>>>>>>>>>> [0x8177b3]
>>>>>>>>>>>>>>>>>>> 15:
>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>>>> 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>>>>>>>>>>>>>>> std::tr1::shared_ptr<OpRequest>,
>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x47d)
>>>>>>>>>>>>>>>>>>> [0x62bf8d]
>>>>>>>>>>>>>>>>>>> 17: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>>>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>>>> 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>>>>>>>>>>>>>>>>>> int)+0x8cd)
>>>>>>>>>>>>>>>>>>> [0xa776fd]
>>>>>>>>>>>>>>>>>>> 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>>>>>>>>>>>>>>>>>> [0xa79980]
>>>>>>>>>>>>>>>>>>> 20: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>>>> 21: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS
>>>>>>>>>>>>>>>>>>> <executable>`
>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>> needed
>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> interpret this.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --- begin dump of recent events ---
>>>>>>>>>>>>>>>>>>>    0> 2014-08-13 17:52:56.714214 7f8a92f9a700 -1 ***
>>>>>>>>>>>>>>>>>>> Caught
>>>>>>>>>>>>>>>>>>> signal
>>>>>>>>>>>>>>>>>>> (Aborted) **
>>>>>>>>>>>>>>>>>>> in thread 7f8a92f9a700
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b40217385
>>>>>>>>>>>>>>>>>>> 64c36c92b8)
>>>>>>>>>>>>>>>>>>> 1: /usr/bin/ceph-osd() [0x99b466]
>>>>>>>>>>>>>>>>>>> 2: (()+0xf130) [0x7f8aac727130]
>>>>>>>>>>>>>>>>>>> 3: (gsignal()+0x39) [0x7f8aab0d5989]
>>>>>>>>>>>>>>>>>>> 4: (abort()+0x148) [0x7f8aab0d7098]
>>>>>>>>>>>>>>>>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165)
>>>>>>>>>>>>>>>>>>> [0x7f8aab9e89d5]
>>>>>>>>>>>>>>>>>>> 6: (()+0x5e946) [0x7f8aab9e6946]
>>>>>>>>>>>>>>>>>>> 7: (()+0x5e973) [0x7f8aab9e6973]
>>>>>>>>>>>>>>>>>>> 8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>>>>>>>>>>>>>>>>>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int,
>>>>>>>>>>>>>>>>>>> char
>>>>>>>>>>>>>>>>>>> const*)+0x1ef) [0xa8805f]
>>>>>>>>>>>>>>>>>>> 10: (GenericObjectMap::list_objects(coll_t const&,
>>>>>>>>>>>>>>>>>>> ghobject_t,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x474)
>>>>>>>>>>>>>>>>>>> [0x98f774]
>>>>>>>>>>>>>>>>>>> 11: (KeyValueStore::collection_list_partial(coll_t,
>>>>>>>>>>>>>>>>>>> ghobject_t,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> snapid_t, std::vector<ghobject_t,
>>>>>>>>>>>>>>>>>>> std::allocator<ghobject_t>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> *,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>>>> 12: (PGBackend::objects_list_partial(hobject_t const&,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> int,
>>>>>>>>>>>>>>>>>>> snapid_t,
>>>>>>>>>>>>>>>>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>>>>>>>>>>>>>>>>>> hobject_t*)+0x1c9)
>>>>>>>>>>>>>>>>>>> [0x862de9]
>>>>>>>>>>>>>>>>>>> 13:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+
>>>>>>>>>>>>>>>>>>> 0xea5)
>>>>>>>>>>>>>>>>>>> [0x7f67f5]
>>>>>>>>>>>>>>>>>>> 14:
>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_op(std::tr1:
>>>>>>>>>>>>>>>>>>> :shared_ptr<OpRequest>)+0x1f3)
>>>>>>>>>>>>>>>>>>> [0x8177b3]
>>>>>>>>>>>>>>>>>>> 15:
>>>>>>>>>>>>>>>>>>> (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>>>> 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>>>>>>>>>>>>>>> std::tr1::shared_ptr<OpRequest>,
>>>>>>>>>>>>>>>>>>> ThreadPool::TPHandle&)+0x47d)
>>>>>>>>>>>>>>>>>>> [0x62bf8d]
>>>>>>>>>>>>>>>>>>> 17: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>>>>>>>>>>>>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>>>> 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>>>>>>>>>>>>>>>>>> int)+0x8cd)
>>>>>>>>>>>>>>>>>>> [0xa776fd]
>>>>>>>>>>>>>>>>>>> 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>>>>>>>>>>>>>>>>>> [0xa79980]
>>>>>>>>>>>>>>>>>>> 20: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>>>> 21: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>>>> NOTE: a copy of the executable, or `objdump -rdS
>>>>>>>>>>>>>>>>>>> <executable>`
>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>> needed
>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> interpret this.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I guess this has something to do with using the dev
>>>>>>>>>>>>>>>>>>> Keyvaluestore?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kenneth
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>>>>>>>>>> ceph-users at lists.ceph.com
>>>>>>>>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Wheat
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Met vriendelijke groeten,
>>>>>>>>>>>>>>>>> Kenneth Waegeman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Wheat
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Met vriendelijke groeten,
>>>>>>>>>>>>>>> Kenneth Waegeman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Wheat
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>>>>> ceph-users at lists.ceph.com
>>>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ----- End message from Sage Weil <sweil at redhat.com> -----
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>> Met vriendelijke groeten,

>>>>>>>>>>>> Kenneth Waegeman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best Regards,
>>>>>>>>>>>
>>>>>>>>>>> Wheat
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> Met vriendelijke groeten,
>>>>>>>>>> Kenneth Waegeman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>>
>>>>>>>>> Wheat
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Met vriendelijke groeten,
>>>>>>>> Kenneth Waegeman
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>>
>>>>>>> Wheat
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>>
>>>>>> Wheat
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>>>>
>>>>> --
>>>>>
>>>>> Met vriendelijke groeten,
>>>>> Kenneth Waegeman
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>>
>>>> Wheat
>>>>
>>>
>>>
>>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>>
>>> --
>>>
>>> Met vriendelijke groeten,
>>> Kenneth Waegeman
>>>
>>>
>>>
>>
>>
>> --
>>
>> Best Regards,
>>
>> Wheat
>
>
> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>
> -- 
>
> Met vriendelijke groeten,
> Kenneth Waegeman


----- End message from Kenneth Waegeman <Kenneth.Waegeman at UGent.be> -----

-- 

Met vriendelijke groeten,
Kenneth Waegeman