ceph cluster inconsistency?

On Mon, Aug 18, 2014 at 5:38 PM, Kenneth Waegeman
<Kenneth.Waegeman at ugent.be> wrote:
> Hi,
>
> I tried this after restarting the osd, but I guess that was not what was intended
> (
> # ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_|
> grep 6adb1100 -A 100
> IO error: lock /var/lib/ceph/osd/ceph-67/current//LOCK: Resource temporarily
> unavailable
> tools/ceph_kvstore_tool.cc: In function 'StoreTool::StoreTool(const
> string&)' thread 7f8fecf7d780 time 2014-08-18 11:12:29.551780
> tools/ceph_kvstore_tool.cc: 38: FAILED assert(!db_ptr->open(std::cerr))
> ..
> )
>
> When I run it after bringing the osd down, it takes a while, but it has no
> output. (When running it without the grep, I get a huge list.)

Oh, sorry about that! I made a mistake: the hash value (6adb1100) is stored
reversed in leveldb, so grepping for it directly won't match anything.
Grepping for "benchmark_data_ceph001.cubone.os_5560_object789734" instead
should do it (see the example below).

>
> Or should I run this immediately after the osd crashes (because the data may
> have been rebalanced since? I already restarted the cluster)
>
>
> I don't know if it is related, but before I could do any of that, I had to fix
> something else: a monitor ran out of disk space, using 8GB for its
> store.db folder (lots of sst files). Other monitors are also near that level.
> I never had that problem on previous setups. I recreated the monitor and
> now it uses 3.8GB.

There is probably some duplicate data in the store that needs to be compacted;
a sketch of how to trigger that is below.
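For reference, this is roughly how I would trigger a compaction of the monitor
stores (the exact commands are from memory, so please double-check them against
your 0.83 build):

  # ask a running monitor to compact its leveldb store
  ceph tell mon.ceph001 compact

  # or, in ceph.conf, have monitors compact their store at startup
  [mon]
      mon compact on start = true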
>

Another idea: maybe you can align KeyValueStore's stripe size with the EC
stripe size; a hedged config sketch follows.
I haven't thought about it deeply, and maybe I will try it later.
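What I mean, as a sketch only: your pool has k=8 and stripe_width 4096, so if
stripe_width is the full-stripe size across the k data chunks, each chunk
carries about 4096 / 8 = 512 bytes, while KeyValueStore splits object data into
strips of its own (4096 bytes by default, if I remember the option correctly).
Something like the following could line the two up, but please verify the
option name and default against the 0.83 code before trying it:

  [osd]
      # assumption: option name and default; verify against your build
      keyvaluestore default strip size = 512

(The other direction -- increasing the EC stripe_width so that each chunk is
4096 bytes -- would achieve the same alignment.)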

> Thanks!
>
> Kenneth
>
>
>
> ----- Message from Sage Weil <sweil at redhat.com> ---------
>    Date: Fri, 15 Aug 2014 06:10:34 -0700 (PDT)
>    From: Sage Weil <sweil at redhat.com>
>
> Subject: Re: [ceph-users] ceph cluster inconsistency?
>      To: Haomai Wang <haomaiwang at gmail.com>
>      Cc: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>,
> ceph-users at lists.ceph.com
>
>
>
>> On Fri, 15 Aug 2014, Haomai Wang wrote:
>>>
>>> Hi Kenneth,
>>>
>>> I didn't find valuable info in your logs; they lack the necessary
>>> debug output from the code path that crashes.
>>>
>>> But I scan the encode/decode implementation in GenericObjectMap and
>>> find something bad.
>>>
>>> For example, two oid has same hash and their name is:
>>> A: "rb.data.123"
>>> B: "rb-123"
>>>
>>> At the ghobject_t comparison level, A < B. But GenericObjectMap encodes "." as
>>> "%e", so the keys in the DB are:
>>> A: _GHOBJTOSEQ_:blah!51615000!!none!!rb%edata%e123!head
>>> B: _GHOBJTOSEQ_:blah!51615000!!none!!rb-123!head
>>>
>>> A > B
>>>
>>> And it seems that the escape function is unnecessary and should be disabled.
>>>
>>> I'm not sure whether Kenneth's problem is hitting this bug, because
>>> this scenario only occurs when the object set is very large and two
>>> objects end up with the same hash value.
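To be a bit more concrete about the ordering problem: ghobject_t ordering is
not a plain byte compare of the name, but the effect of the escaping can be
seen with plain byte order. A minimal illustration, not the actual comparator:

  # raw names in byte order: "-" (0x2d) sorts before "." (0x2e)
  printf 'rb.data.123\nrb-123\n' | LC_ALL=C sort
  # keys after GenericObjectMap's escaping of "." to "%e": "%" (0x25) sorts
  # before "-" (0x2d), so the relative order of the two objects flips
  printf 'rb%%edata%%e123\nrb-123\n' | LC_ALL=C sort

A flip like that would also fit the FAILED assert(start <= header.oid) in the
trace further down: a range scan whose start key is computed in one ordering
can hit entries that the other ordering places before it.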
>>>
>>> Kenneth, would you have time to run "ceph-kvstore-tool [path-to-osd] list
>>> _GHOBJTOSEQ_ | grep 6adb1100 -A 100"? ceph-kvstore-tool is a debug tool
>>> which can be compiled from source. You can clone the ceph repo and run
>>> "./autogen.sh; ./configure; cd src; make ceph-kvstore-tool".
>>> "path-to-osd" should be "/var/lib/ceph/osd-[id]/current/". "6adb1100"
>>> is from your verbose log, and the next 100 rows should contain the
>>> necessary info.
>>
>>
>> You can also get ceph-kvstore-tool from the 'ceph-tests' package.
>>
>>> Hi Sage, do you think we need to provide an upgrade function to fix
>>> it?
>>
>>
>> Hmm, we might.  This only affects the key/value encoding right?  The
>> FileStore is using its own function to map these to file names?
>>
>> Can you open a ticket in the tracker for this?
>>
>> Thanks!
>> sage
>>
>>>
>>>
>>> On Thu, Aug 14, 2014 at 7:36 PM, Kenneth Waegeman
>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>
>>> >
>>> > ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>> >    Date: Thu, 14 Aug 2014 19:11:55 +0800
>>> >
>>> >    From: Haomai Wang <haomaiwang at gmail.com>
>>> > Subject: Re: [ceph-users] ceph cluster inconsistency?
>>> >      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>> >
>>> >
>>> >> Could you add the config "debug_keyvaluestore = 20/20" to the crashed osd
>>> >> and replay the command that causes the crash?
>>> >>
>>> >> I would like to get more debug info! Thanks.
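To be concrete about how to set that (the exact syntax is from memory, so
double-check it):

  # in ceph.conf on the OSD's host, then restart that OSD
  [osd]
      debug keyvaluestore = 20/20

  # or inject it into the running daemon, replacing N with the OSD id
  ceph tell osd.N injectargs '--debug-keyvaluestore 20/20'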
>>> >
>>> >
>>> > I included the log as an attachment!
>>> > Thanks!
>>> >
>>> >>
>>> >> On Thu, Aug 14, 2014 at 4:41 PM, Kenneth Waegeman
>>> >> <Kenneth.Waegeman at ugent.be> wrote:
>>> >>>
>>> >>>
>>> >>> I have:
>>> >>> osd_objectstore = keyvaluestore-dev
>>> >>>
>>> >>> in the global section of my ceph.conf
>>> >>>
>>> >>>
>>> >>> [root at ceph002 ~]# ceph osd erasure-code-profile get profile11
>>> >>> directory=/usr/lib64/ceph/erasure-code
>>> >>> k=8
>>> >>> m=3
>>> >>> plugin=jerasure
>>> >>> ruleset-failure-domain=osd
>>> >>> technique=reed_sol_van
>>> >>>
>>> >>> the ecdata pool has this as profile
>>> >>>
>>> >>> pool 3 'ecdata' erasure size 11 min_size 8 crush_ruleset 2
>>> >>> object_hash
>>> >>> rjenkins pg_num 128 pgp_num 128 last_change 161 flags hashpspool
>>> >>> stripe_width 4096
>>> >>>
>>> >>> ECrule in crushmap
>>> >>>
>>> >>> rule ecdata {
>>> >>>         ruleset 2
>>> >>>         type erasure
>>> >>>         min_size 3
>>> >>>         max_size 20
>>> >>>         step set_chooseleaf_tries 5
>>> >>>         step take default-ec
>>> >>>         step choose indep 0 type osd
>>> >>>         step emit
>>> >>> }
>>> >>> root default-ec {
>>> >>>         id -8           # do not change unnecessarily
>>> >>>         # weight 140.616
>>> >>>         alg straw
>>> >>>         hash 0  # rjenkins1
>>> >>>         item ceph001-ec weight 46.872
>>> >>>         item ceph002-ec weight 46.872
>>> >>>         item ceph003-ec weight 46.872
>>> >>> ...
>>> >>>
>>> >>> Cheers!
>>> >>> Kenneth
>>> >>>
>>> >>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>> >>>    Date: Thu, 14 Aug 2014 10:07:50 +0800
>>> >>>    From: Haomai Wang <haomaiwang at gmail.com>
>>> >>> Subject: Re: [ceph-users] ceph cluster inconsistency?
>>> >>>      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>> >>>      Cc: ceph-users <ceph-users at lists.ceph.com>
>>> >>>
>>> >>>
>>> >>>
>>> >>>> Hi Kenneth,
>>> >>>>
>>> >>>> Could you give your configuration related to EC and KeyValueStore?
>>> >>>> Not sure whether it's a bug in KeyValueStore.
>>> >>>>
>>> >>>> On Thu, Aug 14, 2014 at 12:06 AM, Kenneth Waegeman
>>> >>>> <Kenneth.Waegeman at ugent.be> wrote:
>>> >>>>>
>>> >>>>>
>>> >>>>> Hi,
>>> >>>>>
>>> >>>>> I was doing some tests with rados bench on an erasure-coded pool
>>> >>>>> (using the keyvaluestore-dev objectstore) on 0.83, and I see some
>>> >>>>> strange things:
>>> >>>>>
>>> >>>>>
>>> >>>>> [root at ceph001 ~]# ceph status
>>> >>>>>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>> >>>>>      health HEALTH_WARN too few pgs per osd (4 < min 20)
>>> >>>>>      monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0},
>>> >>>>> election epoch 6, quorum 0,1,2 ceph001,ceph002,ceph003
>>> >>>>>      mdsmap e116: 1/1/1 up {0=ceph001.cubone.os=up:active}, 2 up:standby
>>> >>>>>      osdmap e292: 78 osds: 78 up, 78 in
>>> >>>>>       pgmap v48873: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>>> >>>>>             1381 GB used, 129 TB / 131 TB avail
>>> >>>>>                  320 active+clean
>>> >>>>>
>>> >>>>> There is around 15T of data, but only 1.3 T usage.
>>> >>>>>
>>> >>>>> This is also visible in rados:
>>> >>>>>
>>> >>>>> [root at ceph001 ~]# rados df
>>> >>>>> pool name  category           KB  objects  clones  degraded  unfound  rd  rd KB       wr        wr KB
>>> >>>>> data       -                   0        0       0         0        0   0      0        0            0
>>> >>>>> ecdata     -         16113451009  3933959       0         0        0   1      1  3935632  16116850711
>>> >>>>> metadata   -                   2       20       0         0        0  33     36       21            8
>>> >>>>> rbd        -                   0        0       0         0        0   0      0        0            0
>>> >>>>>   total used       1448266016  3933979
>>> >>>>>   total avail    139400181016
>>> >>>>>   total space    140848447032
>>> >>>>>
>>> >>>>>
>>> >>>>> Another (related?) thing: if I do rados -p ecdata ls, I trigger osd
>>> >>>>> shutdowns (each time):
>>> >>>>> I get a list followed by an error:
>>> >>>>>
>>> >>>>> ...
>>> >>>>> benchmark_data_ceph001.cubone.os_8961_object243839
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object801983
>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object856489
>>> >>>>> benchmark_data_ceph001.cubone.os_8961_object202232
>>> >>>>> benchmark_data_ceph001.cubone.os_4919_object33199
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object807797
>>> >>>>> benchmark_data_ceph001.cubone.os_4919_object74729
>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object1264121
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object1318513
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object1202111
>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object939107
>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object729682
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object122915
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object76521
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object113261
>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object575079
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object671042
>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object381146
>>> >>>>> 2014-08-13 17:57:48.736150 7f65047b5700  0 --
>>> >>>>> 10.141.8.180:0/1023295 >>
>>> >>>>> 10.141.8.182:6839/4471 pipe(0x7f64fc019b20 sd=5 :0 s=1 pgs=0 cs=0
>>> >>>>> l=1
>>> >>>>> c=0x7f64fc019db0).fault
>>> >>>>>
>>> >>>>> And I can see this in the log files:
>>> >>>>>
>>> >>>>>    -25> 2014-08-13 17:52:56.323908 7f8a97fa4700  1 --
>>> >>>>> 10.143.8.182:6827/64670 <== osd.57 10.141.8.182:0/15796 51 ====
>>> >>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0
>>> >>>>> (3227325175 0 0) 0xf475940 con 0xee89fa0
>>> >>>>>    -24> 2014-08-13 17:52:56.323938 7f8a97fa4700  1 --
>>> >>>>> 10.143.8.182:6827/64670 --> 10.141.8.182:0/15796 --
>>> >>>>> osd_ping(ping_reply
>>> >>>>> e220
>>> >>>>> stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf815b00 con 0xee89fa0
>>> >>>>>    -23> 2014-08-13 17:52:56.324078 7f8a997a7700  1 --
>>> >>>>> 10.141.8.182:6840/64670 <== osd.57 10.141.8.182:0/15796 51 ====
>>> >>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0
>>> >>>>> (3227325175 0 0) 0xf132bc0 con 0xee8a680
>>> >>>>>    -22> 2014-08-13 17:52:56.324111 7f8a997a7700  1 --
>>> >>>>> 10.141.8.182:6840/64670 --> 10.141.8.182:0/15796 --
>>> >>>>> osd_ping(ping_reply
>>> >>>>> e220
>>> >>>>> stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf811a40 con 0xee8a680
>>> >>>>>    -21> 2014-08-13 17:52:56.584461 7f8a997a7700  1 --
>>> >>>>> 10.141.8.182:6840/64670 <== osd.29 10.143.8.181:0/12142 47 ====
>>> >>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0
>>> >>>>> (3355887204 0 0) 0xf655940 con 0xee88b00
>>> >>>>>    -20> 2014-08-13 17:52:56.584486 7f8a997a7700  1 --
>>> >>>>> 10.141.8.182:6840/64670 --> 10.143.8.181:0/12142 --
>>> >>>>> osd_ping(ping_reply
>>> >>>>> e220
>>> >>>>> stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf132bc0 con 0xee88b00
>>> >>>>>    -19> 2014-08-13 17:52:56.584498 7f8a97fa4700  1 --
>>> >>>>> 10.143.8.182:6827/64670 <== osd.29 10.143.8.181:0/12142 47 ====
>>> >>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0
>>> >>>>> (3355887204 0 0) 0xf20e040 con 0xee886e0
>>> >>>>>    -18> 2014-08-13 17:52:56.584526 7f8a97fa4700  1 --
>>> >>>>> 10.143.8.182:6827/64670 --> 10.143.8.181:0/12142 --
>>> >>>>> osd_ping(ping_reply
>>> >>>>> e220
>>> >>>>> stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf475940 con 0xee886e0
>>> >>>>>    -17> 2014-08-13 17:52:56.594448 7f8a798c7700  1 --
>>> >>>>> 10.141.8.182:6839/64670 >> :/0 pipe(0xec15f00 sd=74 :6839 s=0 pgs=0
>>> >>>>> cs=0
>>> >>>>> l=0
>>> >>>>> c=0xee856a0).accept sd=74 10.141.8.180:47641/0
>>> >>>>>    -16> 2014-08-13 17:52:56.594921 7f8a798c7700  1 --
>>> >>>>> 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 1
>>> >>>>> ====
>>> >>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>> >>>>> ack+read+known_if_redirected e220) v4 ==== 151+0+39 (1972163119 0
>>> >>>>> 4174233976) 0xf3bca40 con 0xee856a0
>>> >>>>>    -15> 2014-08-13 17:52:56.594957 7f8a798c7700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 299, time: 2014-08-13 17:52:56.594874, event: header_read, op:
>>> >>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>    -14> 2014-08-13 17:52:56.594970 7f8a798c7700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 299, time: 2014-08-13 17:52:56.594880, event: throttled, op:
>>> >>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>    -13> 2014-08-13 17:52:56.594978 7f8a798c7700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 299, time: 2014-08-13 17:52:56.594917, event: all_read, op:
>>> >>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>    -12> 2014-08-13 17:52:56.594986 7f8a798c7700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 299, time: 0.000000, event: dispatched, op: osd_op(client.7512.0:1
>>> >>>>> [pgls
>>> >>>>> start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>> >>>>>    -11> 2014-08-13 17:52:56.595127 7f8a90795700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 299, time: 2014-08-13 17:52:56.595104, event: reached_pg, op:
>>> >>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>    -10> 2014-08-13 17:52:56.595159 7f8a90795700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 299, time: 2014-08-13 17:52:56.595153, event: started, op:
>>> >>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>     -9> 2014-08-13 17:52:56.602179 7f8a90795700  1 --
>>> >>>>> 10.141.8.182:6839/64670 --> 10.141.8.180:0/1018433 --
>>> >>>>> osd_op_reply(1
>>> >>>>> [pgls
>>> >>>>> start_epoch 0] v164'30654 uv30654 ondisk = 0) v6 -- ?+0 0xec16180
>>> >>>>> con
>>> >>>>> 0xee856a0
>>> >>>>>     -8> 2014-08-13 17:52:56.602211 7f8a90795700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 299, time: 2014-08-13 17:52:56.602205, event: done, op:
>>> >>>>> osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>     -7> 2014-08-13 17:52:56.614839 7f8a798c7700  1 --
>>> >>>>> 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 2
>>> >>>>> ====
>>> >>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>> >>>>> ack+read+known_if_redirected e220) v4 ==== 151+0+89 (3460833343 0
>>> >>>>> 2600845095) 0xf3bcec0 con 0xee856a0
>>> >>>>>     -6> 2014-08-13 17:52:56.614864 7f8a798c7700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 300, time: 2014-08-13 17:52:56.614789, event: header_read, op:
>>> >>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>     -5> 2014-08-13 17:52:56.614874 7f8a798c7700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 300, time: 2014-08-13 17:52:56.614792, event: throttled, op:
>>> >>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>     -4> 2014-08-13 17:52:56.614884 7f8a798c7700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 300, time: 2014-08-13 17:52:56.614835, event: all_read, op:
>>> >>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>     -3> 2014-08-13 17:52:56.614891 7f8a798c7700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 300, time: 0.000000, event: dispatched, op: osd_op(client.7512.0:2
>>> >>>>> [pgls
>>> >>>>> start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>> >>>>>     -2> 2014-08-13 17:52:56.614972 7f8a92f9a700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 300, time: 2014-08-13 17:52:56.614958, event: reached_pg, op:
>>> >>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>     -1> 2014-08-13 17:52:56.614993 7f8a92f9a700  5 -- op tracker --
>>> >>>>> ,
>>> >>>>> seq:
>>> >>>>> 300, time: 2014-08-13 17:52:56.614986, event: started, op:
>>> >>>>> osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0
>>> >>>>> ack+read+known_if_redirected e220)
>>> >>>>>      0> 2014-08-13 17:52:56.617087 7f8a92f9a700 -1
>>> >>>>> os/GenericObjectMap.cc:
>>> >>>>> In function 'int GenericObjectMap::list_objects(const coll_t&,
>>> >>>>> ghobject_t,
>>> >>>>> int, std::vector<ghobject_t>*, ghobject_t*)' thread 7f8a92f9a700
>>> >>>>> time
>>> >>>>> 2014-08-13 17:52:56.615073
>>> >>>>> os/GenericObjectMap.cc: 1118: FAILED assert(start <= header.oid)
>>> >>>>>
>>> >>>>>
>>> >>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>> >>>>>  1: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int,
>>> >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>> >>>>> ghobject_t*)+0x474)
>>> >>>>> [0x98f774]
>>> >>>>>  2: (KeyValueStore::collection_list_partial(coll_t, ghobject_t,
>>> >>>>> int,
>>> >>>>> int,
>>> >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>> >>>>> ghobject_t*)+0x274) [0x8c5b54]
>>> >>>>>  3: (PGBackend::objects_list_partial(hobject_t const&, int, int,
>>> >>>>> snapid_t,
>>> >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>> >>>>> hobject_t*)+0x1c9)
>>> >>>>> [0x862de9]
>>> >>>>>  4: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5)
>>> >>>>> [0x7f67f5]
>>> >>>>>  5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3)
>>> >>>>> [0x8177b3]
>>> >>>>>  6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>>> >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>> >>>>>  7: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>> >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)
>>> >>>>> [0x62bf8d]
>>> >>>>>  8: (OSD::ShardedOpWQ::_process(unsigned int,
>>> >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>> >>>>>  9: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>> >>>>> int)+0x8cd)
>>> >>>>> [0xa776fd]
>>> >>>>>  10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>> >>>>> [0xa79980]
>>> >>>>>  11: (()+0x7df3) [0x7f8aac71fdf3]
>>> >>>>>  12: (clone()+0x6d) [0x7f8aab1963dd]
>>> >>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> >>>>> needed
>>> >>>>> to
>>> >>>>> interpret this.
>>> >>>>>
>>> >>>>>
>>> >>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>> >>>>>  1: /usr/bin/ceph-osd() [0x99b466]
>>> >>>>>  2: (()+0xf130) [0x7f8aac727130]
>>> >>>>>  3: (gsignal()+0x39) [0x7f8aab0d5989]
>>> >>>>>  4: (abort()+0x148) [0x7f8aab0d7098]
>>> >>>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x165)
>>> >>>>> [0x7f8aab9e89d5]
>>> >>>>>  6: (()+0x5e946) [0x7f8aab9e6946]
>>> >>>>>  7: (()+0x5e973) [0x7f8aab9e6973]
>>> >>>>>  8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>> >>>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>> >>>>> const*)+0x1ef) [0xa8805f]
>>> >>>>>  10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t,
>>> >>>>> int,
>>> >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>> >>>>> ghobject_t*)+0x474)
>>> >>>>> [0x98f774]
>>> >>>>>  11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t,
>>> >>>>> int,
>>> >>>>> int,
>>> >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>> >>>>> ghobject_t*)+0x274) [0x8c5b54]
>>> >>>>>  12: (PGBackend::objects_list_partial(hobject_t const&, int, int,
>>> >>>>> snapid_t,
>>> >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>> >>>>> hobject_t*)+0x1c9)
>>> >>>>> [0x862de9]
>>> >>>>>  13:
>>> >>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5)
>>> >>>>> [0x7f67f5]
>>> >>>>>  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3)
>>> >>>>> [0x8177b3]
>>> >>>>>  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>>> >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>> >>>>>  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>> >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)
>>> >>>>> [0x62bf8d]
>>> >>>>>  17: (OSD::ShardedOpWQ::_process(unsigned int,
>>> >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>> >>>>>  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>> >>>>> int)+0x8cd)
>>> >>>>> [0xa776fd]
>>> >>>>>  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>> >>>>> [0xa79980]
>>> >>>>>  20: (()+0x7df3) [0x7f8aac71fdf3]
>>> >>>>>  21: (clone()+0x6d) [0x7f8aab1963dd]
>>> >>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> >>>>> needed
>>> >>>>> to
>>> >>>>> interpret this.
>>> >>>>>
>>> >>>>> --- begin dump of recent events ---
>>> >>>>>      0> 2014-08-13 17:52:56.714214 7f8a92f9a700 -1 *** Caught
>>> >>>>> signal
>>> >>>>> (Aborted) **
>>> >>>>>  in thread 7f8a92f9a700
>>> >>>>>
>>> >>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>> >>>>>  1: /usr/bin/ceph-osd() [0x99b466]
>>> >>>>>  2: (()+0xf130) [0x7f8aac727130]
>>> >>>>>  3: (gsignal()+0x39) [0x7f8aab0d5989]
>>> >>>>>  4: (abort()+0x148) [0x7f8aab0d7098]
>>> >>>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x165)
>>> >>>>> [0x7f8aab9e89d5]
>>> >>>>>  6: (()+0x5e946) [0x7f8aab9e6946]
>>> >>>>>  7: (()+0x5e973) [0x7f8aab9e6973]
>>> >>>>>  8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>> >>>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>> >>>>> const*)+0x1ef) [0xa8805f]
>>> >>>>>  10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t,
>>> >>>>> int,
>>> >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>> >>>>> ghobject_t*)+0x474)
>>> >>>>> [0x98f774]
>>> >>>>>  11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t,
>>> >>>>> int,
>>> >>>>> int,
>>> >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>> >>>>> ghobject_t*)+0x274) [0x8c5b54]
>>> >>>>>  12: (PGBackend::objects_list_partial(hobject_t const&, int, int,
>>> >>>>> snapid_t,
>>> >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>> >>>>> hobject_t*)+0x1c9)
>>> >>>>> [0x862de9]
>>> >>>>>  13:
>>> >>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5)
>>> >>>>> [0x7f67f5]
>>> >>>>>  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3)
>>> >>>>> [0x8177b3]
>>> >>>>>  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>>> >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>> >>>>>  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>> >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)
>>> >>>>> [0x62bf8d]
>>> >>>>>  17: (OSD::ShardedOpWQ::_process(unsigned int,
>>> >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>> >>>>>  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>> >>>>> int)+0x8cd)
>>> >>>>> [0xa776fd]
>>> >>>>>  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>> >>>>> [0xa79980]
>>> >>>>>  20: (()+0x7df3) [0x7f8aac71fdf3]
>>> >>>>>  21: (clone()+0x6d) [0x7f8aab1963dd]
>>> >>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> >>>>> needed
>>> >>>>> to
>>> >>>>> interpret this.
>>> >>>>>
>>> >>>>> I guess this has something to do with using the dev Keyvaluestore?
>>> >>>>>
>>> >>>>>
>>> >>>>> Thanks!
>>> >>>>>
>>> >>>>> Kenneth
>>> >>>>>
>>> >>>>> _______________________________________________
>>> >>>>> ceph-users mailing list
>>> >>>>> ceph-users at lists.ceph.com
>>> >>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> Best Regards,
>>> >>>>
>>> >>>> Wheat
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>> >>>
>>> >>> --
>>> >>>
>>> >>> Met vriendelijke groeten,
>>> >>> Kenneth Waegeman
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Best Regards,
>>> >>
>>> >> Wheat
>>> >
>>> >
>>> >
>>> > ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>> >
>>> > --
>>> >
>>> > Met vriendelijke groeten,
>>> > Kenneth Waegeman
>>> >
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>> Wheat
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users at lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>
>
> ----- End message from Sage Weil <sweil at redhat.com> -----
>
>
> --
>
> Met vriendelijke groeten,
> Kenneth Waegeman
>



-- 
Best Regards,

Wheat

