ceph cluster inconsistency?

----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
    Date: Tue, 19 Aug 2014 12:28:27 +0800
    From: Haomai Wang <haomaiwang at gmail.com>
Subject: Re: ceph cluster inconsistency?
      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
      Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com


> On Mon, Aug 18, 2014 at 7:32 PM, Kenneth Waegeman
> <Kenneth.Waegeman at ugent.be> wrote:
>>
>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>    Date: Mon, 18 Aug 2014 18:34:11 +0800
>>
>>    From: Haomai Wang <haomaiwang at gmail.com>
>> Subject: Re: [ceph-users] ceph cluster inconsistency?
>>      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>      Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com
>>
>>
>>
>>> On Mon, Aug 18, 2014 at 5:38 PM, Kenneth Waegeman
>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I tried this after restarting the OSD, but I guess that was not the aim:
>>>> (
>>>> # ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_| grep 6adb1100 -A 100
>>>> IO error: lock /var/lib/ceph/osd/ceph-67/current//LOCK: Resource temporarily unavailable
>>>> tools/ceph_kvstore_tool.cc: In function 'StoreTool::StoreTool(const string&)' thread 7f8fecf7d780 time 2014-08-18 11:12:29.551780
>>>> tools/ceph_kvstore_tool.cc: 38: FAILED assert(!db_ptr->open(std::cerr))
>>>> ..
>>>> )
>>>>
>>>> When I run it after bringing the OSD down, it takes a while, but it
>>>> produces no output. (When I run it without the grep, I get a huge list.)
>>>
>>>
>>> Oh, sorry about that! I made a mistake: the hash value (6adb1100) is
>>> stored reversed in leveldb.
>>> So grepping for "benchmark_data_ceph001.cubone.os_5560_object789734"
>>> should help.
>>>
>> This gives:
>>
>> [root at ceph003 ~]# ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_ | grep 5560_object789734 -A 100
>> _GHOBJTOSEQ_:3%e0s0_head!0011BDA6!!3!!benchmark_data_ceph001%ecubone%eos_5560_object789734!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011C027!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1330170!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011C6FD!!3!!benchmark_data_ceph001%ecubone%eos_4919_object227366!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011CB03!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1363631!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011CDF0!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1573957!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011D02C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1019282!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011E2B5!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1283563!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011E511!!3!!benchmark_data_ceph001%ecubone%eos_4919_object273736!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011E547!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1170628!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011EAAB!!3!!benchmark_data_ceph001%ecubone%eos_4919_object256335!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011F446!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1484196!head
>> _GHOBJTOSEQ_:3%e0s0_head!0011FC59!!3!!benchmark_data_ceph001%ecubone%eos_5560_object884178!head
>> _GHOBJTOSEQ_:3%e0s0_head!001203F3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object853746!head
>> _GHOBJTOSEQ_:3%e0s0_head!001208E3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object36633!head
>> _GHOBJTOSEQ_:3%e0s0_head!00120B37!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1235337!head
>> _GHOBJTOSEQ_:3%e0s0_head!001210B6!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1661351!head
>> _GHOBJTOSEQ_:3%e0s0_head!001210CB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object238126!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012184C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object339943!head
>> _GHOBJTOSEQ_:3%e0s0_head!00121916!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1047094!head
>> _GHOBJTOSEQ_:3%e0s0_head!001219C1!!3!!benchmark_data_ceph001%ecubone%eos_31461_object520642!head
>> _GHOBJTOSEQ_:3%e0s0_head!001222BB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object639565!head
>> _GHOBJTOSEQ_:3%e0s0_head!001223AA!!3!!benchmark_data_ceph001%ecubone%eos_4919_object231080!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012243C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object858050!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012289C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object241796!head
>> _GHOBJTOSEQ_:3%e0s0_head!00122D28!!3!!benchmark_data_ceph001%ecubone%eos_4919_object7462!head
>> _GHOBJTOSEQ_:3%e0s0_head!00122DFE!!3!!benchmark_data_ceph001%ecubone%eos_5560_object243798!head
>> _GHOBJTOSEQ_:3%e0s0_head!00122EFC!!3!!benchmark_data_ceph001%ecubone%eos_8961_object109512!head
>> _GHOBJTOSEQ_:3%e0s0_head!001232D7!!3!!benchmark_data_ceph001%ecubone%eos_31461_object653973!head
>> _GHOBJTOSEQ_:3%e0s0_head!001234A3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1378169!head
>> _GHOBJTOSEQ_:3%e0s0_head!00123714!!3!!benchmark_data_ceph001%ecubone%eos_5560_object512925!head
>> _GHOBJTOSEQ_:3%e0s0_head!001237D9!!3!!benchmark_data_ceph001%ecubone%eos_4919_object23289!head
>> _GHOBJTOSEQ_:3%e0s0_head!00123854!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1108852!head
>> _GHOBJTOSEQ_:3%e0s0_head!00123971!!3!!benchmark_data_ceph001%ecubone%eos_5560_object704026!head
>> _GHOBJTOSEQ_:3%e0s0_head!00123F75!!3!!benchmark_data_ceph001%ecubone%eos_8961_object250441!head
>> _GHOBJTOSEQ_:3%e0s0_head!00124083!!3!!benchmark_data_ceph001%ecubone%eos_31461_object706178!head
>> _GHOBJTOSEQ_:3%e0s0_head!001240FA!!3!!benchmark_data_ceph001%ecubone%eos_5560_object316952!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012447D!!3!!benchmark_data_ceph001%ecubone%eos_5560_object538734!head
>> _GHOBJTOSEQ_:3%e0s0_head!001244D9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object789215!head
>> _GHOBJTOSEQ_:3%e0s0_head!001247CD!!3!!benchmark_data_ceph001%ecubone%eos_8961_object265993!head
>> _GHOBJTOSEQ_:3%e0s0_head!00124897!!3!!benchmark_data_ceph001%ecubone%eos_31461_object610597!head
>> _GHOBJTOSEQ_:3%e0s0_head!00124BE4!!3!!benchmark_data_ceph001%ecubone%eos_31461_object691723!head
>> _GHOBJTOSEQ_:3%e0s0_head!00124C9B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1306135!head
>> _GHOBJTOSEQ_:3%e0s0_head!00124E1D!!3!!benchmark_data_ceph001%ecubone%eos_5560_object520580!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012534C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object659767!head
>> _GHOBJTOSEQ_:3%e0s0_head!00125A81!!3!!benchmark_data_ceph001%ecubone%eos_5560_object184060!head
>> _GHOBJTOSEQ_:3%e0s0_head!00125E77!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1292867!head
>> _GHOBJTOSEQ_:3%e0s0_head!00126562!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1201410!head
>> _GHOBJTOSEQ_:3%e0s0_head!00126B34!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1657326!head
>> _GHOBJTOSEQ_:3%e0s0_head!00127383!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1269787!head
>> _GHOBJTOSEQ_:3%e0s0_head!00127396!!3!!benchmark_data_ceph001%ecubone%eos_31461_object500115!head
>> _GHOBJTOSEQ_:3%e0s0_head!001277F8!!3!!benchmark_data_ceph001%ecubone%eos_31461_object394932!head
>> _GHOBJTOSEQ_:3%e0s0_head!001279DD!!3!!benchmark_data_ceph001%ecubone%eos_4919_object252963!head
>> _GHOBJTOSEQ_:3%e0s0_head!00127B40!!3!!benchmark_data_ceph001%ecubone%eos_31461_object936811!head
>> _GHOBJTOSEQ_:3%e0s0_head!00127BAC!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1481773!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012894E!!3!!benchmark_data_ceph001%ecubone%eos_5560_object999885!head
>> _GHOBJTOSEQ_:3%e0s0_head!00128D05!!3!!benchmark_data_ceph001%ecubone%eos_31461_object943667!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012908A!!3!!benchmark_data_ceph001%ecubone%eos_5560_object212990!head
>> _GHOBJTOSEQ_:3%e0s0_head!00129519!!3!!benchmark_data_ceph001%ecubone%eos_5560_object437596!head
>> _GHOBJTOSEQ_:3%e0s0_head!00129716!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1585330!head
>> _GHOBJTOSEQ_:3%e0s0_head!00129798!!3!!benchmark_data_ceph001%ecubone%eos_5560_object603505!head
>> _GHOBJTOSEQ_:3%e0s0_head!001299C9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object808800!head
>> _GHOBJTOSEQ_:3%e0s0_head!00129B7A!!3!!benchmark_data_ceph001%ecubone%eos_31461_object23193!head
>> _GHOBJTOSEQ_:3%e0s0_head!00129B9A!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1158397!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012A932!!3!!benchmark_data_ceph001%ecubone%eos_5560_object542450!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012B77A!!3!!benchmark_data_ceph001%ecubone%eos_8961_object195480!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012BE8C!!3!!benchmark_data_ceph001%ecubone%eos_4919_object312911!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012BF74!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1563783!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012C65C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1123980!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012C6FE!!3!!benchmark_data_ceph001%ecubone%eos_3411_object913!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012CCAD!!3!!benchmark_data_ceph001%ecubone%eos_31461_object400863!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012CDBB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object789667!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012D14B!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1020723!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012D95B!!3!!benchmark_data_ceph001%ecubone%eos_8961_object106293!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012E3C8!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1355526!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012E5B3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1491348!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012F2BB!!3!!benchmark_data_ceph001%ecubone%eos_8961_object338872!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012F374!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1337264!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012FBE5!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1512395!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012FCE3!!3!!benchmark_data_ceph001%ecubone%eos_8961_object298610!head
>> _GHOBJTOSEQ_:3%e0s0_head!0012FEB6!!3!!benchmark_data_ceph001%ecubone%eos_4919_object120824!head
>> _GHOBJTOSEQ_:3%e0s0_head!001301CA!!3!!benchmark_data_ceph001%ecubone%eos_5560_object816326!head
>> _GHOBJTOSEQ_:3%e0s0_head!00130263!!3!!benchmark_data_ceph001%ecubone%eos_5560_object777163!head
>> _GHOBJTOSEQ_:3%e0s0_head!00130529!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1413173!head
>> _GHOBJTOSEQ_:3%e0s0_head!001317D9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object809510!head
>> _GHOBJTOSEQ_:3%e0s0_head!0013204F!!3!!benchmark_data_ceph001%ecubone%eos_31461_object471416!head
>> _GHOBJTOSEQ_:3%e0s0_head!00132400!!3!!benchmark_data_ceph001%ecubone%eos_5560_object695087!head
>> _GHOBJTOSEQ_:3%e0s0_head!00132A19!!3!!benchmark_data_ceph001%ecubone%eos_31461_object591945!head
>> _GHOBJTOSEQ_:3%e0s0_head!00132BF8!!3!!benchmark_data_ceph001%ecubone%eos_31461_object302000!head
>> _GHOBJTOSEQ_:3%e0s0_head!00132F5B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1645443!head
>> _GHOBJTOSEQ_:3%e0s0_head!00133B8B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object761911!head
>> _GHOBJTOSEQ_:3%e0s0_head!0013433E!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1467727!head
>> _GHOBJTOSEQ_:3%e0s0_head!00134446!!3!!benchmark_data_ceph001%ecubone%eos_31461_object791960!head
>> _GHOBJTOSEQ_:3%e0s0_head!00134678!!3!!benchmark_data_ceph001%ecubone%eos_31461_object677078!head
>> _GHOBJTOSEQ_:3%e0s0_head!00134A96!!3!!benchmark_data_ceph001%ecubone%eos_31461_object254923!head
>> _GHOBJTOSEQ_:3%e0s0_head!001355D0!!3!!benchmark_data_ceph001%ecubone%eos_31461_object321528!head
>> _GHOBJTOSEQ_:3%e0s0_head!00135690!!3!!benchmark_data_ceph001%ecubone%eos_4919_object36935!head
>> _GHOBJTOSEQ_:3%e0s0_head!00135B62!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1228272!head
>> _GHOBJTOSEQ_:3%e0s0_head!00135C72!!3!!benchmark_data_ceph001%ecubone%eos_4812_object2180!head
>> _GHOBJTOSEQ_:3%e0s0_head!00135DEE!!3!!benchmark_data_ceph001%ecubone%eos_5560_object425705!head
>> _GHOBJTOSEQ_:3%e0s0_head!00136366!!3!!benchmark_data_ceph001%ecubone%eos_5560_object141569!head
>> _GHOBJTOSEQ_:3%e0s0_head!00136371!!3!!benchmark_data_ceph001%ecubone%eos_5560_object564213!head
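
The first row of this listing shows how a key is put together: 0011BDA6 is the hash 6adb1100 with its hex digits in reverse order, and "%e" stands for ".". A minimal Python sketch that decodes one such row follows. The field meanings (collection, pool, snapshot) are guesses from the visible layout, not taken from the Ceph source, and only the "%e" escape seen in this thread is handled.

```python
def decode_key(row):
    """Split one _GHOBJTOSEQ_ row into its visible parts (layout is assumed)."""
    prefix, rest = row.split(":", 1)
    fields = rest.split("!")
    return {
        "prefix": prefix,
        # "%e" is the only escape visible in the thread; others may exist.
        "collection": fields[0].replace("%e", "."),
        # The hash appears with its hex digits reversed, so reverse it back.
        "hash": fields[1][::-1].lower(),
        "pool": fields[3],   # assumed meaning of this field
        "name": fields[5].replace("%e", "."),
        "snap": fields[6],   # assumed meaning of this field
    }


row = ("_GHOBJTOSEQ_:3%e0s0_head!0011BDA6!!3!!"
       "benchmark_data_ceph001%ecubone%eos_5560_object789734!head")
key = decode_key(row)
print(key["hash"], key["name"])
```

This also suggests why grepping the raw listing for 6adb1100 finds nothing, while grepping for part of the object name does.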
>>
>>
>>
>
> The 100 rows look correct to me. I found that the minimum number of
> objects per listing is 1024. Could you please run
> "ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list
> _GHOBJTOSEQ_| grep 6adb1100 -A 1024"

I have attached the output.

>
>>>>
>>>> Or should I run this immediately after the OSD has crashed (because it
>>>> may have rebalanced)? I have already restarted the cluster.
>>>>
>>>>
>>>> I don't know if it is related, but before I could do all that, I had to
>>>> fix something else: a monitor ran out of disk space, using 8 GB for its
>>>> store.db folder (lots of sst files). The other monitors are also near
>>>> that level. I never had that problem on previous setups. I recreated a
>>>> monitor and now it uses 3.8 GB.
>>>
>>>
>>> There is some duplicate data that needs to be compacted.
>>>>
>>>>
>>>
>>> Another idea: maybe you can align KeyValueStore's strip size with the
>>> EC stripe size.
>>
>> How can I do that? Is there some documentation about that?
>
>> ceph --show-config | grep keyvaluestore
> debug_keyvaluestore = 0/0
> keyvaluestore_queue_max_ops = 50
> keyvaluestore_queue_max_bytes = 104857600
> keyvaluestore_debug_check_backend = false
> keyvaluestore_op_threads = 2
> keyvaluestore_op_thread_timeout = 60
> keyvaluestore_op_thread_suicide_timeout = 180
> keyvaluestore_default_strip_size = 4096
> keyvaluestore_max_expected_write_size = 16777216
> keyvaluestore_header_cache_size = 4096
> keyvaluestore_backend = leveldb
>
> keyvaluestore_default_strip_size is the one you want.
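
The thread does not say which value the strip size should be aligned to. One way to compare the numbers, sketched here in Python, is to work out the per-OSD chunk size from the pool's stripe_width (4096) and the profile's k (8), both quoted elsewhere in this thread, and check it against the strip size. Treating stripe_width / k as the chunk size is the usual erasure-coding relation, but whether the chunk or the full stripe is the right target is an open question here, and the helper name is made up for illustration.

```python
def ec_chunk_size(stripe_width, k):
    """Bytes each data chunk gets per stripe (assumes stripe_width = k * chunk)."""
    if stripe_width % k:
        raise ValueError("stripe_width is not a multiple of k")
    return stripe_width // k


stripe_width = 4096   # from the 'ecdata' pool line quoted in this thread
k = 8                 # from erasure-code profile 'profile11'
strip_size = 4096     # keyvaluestore_default_strip_size shown above

chunk = ec_chunk_size(stripe_width, k)
# True when a whole number of chunks fits into one KeyValueStore strip.
aligned = strip_size % chunk == 0
print(chunk, aligned)
```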
>
>>
>>
>>> I haven't thought about it deeply; maybe I will try it later.
>>>
>>>> Thanks!
>>>>
>>>> Kenneth
>>>>
>>>>
>>>>
>>>> ----- Message from Sage Weil <sweil at redhat.com> ---------
>>>>    Date: Fri, 15 Aug 2014 06:10:34 -0700 (PDT)
>>>>    From: Sage Weil <sweil at redhat.com>
>>>>
>>>> Subject: Re: [ceph-users] ceph cluster inconsistency?
>>>>      To: Haomai Wang <haomaiwang at gmail.com>
>>>>      Cc: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>,
>>>> ceph-users at lists.ceph.com
>>>>
>>>>
>>>>
>>>>> On Fri, 15 Aug 2014, Haomai Wang wrote:
>>>>>>
>>>>>>
>>>>>> Hi Kenneth,
>>>>>>
>>>>>> I can't find useful information in your logs; they lack the necessary
>>>>>> debug output from the code path that crashes.
>>>>>>
>>>>>> But I scanned the encode/decode implementation in GenericObjectMap and
>>>>>> found something bad.
>>>>>>
>>>>>> For example, two oids have the same hash and their names are:
>>>>>> A: "rb.data.123"
>>>>>> B: "rb-123"
>>>>>>
>>>>>> At the ghobject_t comparison level, A > B ("." sorts after "-"). But
>>>>>> GenericObjectMap encodes "." as "%e", so the keys in the DB are:
>>>>>> A: _GHOBJTOSEQ_:blah!51615000!!none!!rb%edata%e123!head
>>>>>> B: _GHOBJTOSEQ_:blah!51615000!!none!!rb-123!head
>>>>>>
>>>>>> A < B ("%" sorts before "-")
>>>>>>
>>>>>> So it seems that the escape function is useless and should be
>>>>>> disabled.
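
Whichever of the two names is called the smaller one, the point of the example is that escaping changes their relative order. A minimal Python sketch of that effect follows. It assumes plain byte-wise string comparison as a stand-in for the ghobject_t name comparison, and the escape helper handles only the "." to "%e" mapping mentioned in this message; the real function likely escapes more characters.

```python
def escape(name):
    """Hypothetical stand-in for the GenericObjectMap escape: '.' -> '%e' only."""
    return name.replace(".", "%e")


a, b = "rb.data.123", "rb-123"

# Order of the names as plain strings ('.' is 0x2E, '-' is 0x2D).
raw_order = sorted([a, b])
# Order of the same names by their escaped DB keys ('%' is 0x25).
key_order = sorted([a, b], key=escape)

print("by name:", raw_order)
print("by key: ", key_order)
print("order flipped:", raw_order != key_order)
```

A listing that walks the DB in key order while asserting on object order can trip over exactly this kind of flip.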
>>>>>>
>>>>>> I'm not sure whether Kenneth's problem is hitting this bug, because
>>>>>> this situation only occurs when the object set is very large and two
>>>>>> objects end up with the same hash value.
>>>>>>
>>>>>> Kenneth, could you find time to run "ceph-kvstore-tool [path-to-osd] list
>>>>>> _GHOBJTOSEQ_| grep 6adb1100 -A 100"? ceph-kvstore-tool is a debug tool
>>>>>> that can be compiled from source. You can clone the ceph repo and run
>>>>>> "./autogen.sh; ./configure; cd src; make ceph-kvstore-tool".
>>>>>> "path-to-osd" should be "/var/lib/ceph/osd/ceph-[id]/current/". "6adb1100"
>>>>>> is from your verbose log, and the next 100 rows should show the necessary
>>>>>> information.
>>>>>
>>>>>
>>>>>
>>>>> You can also get ceph-kvstore-tool from the 'ceph-tests' package.
>>>>>
>>>>>> Hi Sage, do you think we need to provide an upgrade function to fix
>>>>>> it?
>>>>>
>>>>>
>>>>>
>>>>> Hmm, we might.  This only affects the key/value encoding right?  The
>>>>> FileStore is using its own function to map these to file names?
>>>>>
>>>>> Can you open a ticket in the tracker for this?
>>>>>
>>>>> Thanks!
>>>>> sage
>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 14, 2014 at 7:36 PM, Kenneth Waegeman
>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>
>>>>>> >
>>>>>> > ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>>>>> >    Date: Thu, 14 Aug 2014 19:11:55 +0800
>>>>>> >
>>>>>> >    From: Haomai Wang <haomaiwang at gmail.com>
>>>>>> > Subject: Re: [ceph-users] ceph cluster inconsistency?
>>>>>> >      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>>>> >
>>>>>> >
>>>>>> >> Could you add the config "debug_keyvaluestore = 20/20" to the crashed
>>>>>> >> OSD and replay the command that causes the crash?
>>>>>> >>
>>>>>> >> I would like to get more debug information! Thanks.
>>>>>> >
>>>>>> >
>>>>>> > I have attached the log!
>>>>>> > Thanks!
>>>>>> >
>>>>>> >>
>>>>>> >> On Thu, Aug 14, 2014 at 4:41 PM, Kenneth Waegeman
>>>>>> >> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> I have:
>>>>>> >>> osd_objectstore = keyvaluestore-dev
>>>>>> >>>
>>>>>> >>> in the global section of my ceph.conf
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> [root at ceph002 ~]# ceph osd erasure-code-profile get profile11
>>>>>> >>> directory=/usr/lib64/ceph/erasure-code
>>>>>> >>> k=8
>>>>>> >>> m=3
>>>>>> >>> plugin=jerasure
>>>>>> >>> ruleset-failure-domain=osd
>>>>>> >>> technique=reed_sol_van
>>>>>> >>>
>>>>>> >>> the ecdata pool has this as profile
>>>>>> >>>
>>>>>> >>> pool 3 'ecdata' erasure size 11 min_size 8 crush_ruleset 2 object_hash rjenkins pg_num 128 pgp_num 128 last_change 161 flags hashpspool stripe_width 4096
>>>>>> >>>
>>>>>> >>> ECrule in crushmap
>>>>>> >>>
>>>>>> >>> rule ecdata {
>>>>>> >>>         ruleset 2
>>>>>> >>>         type erasure
>>>>>> >>>         min_size 3
>>>>>> >>>         max_size 20
>>>>>> >>>         step set_chooseleaf_tries 5
>>>>>> >>>         step take default-ec
>>>>>> >>>         step choose indep 0 type osd
>>>>>> >>>         step emit
>>>>>> >>> }
>>>>>> >>> root default-ec {
>>>>>> >>>         id -8           # do not change unnecessarily
>>>>>> >>>         # weight 140.616
>>>>>> >>>         alg straw
>>>>>> >>>         hash 0  # rjenkins1
>>>>>> >>>         item ceph001-ec weight 46.872
>>>>>> >>>         item ceph002-ec weight 46.872
>>>>>> >>>         item ceph003-ec weight 46.872
>>>>>> >>> ...
>>>>>> >>>
>>>>>> >>> Cheers!
>>>>>> >>> Kenneth
>>>>>> >>>
>>>>>> >>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>>>>> >>>    Date: Thu, 14 Aug 2014 10:07:50 +0800
>>>>>> >>>    From: Haomai Wang <haomaiwang at gmail.com>
>>>>>> >>> Subject: Re: [ceph-users] ceph cluster inconsistency?
>>>>>> >>>      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>>>> >>>      Cc: ceph-users <ceph-users at lists.ceph.com>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>> Hi Kenneth,
>>>>>> >>>>
>>>>>> >>>> Could you share your configuration related to EC and KeyValueStore?
>>>>>> >>>> I'm not sure whether it's a bug in KeyValueStore.
>>>>>> >>>>
>>>>>> >>>> On Thu, Aug 14, 2014 at 12:06 AM, Kenneth Waegeman
>>>>>> >>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> Hi,
>>>>>> >>>>>
>>>>>> >>>>> I was doing some tests with rados bench on an erasure-coded pool
>>>>>> >>>>> (using the keyvaluestore-dev objectstore) on 0.83, and I see some
>>>>>> >>>>> strange things:
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> [root at ceph001 ~]# ceph status
>>>>>> >>>>>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>>>>> >>>>>      health HEALTH_WARN too few pgs per osd (4 < min 20)
>>>>>> >>>>>      monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election epoch 6, quorum 0,1,2 ceph001,ceph002,ceph003
>>>>>> >>>>>      mdsmap e116: 1/1/1 up {0=ceph001.cubone.os=up:active}, 2 up:standby
>>>>>> >>>>>      osdmap e292: 78 osds: 78 up, 78 in
>>>>>> >>>>>       pgmap v48873: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>>>>>> >>>>>             1381 GB used, 129 TB / 131 TB avail
>>>>>> >>>>>                  320 active+clean
>>>>>> >>>>>
>>>>>> >>>>> There is around 15 TB of data, but only 1.3 TB of usage.
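
To put a rough number on the mismatch: with the k=8, m=3 profile quoted elsewhere in this thread, each stored byte should occupy about (k+m)/k bytes of raw space. The Python sketch below applies that to the figures from the ceph status output. It ignores any backend overhead or compression, so it only gives a ballpark, and the function name is made up for illustration.

```python
def expected_raw_gb(data_gb, k, m):
    """Raw space an erasure-coded pool would need for data_gb of user data."""
    return data_gb * (k + m) / k


data_gb = 15366   # "15366 GB data" from ceph status
used_gb = 1381    # "1381 GB used" from ceph status

expected = expected_raw_gb(data_gb, k=8, m=3)
print(f"expected raw usage: {expected:.2f} GB")
print(f"reported usage is {used_gb / expected:.1%} of that")
```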
>>>>>> >>>>>
>>>>>> >>>>> This is also visible in rados:
>>>>>> >>>>>
>>>>>> >>>>> [root at ceph001 ~]# rados df
>>>>>> >>>>> pool name  category           KB  objects  clones  degraded  unfound  rd  rd KB       wr        wr KB
>>>>>> >>>>> data       -                   0        0       0         0        0   0      0        0            0
>>>>>> >>>>> ecdata     -         16113451009  3933959       0         0        0   1      1  3935632  16116850711
>>>>>> >>>>> metadata   -                   2       20       0         0        0  33     36       21            8
>>>>>> >>>>> rbd        -                   0        0       0         0        0   0      0        0            0
>>>>>> >>>>>   total used      1448266016      3933979
>>>>>> >>>>>   total avail   139400181016
>>>>>> >>>>>   total space   140848447032
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> Another (related?) thing: if I run rados -p ecdata ls, I trigger OSD
>>>>>> >>>>> shutdowns (each time). I get a list followed by an error:
>>>>>> >>>>>
>>>>>> >>>>> ...
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_8961_object243839
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object801983
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object856489
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_8961_object202232
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_4919_object33199
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object807797
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_4919_object74729
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object1264121
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object1318513
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object1202111
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object939107
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object729682
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object122915
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object76521
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object113261
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_31461_object575079
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object671042
>>>>>> >>>>> benchmark_data_ceph001.cubone.os_5560_object381146
>>>>>> >>>>> 2014-08-13 17:57:48.736150 7f65047b5700  0 -- 10.141.8.180:0/1023295 >> 10.141.8.182:6839/4471 pipe(0x7f64fc019b20 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f64fc019db0).fault
>>>>>> >>>>>
>>>>>> >>>>> And I can see this in the log files:
>>>>>> >>>>>
>>>>>> >>>>>    -25> 2014-08-13 17:52:56.323908 7f8a97fa4700  1 -- 10.143.8.182:6827/64670 <== osd.57 10.141.8.182:0/15796 51 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0 (3227325175 0 0) 0xf475940 con 0xee89fa0
>>>>>> >>>>>    -24> 2014-08-13 17:52:56.323938 7f8a97fa4700  1 -- 10.143.8.182:6827/64670 --> 10.141.8.182:0/15796 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf815b00 con 0xee89fa0
>>>>>> >>>>>    -23> 2014-08-13 17:52:56.324078 7f8a997a7700  1 -- 10.141.8.182:6840/64670 <== osd.57 10.141.8.182:0/15796 51 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0 (3227325175 0 0) 0xf132bc0 con 0xee8a680
>>>>>> >>>>>    -22> 2014-08-13 17:52:56.324111 7f8a997a7700  1 -- 10.141.8.182:6840/64670 --> 10.141.8.182:0/15796 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf811a40 con 0xee8a680
>>>>>> >>>>>    -21> 2014-08-13 17:52:56.584461 7f8a997a7700  1 -- 10.141.8.182:6840/64670 <== osd.29 10.143.8.181:0/12142 47 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0 (3355887204 0 0) 0xf655940 con 0xee88b00
>>>>>> >>>>>    -20> 2014-08-13 17:52:56.584486 7f8a997a7700  1 -- 10.141.8.182:6840/64670 --> 10.143.8.181:0/12142 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf132bc0 con 0xee88b00
>>>>>> >>>>>    -19> 2014-08-13 17:52:56.584498 7f8a97fa4700  1 -- 10.143.8.182:6827/64670 <== osd.29 10.143.8.181:0/12142 47 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0 (3355887204 0 0) 0xf20e040 con 0xee886e0
>>>>>> >>>>>    -18> 2014-08-13 17:52:56.584526 7f8a97fa4700  1 -- 10.143.8.182:6827/64670 --> 10.143.8.181:0/12142 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf475940 con 0xee886e0
>>>>>> >>>>>    -17> 2014-08-13 17:52:56.594448 7f8a798c7700  1 -- 10.141.8.182:6839/64670 >> :/0 pipe(0xec15f00 sd=74 :6839 s=0 pgs=0 cs=0 l=0 c=0xee856a0).accept sd=74 10.141.8.180:47641/0
>>>>>> >>>>>    -16> 2014-08-13 17:52:56.594921 7f8a798c7700  1 -- 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 1 ==== osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220) v4 ==== 151+0+39 (1972163119 0 4174233976) 0xf3bca40 con 0xee856a0
>>>>>> >>>>>    -15> 2014-08-13 17:52:56.594957 7f8a798c7700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594874, event: header_read, op: osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>    -14> 2014-08-13 17:52:56.594970 7f8a798c7700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594880, event: throttled, op: osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>    -13> 2014-08-13 17:52:56.594978 7f8a798c7700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594917, event: all_read, op: osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>    -12> 2014-08-13 17:52:56.594986 7f8a798c7700  5 -- op tracker -- , seq: 299, time: 0.000000, event: dispatched, op: osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>    -11> 2014-08-13 17:52:56.595127 7f8a90795700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.595104, event: reached_pg, op: osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>    -10> 2014-08-13 17:52:56.595159 7f8a90795700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.595153, event: started, op: osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>     -9> 2014-08-13 17:52:56.602179 7f8a90795700  1 -- 10.141.8.182:6839/64670 --> 10.141.8.180:0/1018433 -- osd_op_reply(1 [pgls start_epoch 0] v164'30654 uv30654 ondisk = 0) v6 -- ?+0 0xec16180 con 0xee856a0
>>>>>> >>>>>     -8> 2014-08-13 17:52:56.602211 7f8a90795700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.602205, event: done, op: osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>     -7> 2014-08-13 17:52:56.614839 7f8a798c7700  1 -- 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 2 ==== osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220) v4 ==== 151+0+89 (3460833343 0 2600845095) 0xf3bcec0 con 0xee856a0
>>>>>> >>>>>     -6> 2014-08-13 17:52:56.614864 7f8a798c7700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614789, event: header_read, op: osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>     -5> 2014-08-13 17:52:56.614874 7f8a798c7700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614792, event: throttled, op: osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>     -4> 2014-08-13 17:52:56.614884 7f8a798c7700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614835, event: all_read, op: osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>     -3> 2014-08-13 17:52:56.614891 7f8a798c7700  5 -- op tracker -- , seq: 300, time: 0.000000, event: dispatched, op: osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>     -2> 2014-08-13 17:52:56.614972 7f8a92f9a700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614958, event: reached_pg, op: osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>     -1> 2014-08-13 17:52:56.614993 7f8a92f9a700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614986, event: started, op: osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>> >>>>>      0> 2014-08-13 17:52:56.617087 7f8a92f9a700 -1 os/GenericObjectMap.cc: In function 'int GenericObjectMap::list_objects(const coll_t&, ghobject_t, int, std::vector<ghobject_t>*, ghobject_t*)' thread 7f8a92f9a700 time 2014-08-13 17:52:56.615073
>>>>>> >>>>> os/GenericObjectMap.cc: 1118: FAILED assert(start <= header.oid)
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>> >>>>>  1: (GenericObjectMap::list_objects(coll_t const&, ghobject_t,
>>>>>> >>>>> int,
>>>>>> >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>> >>>>> ghobject_t*)+0x474)
>>>>>> >>>>> [0x98f774]
>>>>>> >>>>>  2: (KeyValueStore::collection_list_partial(coll_t, ghobject_t,
>>>>>> >>>>> int,
>>>>>> >>>>> int,
>>>>>> >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>> >>>>> ghobject_t*)+0x274) [0x8c5b54]
>>>>>> >>>>>  3: (PGBackend::objects_list_partial(hobject_t const&, int, int,
>>>>>> >>>>> snapid_t,
>>>>>> >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>>>>> >>>>> hobject_t*)+0x1c9)
>>>>>> >>>>> [0x862de9]
>>>>>> >>>>>  4:
>>>>>> >>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5)
>>>>>> >>>>> [0x7f67f5]
>>>>>> >>>>>  5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3)
>>>>>> >>>>> [0x8177b3]
>>>>>> >>>>>  6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>>>>>> >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>> >>>>>  7: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>> >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)
>>>>>> >>>>> [0x62bf8d]
>>>>>> >>>>>  8: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>> >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>> >>>>>  9: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>>>>> >>>>> int)+0x8cd)
>>>>>> >>>>> [0xa776fd]
>>>>>> >>>>>  10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>>>>> >>>>> [0xa79980]
>>>>>> >>>>>  11: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>> >>>>>  12: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>> >>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>> >>>>>  1: /usr/bin/ceph-osd() [0x99b466]
>>>>>> >>>>>  2: (()+0xf130) [0x7f8aac727130]
>>>>>> >>>>>  3: (gsignal()+0x39) [0x7f8aab0d5989]
>>>>>> >>>>>  4: (abort()+0x148) [0x7f8aab0d7098]
>>>>>> >>>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x165)
>>>>>> >>>>> [0x7f8aab9e89d5]
>>>>>> >>>>>  6: (()+0x5e946) [0x7f8aab9e6946]
>>>>>> >>>>>  7: (()+0x5e973) [0x7f8aab9e6973]
>>>>>> >>>>>  8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>>>>> >>>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>> >>>>> const*)+0x1ef) [0xa8805f]
>>>>>> >>>>>  10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t,
>>>>>> >>>>> int,
>>>>>> >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>> >>>>> ghobject_t*)+0x474)
>>>>>> >>>>> [0x98f774]
>>>>>> >>>>>  11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t,
>>>>>> >>>>> int,
>>>>>> >>>>> int,
>>>>>> >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>> >>>>> ghobject_t*)+0x274) [0x8c5b54]
>>>>>> >>>>>  12: (PGBackend::objects_list_partial(hobject_t const&, int, int,
>>>>>> >>>>> snapid_t,
>>>>>> >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>>>>> >>>>> hobject_t*)+0x1c9)
>>>>>> >>>>> [0x862de9]
>>>>>> >>>>>  13:
>>>>>> >>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5)
>>>>>> >>>>> [0x7f67f5]
>>>>>> >>>>>  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3)
>>>>>> >>>>> [0x8177b3]
>>>>>> >>>>>  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>>>>>> >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>> >>>>>  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>> >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)
>>>>>> >>>>> [0x62bf8d]
>>>>>> >>>>>  17: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>> >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>> >>>>>  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>>>>> >>>>> int)+0x8cd)
>>>>>> >>>>> [0xa776fd]
>>>>>> >>>>>  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>>>>> >>>>> [0xa79980]
>>>>>> >>>>>  20: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>> >>>>>  21: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>> >>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>> >>>>>
>>>>>> >>>>> --- begin dump of recent events ---
>>>>>> >>>>>      0> 2014-08-13 17:52:56.714214 7f8a92f9a700 -1 *** Caught signal (Aborted) **
>>>>>> >>>>>  in thread 7f8a92f9a700
>>>>>> >>>>>
>>>>>> >>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>> >>>>>  1: /usr/bin/ceph-osd() [0x99b466]
>>>>>> >>>>>  2: (()+0xf130) [0x7f8aac727130]
>>>>>> >>>>>  3: (gsignal()+0x39) [0x7f8aab0d5989]
>>>>>> >>>>>  4: (abort()+0x148) [0x7f8aab0d7098]
>>>>>> >>>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x165)
>>>>>> >>>>> [0x7f8aab9e89d5]
>>>>>> >>>>>  6: (()+0x5e946) [0x7f8aab9e6946]
>>>>>> >>>>>  7: (()+0x5e973) [0x7f8aab9e6973]
>>>>>> >>>>>  8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>>>>> >>>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>> >>>>> const*)+0x1ef) [0xa8805f]
>>>>>> >>>>>  10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t,
>>>>>> >>>>> int,
>>>>>> >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>> >>>>> ghobject_t*)+0x474)
>>>>>> >>>>> [0x98f774]
>>>>>> >>>>>  11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t,
>>>>>> >>>>> int,
>>>>>> >>>>> int,
>>>>>> >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,
>>>>>> >>>>> ghobject_t*)+0x274) [0x8c5b54]
>>>>>> >>>>>  12: (PGBackend::objects_list_partial(hobject_t const&, int, int,
>>>>>> >>>>> snapid_t,
>>>>>> >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*,
>>>>>> >>>>> hobject_t*)+0x1c9)
>>>>>> >>>>> [0x862de9]
>>>>>> >>>>>  13:
>>>>>> >>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5)
>>>>>> >>>>> [0x7f67f5]
>>>>>> >>>>>  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3)
>>>>>> >>>>> [0x8177b3]
>>>>>> >>>>>  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
>>>>>> >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>> >>>>>  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
>>>>>> >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)
>>>>>> >>>>> [0x62bf8d]
>>>>>> >>>>>  17: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>> >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>> >>>>>  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>>>>> >>>>> int)+0x8cd)
>>>>>> >>>>> [0xa776fd]
>>>>>> >>>>>  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>>>>> >>>>> [0xa79980]
>>>>>> >>>>>  20: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>> >>>>>  21: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>> >>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>> >>>>>
>>>>>> >>>>> I guess this has something to do with using the dev KeyValueStore backend?
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> Thanks!
>>>>>> >>>>>
>>>>>> >>>>> Kenneth
>>>>>> >>>>>
>>>>>> >>>>> _______________________________________________
>>>>>> >>>>> ceph-users mailing list
>>>>>> >>>>> ceph-users at lists.ceph.com
>>>>>> >>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>> >>>>
>>>>>> >>>>
>>>>>> >>>>
>>>>>> >>>>
>>>>>> >>>>
>>>>>> >>>> --
>>>>>> >>>> Best Regards,
>>>>>> >>>>
>>>>>> >>>> Wheat
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>>
>>>>>> >>> Kind regards,
>>>>>> >>> Kenneth Waegeman
>>>>>> >>>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> --
>>>>>> >> Best Regards,
>>>>>> >>
>>>>>> >> Wheat
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Kind regards,
>>>>>> > Kenneth Waegeman
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>>
>>>>>> Wheat
>>>>>>
>>>>>>
>>>>
>>>>
>>>> ----- End message from Sage Weil <sweil at redhat.com> -----
>>>>
>>>>
>>>> --
>>>>
>>>> Kind regards,
>>>> Kenneth Waegeman
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>> Wheat
>>
>>
>>
>> ----- End message from Haomai Wang <haomaiwang at gmail.com> -----
>>
>> --
>>
>> Kind regards,
>> Kenneth Waegeman
>>
>>
>
>
>
> --
> Best Regards,
>
> Wheat


----- End message from Haomai Wang <haomaiwang at gmail.com> -----

-- 

Kind regards,
Kenneth Waegeman

-------------- next part --------------
A non-text attachment was scrubbed...
Name: os_5560_object789734
Type: application/octet-stream
Size: 98984 bytes
Desc: not available
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140819/d055fdda/attachment.obj>

