Hi,

I reinstalled the cluster with 0.84 and tried running rados bench again on an erasure-coded pool on KeyValueStore. Nothing crashed this time, but when I check the status:

     health HEALTH_ERR 128 pgs inconsistent; 128 scrub errors; too few pgs per osd (15 < min 20)
     monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election epoch 8, quorum 0,1,2 ceph001,ceph002,ceph003
     osdmap e174: 78 osds: 78 up, 78 in
      pgmap v147680: 1216 pgs, 3 pools, 14758 GB data, 3690 kobjects
            1753 GB used, 129 TB / 131 TB avail
                1088 active+clean
                 128 active+clean+inconsistent

The 128 inconsistent PGs are ALL the PGs of the EC pool on the KV store (the other pools are on FileStore).

The only thing I can see in the logs is that after the rados tests it starts scrubbing, and for each KV PG I get something like this:

2014-08-31 11:14:09.050747 osd.11 10.141.8.180:6833/61098 4 : [ERR] 2.3s0 scrub stat mismatch, got 28164/29291 objects, 0/0 clones, 28164/29291 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, 118128377856/122855358464 bytes.

What could the problem be here?

Thanks again!!

Kenneth

----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
   Date: Tue, 26 Aug 2014 17:11:43 +0800
   From: Haomai Wang <haomaiwang at gmail.com>
Subject: Re: ceph cluster inconsistency?
     To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
     Cc: ceph-users at lists.ceph.com

> Hmm, it looks like you hit this bug (http://tracker.ceph.com/issues/9223).
>
> Sorry for the late message; I forgot that this fix was merged into 0.84.
>
> Thanks for your patience :-)
>
> On Tue, Aug 26, 2014 at 4:39 PM, Kenneth Waegeman
> <Kenneth.Waegeman at ugent.be> wrote:
>>
>> Hi,
>>
>> In the meantime I tried upgrading the cluster to 0.84 to see if that
>> made a difference, and it seems it did:
>> I can't reproduce the crashing OSDs by doing a 'rados -p ecdata ls' anymore.
>>
>> But now the cluster detects it is inconsistent:
>>
>>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>      health HEALTH_ERR 40 pgs inconsistent; 40 scrub errors; too few pgs per osd (4 < min 20); mon.ceph002 low disk space
>>      monmap e3: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election epoch 30, quorum 0,1,2 ceph001,ceph002,ceph003
>>      mdsmap e78951: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3 up:standby
>>      osdmap e145384: 78 osds: 78 up, 78 in
>>       pgmap v247095: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>>             1502 GB used, 129 TB / 131 TB avail
>>                  279 active+clean
>>                   40 active+clean+inconsistent
>>                    1 active+clean+scrubbing+deep
>>
>> I tried to do ceph pg repair for all the inconsistent PGs:
>>
>>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>      health HEALTH_ERR 40 pgs inconsistent; 1 pgs repair; 40 scrub errors; too few pgs per osd (4 < min 20); mon.ceph002 low disk space
>>      monmap e3: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election epoch 30, quorum 0,1,2 ceph001,ceph002,ceph003
>>      mdsmap e79486: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3 up:standby
>>      osdmap e146452: 78 osds: 78 up, 78 in
>>       pgmap v248520: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>>             1503 GB used, 129 TB / 131 TB avail
>>                  279 active+clean
>>                   39 active+clean+inconsistent
>>                    1 active+clean+scrubbing+deep
>>                    1 active+clean+scrubbing+deep+inconsistent+repair
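>>
>> (For reference, one way to issue a repair for every PG currently flagged
>> inconsistent is a small loop like this -- a sketch, assuming the ceph CLI
>> can reach a monitor and that pgs_brief prints pgid and state as the first
>> two columns:)
>>
>>   for pg in $(ceph pg dump pgs_brief 2>/dev/null | awk '$2 ~ /inconsistent/ {print $1}'); do
>>       ceph pg repair "$pg"
>>   done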
>>
>> I let it recover through the night, but this morning the mons were all
>> gone, with nothing to see in the log files. The OSDs were all still up!
>>
>>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>      health HEALTH_ERR 36 pgs inconsistent; 1 pgs repair; 36 scrub errors; too few pgs per osd (4 < min 20)
>>      monmap e7: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election epoch 44, quorum 0,1,2 ceph001,ceph002,ceph003
>>      mdsmap e109481: 1/1/1 up {0=ceph003.cubone.os=up:active}, 3 up:standby
>>      osdmap e203410: 78 osds: 78 up, 78 in
>>       pgmap v331747: 320 pgs, 4 pools, 15251 GB data, 3812 kobjects
>>             1547 GB used, 129 TB / 131 TB avail
>>                    1 active+clean+scrubbing+deep+inconsistent+repair
>>                  284 active+clean
>>                   35 active+clean+inconsistent
>>
>> I have restarted the monitors now; I will let you know when I see
>> something more.
>>
>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>    Date: Sun, 24 Aug 2014 12:51:41 +0800
>>    From: Haomai Wang <haomaiwang at gmail.com>
>> Subject: Re: ceph cluster inconsistency?
>>      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>, ceph-users at lists.ceph.com
>>
>>> It's really strange! I wrote a test program following the key ordering
>>> you provided and parsed the corresponding values. The ordering held!
>>>
>>> I have no idea now. If you have time, could you add this debug code to
>>> "src/os/GenericObjectMap.cc", inserted *before* "assert(start <= header.oid);":
>>>
>>>   dout(0) << "start: " << start << " header.oid: " << header.oid << dendl;
>>>
>>> Then you need to recompile ceph-osd and run it again. The output log
>>> will help!
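>>>
>>> (For completeness, a sketch of the rebuild steps, assuming an autotools
>>> checkout of the ceph source tree as described further down in this thread:)
>>>
>>>   # after adding the dout() line to src/os/GenericObjectMap.cc:
>>>   ./autogen.sh
>>>   ./configure
>>>   cd src && make ceph-osd
>>>   # install the rebuilt ceph-osd, restart the OSD, and reproduce the crash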
>>> On Tue, Aug 19, 2014 at 10:19 PM, Haomai Wang <haomaiwang at gmail.com> wrote:
>>>>
>>>> I feel a little embarrassed: the 1024 rows still look correct to me.
>>>>
>>>> I was wondering if you could give me all your keys via
>>>> "ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_ > keys.log"?
>>>>
>>>> Thanks!
>>>>
>>>> On Tue, Aug 19, 2014 at 4:58 PM, Kenneth Waegeman
>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>
>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>>>>    Date: Tue, 19 Aug 2014 12:28:27 +0800
>>>>>    From: Haomai Wang <haomaiwang at gmail.com>
>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>>>      Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com
>>>>>
>>>>>> On Mon, Aug 18, 2014 at 7:32 PM, Kenneth Waegeman
>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>
>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> ---------
>>>>>>>    Date: Mon, 18 Aug 2014 18:34:11 +0800
>>>>>>>    From: Haomai Wang <haomaiwang at gmail.com>
>>>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>>>      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
>>>>>>>      Cc: Sage Weil <sweil at redhat.com>, ceph-users at lists.ceph.com
>>>>>>>
>>>>>>>> On Mon, Aug 18, 2014 at 5:38 PM, Kenneth Waegeman
>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I tried this after restarting the OSD, but I guess that was not the aim:
>>>>>>>>>
>>>>>>>>>   # ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_ | grep 6adb1100 -A 100
>>>>>>>>>   IO error: lock /var/lib/ceph/osd/ceph-67/current//LOCK: Resource temporarily unavailable
>>>>>>>>>   tools/ceph_kvstore_tool.cc: In function 'StoreTool::StoreTool(const string&)' thread 7f8fecf7d780 time 2014-08-18 11:12:29.551780
>>>>>>>>>   tools/ceph_kvstore_tool.cc: 38: FAILED assert(!db_ptr->open(std::cerr))
>>>>>>>>>   ..
>>>>>>>>>
>>>>>>>>> When I run it after bringing the OSD down, it takes a while, but it
>>>>>>>>> has no output. (When running it without the grep, I get a huge list.)
>>>>>>>>
>>>>>>>> Oh, sorry about that! I made a mistake: the hash value (6adb1100) is
>>>>>>>> stored reversed in leveldb.
>>>>>>>> So grepping for "benchmark_data_ceph001.cubone.os_5560_object789734"
>>>>>>>> should help.
>>>>>>>
>>>>>>> This gives:
>>>>>>>
>>>>>>> [root at ceph003 ~]# ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_ | grep 5560_object789734 -A 100
>>>>>>>
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011BDA6!!3!!benchmark_data_ceph001%ecubone%eos_5560_object789734!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011C027!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1330170!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011C6FD!!3!!benchmark_data_ceph001%ecubone%eos_4919_object227366!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011CB03!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1363631!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011CDF0!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1573957!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011D02C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1019282!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E2B5!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1283563!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E511!!3!!benchmark_data_ceph001%ecubone%eos_4919_object273736!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011E547!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1170628!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011EAAB!!3!!benchmark_data_ceph001%ecubone%eos_4919_object256335!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011F446!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1484196!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0011FC59!!3!!benchmark_data_ceph001%ecubone%eos_5560_object884178!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001203F3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object853746!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001208E3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object36633!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00120B37!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1235337!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001210B6!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1661351!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001210CB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object238126!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012184C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object339943!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00121916!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1047094!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001219C1!!3!!benchmark_data_ceph001%ecubone%eos_31461_object520642!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001222BB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object639565!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001223AA!!3!!benchmark_data_ceph001%ecubone%eos_4919_object231080!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012243C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object858050!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012289C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object241796!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122D28!!3!!benchmark_data_ceph001%ecubone%eos_4919_object7462!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122DFE!!3!!benchmark_data_ceph001%ecubone%eos_5560_object243798!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00122EFC!!3!!benchmark_data_ceph001%ecubone%eos_8961_object109512!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001232D7!!3!!benchmark_data_ceph001%ecubone%eos_31461_object653973!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001234A3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1378169!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123714!!3!!benchmark_data_ceph001%ecubone%eos_5560_object512925!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001237D9!!3!!benchmark_data_ceph001%ecubone%eos_4919_object23289!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123854!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1108852!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123971!!3!!benchmark_data_ceph001%ecubone%eos_5560_object704026!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00123F75!!3!!benchmark_data_ceph001%ecubone%eos_8961_object250441!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124083!!3!!benchmark_data_ceph001%ecubone%eos_31461_object706178!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001240FA!!3!!benchmark_data_ceph001%ecubone%eos_5560_object316952!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012447D!!3!!benchmark_data_ceph001%ecubone%eos_5560_object538734!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001244D9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object789215!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001247CD!!3!!benchmark_data_ceph001%ecubone%eos_8961_object265993!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124897!!3!!benchmark_data_ceph001%ecubone%eos_31461_object610597!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124BE4!!3!!benchmark_data_ceph001%ecubone%eos_31461_object691723!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124C9B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1306135!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00124E1D!!3!!benchmark_data_ceph001%ecubone%eos_5560_object520580!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012534C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object659767!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00125A81!!3!!benchmark_data_ceph001%ecubone%eos_5560_object184060!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00125E77!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1292867!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00126562!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1201410!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00126B34!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1657326!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127383!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1269787!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127396!!3!!benchmark_data_ceph001%ecubone%eos_31461_object500115!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001277F8!!3!!benchmark_data_ceph001%ecubone%eos_31461_object394932!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001279DD!!3!!benchmark_data_ceph001%ecubone%eos_4919_object252963!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127B40!!3!!benchmark_data_ceph001%ecubone%eos_31461_object936811!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00127BAC!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1481773!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012894E!!3!!benchmark_data_ceph001%ecubone%eos_5560_object999885!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00128D05!!3!!benchmark_data_ceph001%ecubone%eos_31461_object943667!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012908A!!3!!benchmark_data_ceph001%ecubone%eos_5560_object212990!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129519!!3!!benchmark_data_ceph001%ecubone%eos_5560_object437596!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129716!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1585330!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129798!!3!!benchmark_data_ceph001%ecubone%eos_5560_object603505!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001299C9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object808800!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129B7A!!3!!benchmark_data_ceph001%ecubone%eos_31461_object23193!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00129B9A!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1158397!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012A932!!3!!benchmark_data_ceph001%ecubone%eos_5560_object542450!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012B77A!!3!!benchmark_data_ceph001%ecubone%eos_8961_object195480!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012BE8C!!3!!benchmark_data_ceph001%ecubone%eos_4919_object312911!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012BF74!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1563783!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012C65C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1123980!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012C6FE!!3!!benchmark_data_ceph001%ecubone%eos_3411_object913!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012CCAD!!3!!benchmark_data_ceph001%ecubone%eos_31461_object400863!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012CDBB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object789667!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012D14B!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1020723!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012D95B!!3!!benchmark_data_ceph001%ecubone%eos_8961_object106293!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012E3C8!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1355526!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012E5B3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1491348!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012F2BB!!3!!benchmark_data_ceph001%ecubone%eos_8961_object338872!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012F374!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1337264!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FBE5!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1512395!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FCE3!!3!!benchmark_data_ceph001%ecubone%eos_8961_object298610!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0012FEB6!!3!!benchmark_data_ceph001%ecubone%eos_4919_object120824!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001301CA!!3!!benchmark_data_ceph001%ecubone%eos_5560_object816326!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00130263!!3!!benchmark_data_ceph001%ecubone%eos_5560_object777163!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00130529!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1413173!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001317D9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object809510!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0013204F!!3!!benchmark_data_ceph001%ecubone%eos_31461_object471416!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132400!!3!!benchmark_data_ceph001%ecubone%eos_5560_object695087!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132A19!!3!!benchmark_data_ceph001%ecubone%eos_31461_object591945!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132BF8!!3!!benchmark_data_ceph001%ecubone%eos_31461_object302000!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00132F5B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1645443!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00133B8B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object761911!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!0013433E!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1467727!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134446!!3!!benchmark_data_ceph001%ecubone%eos_31461_object791960!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134678!!3!!benchmark_data_ceph001%ecubone%eos_31461_object677078!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00134A96!!3!!benchmark_data_ceph001%ecubone%eos_31461_object254923!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!001355D0!!3!!benchmark_data_ceph001%ecubone%eos_31461_object321528!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135690!!3!!benchmark_data_ceph001%ecubone%eos_4919_object36935!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135B62!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1228272!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135C72!!3!!benchmark_data_ceph001%ecubone%eos_4812_object2180!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00135DEE!!3!!benchmark_data_ceph001%ecubone%eos_5560_object425705!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00136366!!3!!benchmark_data_ceph001%ecubone%eos_5560_object141569!head
>>>>>>> _GHOBJTOSEQ_:3%e0s0_head!00136371!!3!!benchmark_data_ceph001%ecubone%eos_5560_object564213!head
>>>>>>
>>>>>> The 100 rows look correct to me. I found that the minimum number of
>>>>>> objects a listing returns is 1024.
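>>>>>>
>>>>>> (As an aside: in a full key dump from ceph-kvstore-tool, object names
>>>>>> sharing one reversed hash -- the collision scenario discussed further
>>>>>> down -- can be spotted with a one-liner; a sketch, assuming the dump was
>>>>>> saved to keys.log and the hash is the second '!'-separated field, as in
>>>>>> the keys above:)
>>>>>>
>>>>>>   awk -F'!' '{print $2}' keys.log | sort | uniq -d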
>>>>>> Could you please run
>>>>>> "ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_ | grep 6adb1100 -A 1024"?
>>>>>
>>>>> I got the output; it is in the attachment.
>>>>>
>>>>>>>>> Or should I have run this immediately after the OSD crashed (because
>>>>>>>>> it may have rebalanced? I had already restarted the cluster)?
>>>>>>>>>
>>>>>>>>> I don't know if it is related, but before I could do all that, I had
>>>>>>>>> to fix something else: a monitor ran out of disk space, using 8GB for
>>>>>>>>> its store.db folder (lots of sst files). The other monitors are also
>>>>>>>>> near that level. I never had that problem on previous setups. I
>>>>>>>>> recreated the monitor and now it uses 3.8GB.
>>>>>>>>
>>>>>>>> There is some duplicate data that needs to be compacted.
>>>>>>>>
>>>>>>>> Another idea: maybe you can align KeyValueStore's strip size with the
>>>>>>>> EC stripe size.
>>>>>>>
>>>>>>> How can I do that? Is there some documentation about it?
>>>>>>
>>>>>> ceph --show-config | grep keyvaluestore
>>>>>>
>>>>>> debug_keyvaluestore = 0/0
>>>>>> keyvaluestore_queue_max_ops = 50
>>>>>> keyvaluestore_queue_max_bytes = 104857600
>>>>>> keyvaluestore_debug_check_backend = false
>>>>>> keyvaluestore_op_threads = 2
>>>>>> keyvaluestore_op_thread_timeout = 60
>>>>>> keyvaluestore_op_thread_suicide_timeout = 180
>>>>>> keyvaluestore_default_strip_size = 4096
>>>>>> keyvaluestore_max_expected_write_size = 16777216
>>>>>> keyvaluestore_header_cache_size = 4096
>>>>>> keyvaluestore_backend = leveldb
>>>>>>
>>>>>> keyvaluestore_default_strip_size is the one you want.
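>>>>>>
>>>>>> (A sketch of what aligning could look like in ceph.conf; the arithmetic
>>>>>> is an assumption: if the pool's stripe_width of 4096 is spread over the
>>>>>> k=8 data chunks, each shard is written in 512-byte pieces, so one option
>>>>>> is to match the strip size to that chunk size. It would presumably only
>>>>>> affect newly created objects, so it should be set before writing data:)
>>>>>>
>>>>>>   [osd]
>>>>>>   keyvaluestore_default_strip_size = 512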
>>>>>>>>
>>>>>>>> I haven't thought about it deeply; maybe I will try it later.
>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Kenneth
>>>>>>>>>
>>>>>>>>> ----- Message from Sage Weil <sweil at redhat.com> ---------
>>>>>>>>>    Date: Fri, 15 Aug 2014 06:10:34 -0700 (PDT)
>>>>>>>>>    From: Sage Weil <sweil at redhat.com>
>>>>>>>>> Subject: Re: ceph cluster inconsistency?
>>>>>>>>>      To: Haomai Wang <haomaiwang at gmail.com>
>>>>>>>>>      Cc: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>, ceph-users at lists.ceph.com
>>>>>>>>>
>>>>>>>>>> On Fri, 15 Aug 2014, Haomai Wang wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Kenneth,
>>>>>>>>>>>
>>>>>>>>>>> I don't find valuable info in your logs; they lack the necessary
>>>>>>>>>>> debug output from around the crashing code.
>>>>>>>>>>>
>>>>>>>>>>> But I scanned the encode/decode implementation in GenericObjectMap
>>>>>>>>>>> and found something bad.
>>>>>>>>>>>
>>>>>>>>>>> For example, suppose two oids have the same hash and their names are:
>>>>>>>>>>> A: "rb.data.123"
>>>>>>>>>>> B: "rb-123"
>>>>>>>>>>>
>>>>>>>>>>> At the ghobject_t comparison level, A < B. But GenericObjectMap
>>>>>>>>>>> encodes "." as "%e", so the keys in the DB are:
>>>>>>>>>>> A: _GHOBJTOSEQ_:blah!51615000!!none!!rb%edata%e123!head
>>>>>>>>>>> B: _GHOBJTOSEQ_:blah!51615000!!none!!rb-123!head
>>>>>>>>>>>
>>>>>>>>>>> and there A > B.
>>>>>>>>>>>
>>>>>>>>>>> It seems the escape function is useless and should be disabled.
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure whether Kenneth's problem is hitting this bug, because
>>>>>>>>>>> this scenario only occurs when the object set is very large, so that
>>>>>>>>>>> two objects end up with the same hash value.
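>>>>>>>>>>>
>>>>>>>>>>> (The reordering is easy to see with a bytewise sort, since '%'
>>>>>>>>>>> (0x25) sorts before '-' (0x2d), which sorts before '.' (0x2e).
>>>>>>>>>>> A sketch: leveldb compares the escaped keys bytewise, so however
>>>>>>>>>>> ghobject_t itself orders the two names, escaping flips their
>>>>>>>>>>> relative order:)
>>>>>>>>>>>
>>>>>>>>>>>   $ printf '%s\n' 'rb.data.123' 'rb-123' | LC_ALL=C sort
>>>>>>>>>>>   rb-123
>>>>>>>>>>>   rb.data.123
>>>>>>>>>>>   $ printf '%s\n' 'rb%edata%e123' 'rb-123' | LC_ALL=C sort
>>>>>>>>>>>   rb%edata%e123
>>>>>>>>>>>   rb-123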
>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Aug 14, 2014 at 4:41 PM, Kenneth Waegeman >>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have: >>>>>>>>>>>>>> osd_objectstore = keyvaluestore-dev >>>>>>>>>>>>>> >>>>>>>>>>>>>> in the global section of my ceph.conf >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> [root at ceph002 ~]# ceph osd erasure-code-profile get profile11 >>>>>>>>>>>>>> directory=/usr/lib64/ceph/erasure-code >>>>>>>>>>>>>> k=8 >>>>>>>>>>>>>> m=3 >>>>>>>>>>>>>> plugin=jerasure >>>>>>>>>>>>>> ruleset-failure-domain=osd >>>>>>>>>>>>>> technique=reed_sol_van >>>>>>>>>>>>>> >>>>>>>>>>>>>> the ecdata pool has this as profile >>>>>>>>>>>>>> >>>>>>>>>>>>>> pool 3 'ecdata' erasure size 11 min_size 8 crush_ruleset 2 >>>>>>>>>>>>>> object_hash >>>>>>>>>>>>>> rjenkins pg_num 128 pgp_num 128 last_change 161 flags >>>>>>>>>>>>>> hashpspool >>>>>>>>>>>>>> stripe_width 4096 >>>>>>>>>>>>>> >>>>>>>>>>>>>> ECrule in crushmap >>>>>>>>>>>>>> >>>>>>>>>>>>>> rule ecdata { >>>>>>>>>>>>>> ruleset 2 >>>>>>>>>>>>>> type erasure >>>>>>>>>>>>>> min_size 3 >>>>>>>>>>>>>> max_size 20 >>>>>>>>>>>>>> step set_chooseleaf_tries 5 >>>>>>>>>>>>>> step take default-ec >>>>>>>>>>>>>> step choose indep 0 type osd >>>>>>>>>>>>>> step emit >>>>>>>>>>>>>> } >>>>>>>>>>>>>> root default-ec { >>>>>>>>>>>>>> id -8 # do not change unnecessarily >>>>>>>>>>>>>> # weight 140.616 >>>>>>>>>>>>>> alg straw >>>>>>>>>>>>>> hash 0 # rjenkins1 >>>>>>>>>>>>>> item ceph001-ec weight 46.872 >>>>>>>>>>>>>> item ceph002-ec weight 46.872 >>>>>>>>>>>>>> item ceph003-ec weight 46.872 >>>>>>>>>>>>>> ... >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers! >>>>>>>>>>>>>> Kenneth >>>>>>>>>>>>>> >>>>>>>>>>>>>> ----- Message from Haomai Wang <haomaiwang at gmail.com> --------- >>>>>>>>>>>>>> Date: Thu, 14 Aug 2014 10:07:50 +0800 >>>>>>>>>>>>>> From: Haomai Wang <haomaiwang at gmail.com> >>>>>>>>>>>>>> Subject: Re: ceph cluster inconsistency? >>>>>>>>>>>>>> To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be> >>>>>>>>>>>>>> Cc: ceph-users <ceph-users at lists.ceph.com> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Kenneth, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Could you give your configuration related to EC and >>>>>>>>>>>>>>> KeyValueStore? 
>>>>>>>>>>>>>>> Not sure whether it's a bug in KeyValueStore.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 12:06 AM, Kenneth Waegeman
>>>>>>>>>>>>>>> <Kenneth.Waegeman at ugent.be> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I was doing some tests with rados bench on an erasure-coded pool (using the keyvaluestore-dev objectstore) on 0.83, and I see some strange things:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [root at ceph001 ~]# ceph status
>>>>>>>>>>>>>>>>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>>>>>>>>>>>>>>>      health HEALTH_WARN too few pgs per osd (4 < min 20)
>>>>>>>>>>>>>>>>      monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election epoch 6, quorum 0,1,2 ceph001,ceph002,ceph003
>>>>>>>>>>>>>>>>      mdsmap e116: 1/1/1 up {0=ceph001.cubone.os=up:active}, 2 up:standby
>>>>>>>>>>>>>>>>      osdmap e292: 78 osds: 78 up, 78 in
>>>>>>>>>>>>>>>>       pgmap v48873: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>>>>>>>>>>>>>>>>             1381 GB used, 129 TB / 131 TB avail
>>>>>>>>>>>>>>>>                  320 active+clean
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There is around 15 TB of data, but only 1.3 TB used.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is also visible in rados:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [root at ceph001 ~]# rados df
>>>>>>>>>>>>>>>> pool name    category      KB        objects  clones  degraded  unfound  rd  rd KB  wr       wr KB
>>>>>>>>>>>>>>>> data         -                    0        0       0         0        0   0      0        0            0
>>>>>>>>>>>>>>>> ecdata       -          16113451009  3933959       0         0        0   1      1  3935632  16116850711
>>>>>>>>>>>>>>>> metadata     -                    2       20       0         0        0  33     36       21            8
>>>>>>>>>>>>>>>> rbd          -                    0        0       0         0        0   0      0        0            0
>>>>>>>>>>>>>>>>   total used        1448266016  3933979
>>>>>>>>>>>>>>>>   total avail     139400181016
>>>>>>>>>>>>>>>>   total space     140848447032
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another (related?) thing: if I do rados -p ecdata ls, I trigger OSD shutdowns (each time).
>>>>>>>>>>>>>>>> I get a list followed by an error:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_8961_object243839
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object801983
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object856489
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_8961_object202232
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_4919_object33199
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object807797
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_4919_object74729
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object1264121
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object1318513
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object1202111
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object939107
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object729682
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object122915
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object76521
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object113261
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_31461_object575079
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object671042
>>>>>>>>>>>>>>>> benchmark_data_ceph001.cubone.os_5560_object381146
>>>>>>>>>>>>>>>> 2014-08-13 17:57:48.736150 7f65047b5700  0 -- 10.141.8.180:0/1023295 >> 10.141.8.182:6839/4471 pipe(0x7f64fc019b20 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f64fc019db0).fault
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And I can see this in the log files:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    -25> 2014-08-13 17:52:56.323908 7f8a97fa4700  1 -- 10.143.8.182:6827/64670 <== osd.57 10.141.8.182:0/15796 51 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0 (3227325175 0 0) 0xf475940 con 0xee89fa0
>>>>>>>>>>>>>>>>    -24> 2014-08-13 17:52:56.323938 7f8a97fa4700  1 -- 10.143.8.182:6827/64670 --> 10.141.8.182:0/15796 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf815b00 con 0xee89fa0
>>>>>>>>>>>>>>>>    -23> 2014-08-13 17:52:56.324078 7f8a997a7700  1 -- 10.141.8.182:6840/64670 <== osd.57 10.141.8.182:0/15796 51 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0 (3227325175 0 0) 0xf132bc0 con 0xee8a680
>>>>>>>>>>>>>>>>    -22> 2014-08-13 17:52:56.324111 7f8a997a7700  1 -- 10.141.8.182:6840/64670 --> 10.141.8.182:0/15796 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf811a40 con 0xee8a680
>>>>>>>>>>>>>>>>    -21> 2014-08-13 17:52:56.584461 7f8a997a7700  1 -- 10.141.8.182:6840/64670 <== osd.29 10.143.8.181:0/12142 47 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0 (3355887204 0 0) 0xf655940 con 0xee88b00
>>>>>>>>>>>>>>>>    -20> 2014-08-13 17:52:56.584486 7f8a997a7700  1 -- 10.141.8.182:6840/64670 --> 10.143.8.181:0/12142 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf132bc0 con 0xee88b00
>>>>>>>>>>>>>>>>    -19> 2014-08-13 17:52:56.584498 7f8a97fa4700  1 -- 10.143.8.182:6827/64670 <== osd.29 10.143.8.181:0/12142 47 ==== osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0 (3355887204 0 0) 0xf20e040 con 0xee886e0
>>>>>>>>>>>>>>>>    -18> 2014-08-13 17:52:56.584526 7f8a97fa4700  1 -- 10.143.8.182:6827/64670 --> 10.143.8.181:0/12142 -- osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf475940 con 0xee886e0
>>>>>>>>>>>>>>>>    -17> 2014-08-13 17:52:56.594448 7f8a798c7700  1 -- 10.141.8.182:6839/64670 >> :/0 pipe(0xec15f00 sd=74 :6839 s=0 pgs=0 cs=0 l=0 c=0xee856a0).accept sd=74 10.141.8.180:47641/0
>>>>>>>>>>>>>>>>    -16> 2014-08-13 17:52:56.594921 7f8a798c7700  1 -- 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 1 ==== osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220) v4 ==== 151+0+39 (1972163119 0 4174233976) 0xf3bca40 con 0xee856a0
>>>>>>>>>>>>>>>>    -15> 2014-08-13 17:52:56.594957 7f8a798c7700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594874, event: header_read, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>    -14> 2014-08-13 17:52:56.594970 7f8a798c7700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594880, event: throttled, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>    -13> 2014-08-13 17:52:56.594978 7f8a798c7700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.594917, event: all_read, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>    -12> 2014-08-13 17:52:56.594986 7f8a798c7700  5 -- op tracker -- , seq: 299, time: 0.000000, event: dispatched, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>    -11> 2014-08-13 17:52:56.595127 7f8a90795700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.595104, event: reached_pg, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>    -10> 2014-08-13 17:52:56.595159 7f8a90795700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.595153, event: started, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>     -9> 2014-08-13 17:52:56.602179 7f8a90795700  1 -- 10.141.8.182:6839/64670 --> 10.141.8.180:0/1018433 -- osd_op_reply(1 [pgls start_epoch 0] v164'30654 uv30654 ondisk = 0) v6 -- ?+0 0xec16180 con 0xee856a0
>>>>>>>>>>>>>>>>     -8> 2014-08-13 17:52:56.602211 7f8a90795700  5 -- op tracker -- , seq: 299, time: 2014-08-13 17:52:56.602205, event: done, op: osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>     -7> 2014-08-13 17:52:56.614839 7f8a798c7700  1 -- 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 2 ==== osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220) v4 ==== 151+0+89 (3460833343 0 2600845095) 0xf3bcec0 con 0xee856a0
>>>>>>>>>>>>>>>>     -6> 2014-08-13 17:52:56.614864 7f8a798c7700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614789, event: header_read, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>     -5> 2014-08-13 17:52:56.614874 7f8a798c7700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614792, event: throttled, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>     -4> 2014-08-13 17:52:56.614884 7f8a798c7700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614835, event: all_read, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>     -3> 2014-08-13 17:52:56.614891 7f8a798c7700  5 -- op tracker -- , seq: 300, time: 0.000000, event: dispatched, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>     -2> 2014-08-13 17:52:56.614972 7f8a92f9a700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614958, event: reached_pg, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>     -1> 2014-08-13 17:52:56.614993 7f8a92f9a700  5 -- op tracker -- , seq: 300, time: 2014-08-13 17:52:56.614986, event: started, op: osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 ack+read+known_if_redirected e220)
>>>>>>>>>>>>>>>>      0> 2014-08-13 17:52:56.617087 7f8a92f9a700 -1 os/GenericObjectMap.cc: In function 'int GenericObjectMap::list_objects(const coll_t&, ghobject_t, int, std::vector<ghobject_t>*, ghobject_t*)' thread 7f8a92f9a700 time 2014-08-13 17:52:56.615073
>>>>>>>>>>>>>>>> os/GenericObjectMap.cc: 1118: FAILED assert(start <= header.oid)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>>>>>>>>>>>>  1: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x474) [0x98f774]
>>>>>>>>>>>>>>>>  2: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int, int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>  3: (PGBackend::objects_list_partial(hobject_t const&, int, int, snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*, hobject_t*)+0x1c9) [0x862de9]
>>>>>>>>>>>>>>>>  4: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) [0x7f67f5]
>>>>>>>>>>>>>>>>  5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
>>>>>>>>>>>>>>>>  6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>  7: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) [0x62bf8d]
>>>>>>>>>>>>>>>>  8: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>  9: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8cd) [0xa776fd]
>>>>>>>>>>>>>>>>  10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
>>>>>>>>>>>>>>>>  11: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>  12: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>>>>>>>>>>>>  1: /usr/bin/ceph-osd() [0x99b466]
>>>>>>>>>>>>>>>>  2: (()+0xf130) [0x7f8aac727130]
>>>>>>>>>>>>>>>>  3: (gsignal()+0x39) [0x7f8aab0d5989]
>>>>>>>>>>>>>>>>  4: (abort()+0x148) [0x7f8aab0d7098]
>>>>>>>>>>>>>>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f8aab9e89d5]
>>>>>>>>>>>>>>>>  6: (()+0x5e946) [0x7f8aab9e6946]
>>>>>>>>>>>>>>>>  7: (()+0x5e973) [0x7f8aab9e6973]
>>>>>>>>>>>>>>>>  8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>>>>>>>>>>>>>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1ef) [0xa8805f]
>>>>>>>>>>>>>>>>  10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x474) [0x98f774]
>>>>>>>>>>>>>>>>  11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int, int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>  12: (PGBackend::objects_list_partial(hobject_t const&, int, int, snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*, hobject_t*)+0x1c9) [0x862de9]
>>>>>>>>>>>>>>>>  13: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) [0x7f67f5]
>>>>>>>>>>>>>>>>  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
>>>>>>>>>>>>>>>>  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) [0x62bf8d]
>>>>>>>>>>>>>>>>  17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8cd) [0xa776fd]
>>>>>>>>>>>>>>>>  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
>>>>>>>>>>>>>>>>  20: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>  21: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --- begin dump of recent events ---
>>>>>>>>>>>>>>>>      0> 2014-08-13 17:52:56.714214 7f8a92f9a700 -1 *** Caught signal (Aborted) **
>>>>>>>>>>>>>>>>  in thread 7f8a92f9a700
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
>>>>>>>>>>>>>>>>  1: /usr/bin/ceph-osd() [0x99b466]
>>>>>>>>>>>>>>>>  2: (()+0xf130) [0x7f8aac727130]
>>>>>>>>>>>>>>>>  3: (gsignal()+0x39) [0x7f8aab0d5989]
>>>>>>>>>>>>>>>>  4: (abort()+0x148) [0x7f8aab0d7098]
>>>>>>>>>>>>>>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f8aab9e89d5]
>>>>>>>>>>>>>>>>  6: (()+0x5e946) [0x7f8aab9e6946]
>>>>>>>>>>>>>>>>  7: (()+0x5e973) [0x7f8aab9e6973]
>>>>>>>>>>>>>>>>  8: (()+0x5eb9f) [0x7f8aab9e6b9f]
>>>>>>>>>>>>>>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1ef) [0xa8805f]
>>>>>>>>>>>>>>>>  10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x474) [0x98f774]
>>>>>>>>>>>>>>>>  11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int, int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x274) [0x8c5b54]
>>>>>>>>>>>>>>>>  12: (PGBackend::objects_list_partial(hobject_t const&, int, int, snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*, hobject_t*)+0x1c9) [0x862de9]
>>>>>>>>>>>>>>>>  13: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) [0x7f67f5]
>>>>>>>>>>>>>>>>  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
>>>>>>>>>>>>>>>>  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
>>>>>>>>>>>>>>>>  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) [0x62bf8d]
>>>>>>>>>>>>>>>>  17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
>>>>>>>>>>>>>>>>  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8cd) [0xa776fd]
>>>>>>>>>>>>>>>>  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
>>>>>>>>>>>>>>>>  20: (()+0x7df3) [0x7f8aac71fdf3]
>>>>>>>>>>>>>>>>  21: (clone()+0x6d) [0x7f8aab1963dd]
>>>>>>>>>>>>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I guess this has something to do with using the dev KeyValueStore?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenneth
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>>>>>>> ceph-users at lists.ceph.com
>>>>>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Wheat

----- End message from Haomai Wang <haomaiwang at gmail.com> -----

--

Met vriendelijke groeten,
Kenneth Waegeman