The files are attached.

When I upgraded:

ceph-deploy install --stable firefly servers...

then on each server:

service ceph restart mon
service ceph restart osd
service ceph restart mds

I upgraded from emperor to firefly. After repair, remap, replace, etc., I still have some PGs stuck in the peering state. I thought that trying version 0.82 might solve my problem (that was my mistake). So I upgraded from firefly to 0.82 with:

ceph-deploy install --testing servers...

Now all daemons are at version 0.82. I have 3 mons, 36 OSDs and 3 MDS.

Pierre

PS: I also find "inc\uosdmap.13258__0_469271DE__none" in each meta directory.

On 03/07/2014 00:10, Samuel Just wrote:
> Also, what version did you upgrade from, and how did you upgrade?
> -Sam
>
> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just <sam.just at inktank.com> wrote:
>> Ok, in current/meta on osd 20 and osd 23, please attach all files matching
>>
>> ^osdmap.13258.*
>>
>> There should be one such file on each osd. (should look something like
>> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
>> you'll want to use find).
>>
>> What version of ceph is running on your mons? How many mons do you have?
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
>> <pierre.blondeau at unicaen.fr> wrote:
>>> Hi,
>>>
>>> Done; the log files are available here:
>>> https://blondeau.users.greyc.fr/cephlog/debug20/
>>>
>>> The OSD log files are really big, around 80 MB each.
>>>
>>> After starting osd.20, some other OSDs crashed: the number of OSDs up
>>> dropped from 31 to 16. I notice that after this, the number of
>>> down+peering PGs decreased from 367 to 248. Is that "normal"? Maybe it
>>> is temporary, while the cluster verifies all the PGs?
>>>
>>> Regards
>>> Pierre
>>>
>>> On 02/07/2014 19:16, Samuel Just wrote:
>>>
>>>> You should add
>>>>
>>>> debug osd = 20
>>>> debug filestore = 20
>>>> debug ms = 1
>>>>
>>>> to the [osd] section of the ceph.conf and restart the osds.
>>>> I'd like all three logs if possible.
>>>>
>>>> Thanks
>>>> -Sam
>>>>
>>>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>
>>>>> Yes, but how do I do that?
>>>>>
>>>>> With a command like this?
>>>>>
>>>>> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20
>>>>> --debug-ms 1'
>>>>>
>>>>> Or by modifying /etc/ceph/ceph.conf? That file is really sparse,
>>>>> because I use udev detection.
>>>>>
>>>>> Once I have made these changes, do you want all three log files or
>>>>> only osd.20's?
>>>>>
>>>>> Thank you so much for the help.
>>>>>
>>>>> Regards
>>>>> Pierre
>>>>>
>>>>> On 01/07/2014 23:51, Samuel Just wrote:
>>>>>
>>>>>> Can you reproduce with
>>>>>> debug osd = 20
>>>>>> debug filestore = 20
>>>>>> debug ms = 1
>>>>>> ?
>>>>>> -Sam
>>>>>>
>>>>>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>>>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I attach:
>>>>>>> - osd.20, one of the OSDs that I identified as making other OSDs crash.
>>>>>>> - osd.23, one of the OSDs which crashes when I start osd.20.
>>>>>>> - mds, one of my MDS.
>>>>>>>
>>>>>>> I cut the log files because they are too big. Everything is here:
>>>>>>> https://blondeau.users.greyc.fr/cephlog/
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> On 30/06/2014 17:35, Gregory Farnum wrote:
>>>>>>>
>>>>>>>> What's the backtrace from the crashing OSDs?
>>>>>>>>
>>>>>>>> Keep in mind that as a dev release, it's generally best not to upgrade
>>>>>>>> to unnamed versions like 0.82 (but it's probably too late to go back
>>>>>>>> now).
>>>>>>>
>>>>>>> I will remember that next time ;)
>>>>>>>
>>>>>>>> -Greg
>>>>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>>>>
>>>>>>>> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
>>>>>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> After the upgrade to firefly, I have some PGs in the peering state.
>>>>>>>>> I saw that 0.82 was out, so I tried to upgrade to solve my problem.
>>>>>>>>>
>>>>>>>>> My three MDS crashed, and some OSDs trigger a chain reaction that
>>>>>>>>> kills other OSDs.
>>>>>>>>> I think my MDS will not start because their metadata are on the OSDs.
>>>>>>>>>
>>>>>>>>> I have 36 OSDs on three servers, and I identified 5 OSDs which make
>>>>>>>>> the others crash. If I do not start them, the cluster goes into
>>>>>>>>> recovery with 31 OSDs, but I have 378 PGs in down+peering state.
>>>>>>>>>
>>>>>>>>> What can I do? Would you like more information (OS, crash logs,
>>>>>>>>> etc.)?
>>>>>>>>>
>>>>>>>>> Regards
>>>
>>>
>>> --
>>> ----------------------------------------------
>>> Pierre BLONDEAU
>>> Administrateur Systèmes & réseaux
>>> Université de Caen
>>> Laboratoire GREYC, Département d'informatique
>>>
>>> tel : 02 31 56 75 42
>>> bureau : Campus 2, Science 3, 406
>>> ----------------------------------------------

-------------- next part --------------
A non-text attachment was scrubbed...
Name: osd-20_osdmap.13258__0_4E62BB79__none
Type: application/octet-stream
Size: 25423 bytes
Desc: not available
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140703/6187a83b/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: osd-23_osdmap.13258__0_4E62BB79__none
Type: application/octet-stream
Size: 25423 bytes
Desc: not available
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140703/6187a83b/attachment-0001.obj>
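[Editor's note] For readers following along, Sam's request ("find all files matching ^osdmap.13258.* in current/meta; they are probably hashed into a subdirectory, use find") can be sketched like this. The directory layout below is simulated with mktemp so the sketch is self-contained; on a real OSD you would search /var/lib/ceph/osd/ceph-<id>/current/meta instead (path assumes the default data location, which this thread does not confirm).

```shell
# Simulate an OSD meta directory: osdmap files are hashed into
# subdirectories (DIR_*), which is why a plain ls will not find them.
meta=$(mktemp -d)
mkdir -p "$meta/DIR_9"
touch "$meta/DIR_9/osdmap.13258__0_4E62BB79__none"

# Recursively locate the full osdmap file for epoch 13258.
# On a real OSD: find /var/lib/ceph/osd/ceph-20/current/meta -name 'osdmap.13258*'
find "$meta" -name 'osdmap.13258*'
```

The same search on each OSD should print exactly one path, which is the file to attach.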
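[Editor's note] The debug settings Sam suggests, written out as the ceph.conf fragment Pierre asked about (goes in the [osd] section, then restart the OSDs; alternatively inject them at runtime with ceph tell osd.N injectargs as discussed in the thread):

```ini
[osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```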