Hi,

Is there any chance to restore my data?

Regards,
Pierre

On 07/07/2014 15:42, Pierre BLONDEAU wrote:
> No chance to get those logs, and even less in debug mode: I made that
> change 3 weeks ago.
>
> I put all my logs here, in case they can help:
> https://blondeau.users.greyc.fr/cephlog/all/
>
> Do I have a chance to recover my +/- 20 TB of data?
>
> Regards
>
> On 03/07/2014 21:48, Joao Luis wrote:
>> Do those logs have a higher debugging level than the default? If not,
>> never mind, as they will not have enough information. If they do,
>> however, we'd be interested in the portion around the moment you set
>> the tunables. Say, before the upgrade and a bit after you set the
>> tunable. If you want to be finer grained, then ideally it would be the
>> moment where those maps were created, but you'd have to grep the logs
>> for that.
>>
>> Or drop the logs somewhere and I'll take a look.
>>
>> -Joao
>>
>> On Jul 3, 2014 5:48 PM, "Pierre BLONDEAU" <pierre.blondeau at unicaen.fr> wrote:
>>
>>   On 03/07/2014 13:49, Joao Eduardo Luis wrote:
>>
>>     On 07/03/2014 12:15 AM, Pierre BLONDEAU wrote:
>>
>>       On 03/07/2014 00:55, Samuel Just wrote:
>>
>>         Ah,
>>
>>         ~/logs $ for i in 20 23; do ../ceph/src/osdmaptool --export-crush /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i > /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
>>         ../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0___4E62BB79__none'
>>         ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
>>         ../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0___4E62BB79__none'
>>         ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
>>         6d5
>>         < tunable chooseleaf_vary_r 1
>>
>>         Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
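For anyone hitting a similar situation, Sam's check above boils down to two steps: checksum the same osdmap epoch as stored by each OSD, then decompile and diff the embedded CRUSH maps. The sketch below mocks the decompiled output with a here-doc so it runs anywhere; on a real cluster the inputs would come from `osdmaptool --export-crush` plus `crushtool -d` run against files copied out of each OSD's current/meta directory, and all paths and tunable values here are illustrative assumptions, not the actual maps from this thread:

```shell
# Mock two decompiled CRUSH maps that differ only in one tunable line,
# mirroring the "6d5 / < tunable chooseleaf_vary_r 1" diff above.
# (Real input: osdmaptool --export-crush /tmp/crush20 <osdmap-file>
#              crushtool -d /tmp/crush20 > /tmp/crush20.d)
cat > /tmp/crush20.d <<'EOF'
tunable choose_local_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
EOF
grep -v 'chooseleaf_vary_r' /tmp/crush20.d > /tmp/crush23.d

# Cheap first pass: differing checksums for the same epoch already prove
# the OSDs disagree (on a real node, run md5sum on the raw osdmap files).
md5sum /tmp/crush20.d /tmp/crush23.d

# Then diff the decompiled text to see *what* diverged; diff exits
# non-zero when the files differ, so flag that case explicitly.
diff /tmp/crush20.d /tmp/crush23.d || echo "CRUSH maps diverge"
```

The point of decompiling rather than just checksumming is exactly what shows up in the thread: the diff pinpoints the single divergent line instead of only proving divergence.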
>>     The only thing that comes to mind that could cause this is if we
>>     changed the leader's in-memory map, proposed it, it failed, and
>>     only the leader somehow got to write the map to disk. This
>>     happened once before on a totally different issue (although I
>>     can't pinpoint right now which).
>>
>>     In such a scenario, the leader would serve the incorrect osdmap
>>     to whoever asked it for osdmaps, while the remaining quorum would
>>     serve the correct osdmaps to all the others. This could cause
>>     this divergence. Or it could be something else.
>>
>>     Are there logs for the monitors for the timeframe this may have
>>     happened in?
>>
>>   Exactly which timeframe do you want? I have 7 days of logs, so I
>>   should have information about the upgrade from firefly to 0.82.
>>   Which mon's logs do you want? All three?
>>
>>   Regards
>>
>>     -Joao
>>
>>         Pierre: do you recall how and when that got set?
>>
>>       I am not sure I understand, but if I remember correctly, after
>>       the update to firefly I was in the state "HEALTH_WARN crush map
>>       has legacy tunables" and I saw "feature set mismatch" in the
>>       logs.
>>
>>       So, if I remember correctly, I ran "ceph osd crush tunables
>>       optimal" for the crush map problem, and I updated my client and
>>       server kernels to 3.16rc.
>>
>>       Could that be it?
>>
>>       Pierre
>>
>>         -Sam
>>
>>         On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just <sam.just at inktank.com> wrote:
>>
>>           Yeah, divergent osdmaps:
>>           555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0___4E62BB79__none
>>           6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0___4E62BB79__none
>>
>>           Joao: thoughts?
>>           -Sam
>>
>>           On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>
>>             The files.
>>
>>             When I upgraded:
>>             ceph-deploy install --stable firefly servers...
>>             On each server: service ceph restart mon
>>             On each server: service ceph restart osd
>>             On each server: service ceph restart mds
>>
>>             I upgraded from emperor to firefly. After repair, remap,
>>             replace, etc., I had some PGs stuck in the peering state.
>>
>>             I thought, why not try version 0.82, it might solve my
>>             problem (that was my mistake). So I upgraded from firefly
>>             to 0.82 with:
>>             ceph-deploy install --testing servers...
>>
>>             Now all the daemons are at version 0.82.
>>             I have 3 mons, 36 OSDs and 3 MDSs.
>>
>>             Pierre
>>
>>             PS: I also find "inc\uosdmap.13258__0___469271DE__none"
>>             in each meta directory.
>>
>>             On 03/07/2014 00:10, Samuel Just wrote:
>>
>>               Also, what version did you upgrade from, and how did
>>               you upgrade?
>>               -Sam
>>
>>               On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just <sam.just at inktank.com> wrote:
>>
>>                 Ok, in current/meta on osd 20 and osd 23, please
>>                 attach all files matching
>>
>>                 ^osdmap.13258.*
>>
>>                 There should be one such file on each osd. (It should
>>                 look something like osdmap.6__0_FD6E4C01__none,
>>                 probably hashed into a subdirectory; you'll want to
>>                 use find.)
>>
>>                 What version of ceph is running on your mons? How
>>                 many mons do you have?
>>                 -Sam
>>
>>                 On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>
>>                   Hi,
>>
>>                   I did it; the log files are available here:
>>                   https://blondeau.users.greyc.fr/cephlog/debug20/
>>
>>                   The OSDs' log files are really big, +/- 80 MB each.
>>
>>                   After starting osd.20, some other OSDs crashed: the
>>                   number of OSDs up went from 31 to 16. I noticed
>>                   that after this, the number of down+peering PGs
>>                   decreased from 367 to 248. Is that "normal"? Maybe
>>                   it's temporary, for the time the cluster needs to
>>                   verify all the PGs?
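Sam's pointer about `find` is worth spelling out: the filestore hashes objects into nested `DIR_*` subdirectories, so the osdmap files will not necessarily sit at the top of current/meta. Here is a runnable sketch against a mocked directory layout; the real path would be something like /var/lib/ceph/osd/ceph-20/current/meta, and the hash directories and file name below are made up for illustration:

```shell
# Stand-in for an OSD's current/meta directory with hashed subdirectories.
meta=/tmp/mock-osd-20/current/meta
mkdir -p "$meta/DIR_9/DIR_7"
: > "$meta/DIR_9/DIR_7/osdmap.13258__0_4E62BB79__none"

# Locate every stored copy of osdmap epoch 13258, wherever it was hashed.
find "$meta" -type f -name 'osdmap.13258*'
```

A plain `ls` of current/meta would miss the file entirely, which is why the thread recommends `find`.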
>>                   Regards
>>                   Pierre
>>
>>                   On 02/07/2014 19:16, Samuel Just wrote:
>>
>>                     You should add
>>
>>                     debug osd = 20
>>                     debug filestore = 20
>>                     debug ms = 1
>>
>>                     to the [osd] section of the ceph.conf and restart
>>                     the osds. I'd like all three logs if possible.
>>
>>                     Thanks
>>                     -Sam
>>
>>                     On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>
>>                       Yes, but how do I do that?
>>
>>                       With a command like this?
>>                       ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'
>>
>>                       Or by modifying /etc/ceph/ceph.conf? This file
>>                       is really sparse, because I use udev detection.
>>
>>                       Once I have made these changes, do you want all
>>                       three log files or only osd.20's?
>>
>>                       Thank you so much for the help.
>>
>>                       Regards
>>                       Pierre
>>
>>                       On 01/07/2014 23:51, Samuel Just wrote:
>>
>>                         Can you reproduce with
>>                         debug osd = 20
>>                         debug filestore = 20
>>                         debug ms = 1
>>                         ?
>>                         -Sam
>>
>>                         On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>
>>                           Hi,
>>
>>                           I attach:
>>                           - osd.20, one of the OSDs that I identified
>>                             as making other OSDs crash.
>>                           - osd.23, one of the OSDs which crashes
>>                             when I start osd.20.
>>                           - mds, one of my MDSs.
>>
>>                           I cut the log files because they are too
>>                           big, but everything is here:
>>                           https://blondeau.users.greyc.fr/cephlog/
>>
>>                           Regards
>>
>>                           On 30/06/2014 17:35, Gregory Farnum wrote:
>>
>>                             What's the backtrace from the crashing
>>                             OSDs?
>>
>>                             Keep in mind that as a dev release, it's
>>                             generally best not to upgrade to unnamed
>>                             versions like 0.82 (but it's probably too
>>                             late to go back now).
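To make the two options Pierre asks about concrete: the persistent route is a small addition to ceph.conf, which even an otherwise minimal, udev-managed configuration can carry. A sketch, assuming the stock /etc/ceph/ceph.conf location:

```
# /etc/ceph/ceph.conf -- applies to every OSD on the host after a restart
[osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```

The injectargs form Pierre quotes changes the settings on a running daemon without a restart, but does not persist across one; for a crash that happens at daemon startup, the ceph.conf route is the one that actually captures the interesting window.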
>>                           I will remember that next time ;)
>>
>>                             -Greg
>>                             Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>                             On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>
>>                               Hi,
>>
>>                               After the upgrade to firefly, I have
>>                               some PGs in the peering state. I saw
>>                               the release of 0.82, so I tried to
>>                               upgrade to solve my problem.
>>
>>                               My three MDSs crash, and some OSDs
>>                               trigger a chain reaction that kills
>>                               other OSDs. I think my MDSs will not
>>                               start because their metadata are on the
>>                               OSDs.
>>
>>                               I have 36 OSDs on three servers, and I
>>                               identified 5 OSDs which make the others
>>                               crash. If I don't start those, the
>>                               cluster goes into a recovering state
>>                               with 31 OSDs, but I have 378 PGs in
>>                               down+peering state.
>>
>>                               What can I do? Do you need more
>>                               information (OS, crash logs, etc.)?
>>
>>                               Regards
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
----------------------------------------------
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel    : 02 31 56 75 42
bureau : Campus 2, Science 3, 406
----------------------------------------------