Thank you for getting back to me with more details.

My understanding is that you would like to know:

 1. How to recover broken metadata / data.
 2. How to avoid the same condition in the future.

Regarding No. 2, the developers should take responsibility for this, because you cannot do anything once the system is hung. You have no choice except rebooting the system, or whatever it was that you did. That's not good. The system should at least say something like "Oh, wait, wait, we are doing something right now..." and then resolve the issue as a background process.

Shinobu

----- Original Message -----
From: "Goncalo Borges" <goncalo@xxxxxxxxxxxxxxxxxxx>
To: "Shinobu Kinjo" <skinjo@xxxxxxxxxx>, "John Spray" <jspray@xxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Tuesday, September 15, 2015 12:39:57 PM
Subject: Re: Question on cephfs recovery tools

Hi Shinobu

>>> c./ After recovering the cluster, I thought I was in a cephfs situation where
>>> I had
>>> c.1 files with holes (because of lost PGs and objects in the data pool)
>>> c.2 files without metadata (because of lost PGs and objects in the
>>> metadata pool)
>> What does "files without metadata" mean? Do you mean their objects
>> were in the data pool but they didn't appear in your filesystem mount?
>>
>>> c.3 metadata without associated files (because of lost PGs and objects
>>> in the data pool)
>> So you mean you had files with the expected size but zero data, right?
>>
>>> I've tried to run the recovery tools, but I have several doubts which I did
>>> not find described in the documentation
>>> - Is there a specific order / a way to run the tools for the c.1, c.2
>>> and c.3 cases I mentioned?
> I'm still trying to understand what you are trying to say in your
> original message, but I have not been able to follow you yet.
>
> Can you summarize like:
>
> 1. What the current status is.
> e.g: working but not as expected.
>
> 2. What your thought (guess or whatever) is about your cluster.
> e.g: broken metadata, data or whatever you're thinking now.
>
> 3. What exactly you did, briefly, not bla bla bla...
>
> 4. What you really want to do (briefly)?

I was trying to give the full context of my tests so that all the information is available. After John's response, and some further thinking, I think I partially understand what actions have to be taken in a scenario like the one I've created.

The whole idea is: given a scenario where there is loss of data and metadata, what can be done from the admin side to recover the cephfs?

Nevertheless, since this email thread is already long, I'll try to send a new, more focused email.

Cheers and thanks for the replies
Goncalo

--
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW 2006
T: +61 2 93511937

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com