Re: Power outages!!! help!


 



I'm jumping in a little late here, but running xfs_repair on your partition can't frag your partition table. The partition table lives outside the partition's block device, so xfs_repair doesn't have access to it when run against /dev/sdb1. I haven't tested it, but it seems unlikely that running xfs_repair on /dev/sdb would do it either. I would expect it to just give you an error about /dev/sdb not containing an XFS filesystem. That's only a guess, though.

Are you sure there isn't physical damage to the disk? I wouldn't say it's common, but power outages can do that. You can run 'dmesg | grep sdb' and 'smartctl -a /dev/sdb' to see if there are kernel errors or SMART errors indicative of physical problems. If the disk is physically sound and the partition table really has been fragged, you may be able to restore it from the backup at the end of the disk, assuming it's GPT. If you can't find a partition or a filesystem somehow, then you're probably out of luck as far as retrieving any objects from that OSD. If the disk is physically damaged and your partition is gone, then it probably isn't worth wasting additional time on it.
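
The checks above could be collected into a small script. This is only a sketch: the `check_disk` and `run` helpers and the `DRY_RUN` switch are made up for illustration, `lsblk` and `sgdisk --verify` are additions that were not mentioned above, and the GPT check assumes the disk actually uses GPT. All of the commands are read-only.

```shell
#!/bin/sh
# Read-only health checks for a suspect disk; none of these commands write to it.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_disk() {
    dev="$1"             # whole-disk device, e.g. /dev/sdb (assumed)
    name="${dev##*/}"    # kernel name, e.g. sdb

    # Kernel messages that mention the disk (I/O errors, link resets, ...)
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ dmesg | grep $name"
    else
        dmesg | grep "$name"
    fi

    run smartctl -a "$dev"       # SMART attributes and error log
    run lsblk "$dev"             # partitions the kernel currently sees
    run sgdisk --verify "$dev"   # GPT consistency check (assumes GPT)
}
```

It would be called as root with something like `check_disk /dev/sdb`, or with `DRY_RUN=1 check_disk /dev/sdb` to preview the commands. If the partition does reappear, `xfs_repair -n /dev/sdb1` checks the filesystem without modifying it.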


Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |




On Mon, 2017-08-28 at 19:18 +0000, hjcho616 wrote:
Tomasz,

Looks like when I did xfs_repair -L /dev/sdb1 it did something to the partition table and I don't see /dev/sdb1 anymore... or maybe I missed the 1 in /dev/sdb1? =(  Yes.. that extra power outage did quite a bit of damage... =P  I am hoping 0.007% is very small... =P  Any recommendations on fixing the XFS partition I am missing? =)

Ronny,

Thank you for that link!

No, I haven't done anything to the OSDs... not touching them, hoping that I can revive some of them. =)  The only thing I've done is try to start and stop them.

Below are the links to newer files with just one start attempt. =)









Regards,
Hong


On Monday, August 28, 2017 12:53 PM, Ronny Aasen <ronny+ceph-users@xxxxxxxx> wrote:


comments inline

On 28.08.2017 18:31, hjcho616 wrote:


I'll see what I can do on that... Looks like I may have to add another OSD host as I utilized all of the SATA ports on those boards. =P

Ronny,

I am running with size=2 min_size=1.  I created everything with ceph-deploy and didn't touch many of those pool settings...  I hope not, but it sounds like I may have lost some files!  I do want some of those OSDs to come back online somehow... to get that confidence level up. =P


This is a bad idea, as you have found out. Once your cluster is healthy you should look at improving this.
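
As a rough sketch of what improving this might look like once the cluster is healthy: the commands below read and then raise the replication settings of one pool. The pool name is a placeholder, the `raise_replication` and `run` helpers and the `DRY_RUN` switch are made up for illustration, and size=3 / min_size=2 is a common choice rather than something prescribed in this thread. It also assumes there are enough OSDs and hosts to hold three copies.

```shell
#!/bin/sh
# Inspect and raise replication on a single pool.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

raise_replication() {
    pool="$1"    # pool name; placeholder, not taken from the thread

    # Show the current values first
    run ceph osd pool get "$pool" size
    run ceph osd pool get "$pool" min_size

    # Keep three copies, and require two of them before accepting I/O
    run ceph osd pool set "$pool" size 3
    run ceph osd pool set "$pool" min_size 2
}
```

Running `DRY_RUN=1 raise_replication mypool` prints the four commands without touching the cluster. Raising `size` triggers backfill, so it is best done only after recovery has finished.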

The dead osd.3 message is probably me trying to stop and start the osd.  There were some cases where stop didn't kill the ceph-osd process.  I just started or restarted osd to try and see if that worked..  After that, there were some reboots and I am not seeing those messages after it...


When providing logs, try to move the old one out of the way, do a single startup, and post that. It is easier to read when the file contains a single run.


This is something I am running at home.  I am the only user.  In a way it is a production environment, but just driven by me. =)

Do you have any suggestions to get any of osd.3, osd.4, osd.5, and osd.8 to come back up without removing them?  I have a feeling I can get some data back with some of them intact.

Just in case you are not able to make them run again: that does not automatically mean the data is lost. I have successfully recovered lost objects using these instructions: http://ceph.com/geen-categorie/incomplete-pgs-oh-my/

I would start by renaming the OSD's log file, doing a single try at starting the OSD, and posting that log. Have you done anything to the OSDs that could make them not run?
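
One way to get such a single-run log, sketched under some assumptions: the OSD is managed by systemd as `ceph-osd@<id>`, and its log is at the default `/var/log/ceph/ceph-osd.<id>.log`. Neither is confirmed for this cluster, and the `fresh_osd_start` and `run` helpers and the `DRY_RUN` switch are made up for illustration.

```shell
#!/bin/sh
# Move the old OSD log aside, then make exactly one start attempt.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fresh_osd_start() {
    id="$1"                                 # OSD number, e.g. 3
    log="/var/log/ceph/ceph-osd.$id.log"    # default log path (assumed)

    run systemctl stop "ceph-osd@$id"       # make sure no old process is left
    run mv "$log" "$log.old"                # keep the old log, out of the way
    run systemctl start "ceph-osd@$id"      # the single start attempt
}
```

After something like `fresh_osd_start 3`, the new log file should contain only that one attempt and can be posted as-is.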


kind regards
Ronny Aasen
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


