On Thu, Aug 27, 2015 at 9:33 AM, Andrzej Łukawski <alukawski@xxxxxxxxxx> wrote:
> Hi,
>
> I ran cephfs-journal-tool to inspect the journal 12 hours ago - it's still
> running. Or... it hasn't crashed yet, although I don't see any output from it.
> Is that normal behaviour?

Your cluster is severely damaged at the RADOS level. I would imagine that
the journal tool is getting stuck the same way the MDS is getting stuck:
when trying to read some objects from a fatally damaged PG, it blocks,
waiting (in vain) for the PG to become healthy again.

It is unlikely that you will get a functioning filesystem back with
existing tools. Even if you could, you would have to fix the underlying
RADOS cluster before doing anything with CephFS. You really are in
"restore from backups" territory here.

John

> Thanks for the help.
>
> Andrzej
>
> On 2015-08-26 at 15:49, Gregory Farnum wrote:
>
>> There is a cephfs-journal-tool that I believe is present in hammer and
>> ought to let you get your MDS through replay. Depending on which PGs
>> were lost, you will have holes and/or missing files, in addition to not
>> being able to find parts of the directory hierarchy (and maybe getting
>> crashes if you access them). You can explore the options there, and if
>> the documentation is sparse, feel free to ask questions...
>> -Greg
>>
>> On Wed, Aug 26, 2015 at 1:44 PM, Andrzej Łukawski <alukawski@xxxxxxxxxx>
>> wrote:
>>>
>>> Thank you for the answer. I lost 2 disks on the 1st node and 1 disk on
>>> the 2nd. Do I understand correctly that it is not possible to recover
>>> the data even partially? Unfortunately those disks are lost forever.
>>>
>>> Andrzej
>>>
>>> On 2015-08-26 at 12:26, Jan Schermer wrote:
>>>
>>> If you lost 3 disks with size 2 and at least 2 of those disks were in
>>> different hosts, that means you lost data with the default CRUSH rules.
>>> There's nothing you can do but either get those disks back in or recover
>>> from backup.
>>>
>>> Jan
>>>
>>> On 26 Aug 2015, at 12:18, Andrzej Łukawski <alukawski@xxxxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> We have a ceph cluster (Ceph version 0.94.2) which consists of four nodes
>>> with four disks on each node. Ceph is configured to hold two replicas
>>> (size 2). We use this cluster for the Ceph filesystem. A few days ago we
>>> had a power outage, after which I had to replace three of our cluster's
>>> OSD disks. All OSD disks are now online, but I'm unable to mount the
>>> filesystem and constantly receive 'mount error 5 = Input/output error'.
>>> Ceph status shows many 'incomplete' pgs and that the 'mds cluster is
>>> degraded'. According to 'ceph health detail', the mds is replaying its
>>> journal.
>>>
>>> [root@cnode0 ceph]# ceph -s
>>>     cluster 39c717a3-5e15-4e5e-bc54-7e7f1fd0ee24
>>>      health HEALTH_WARN
>>>             25 pgs backfill_toofull
>>>             10 pgs degraded
>>>             126 pgs down
>>>             263 pgs incomplete
>>>             54 pgs stale
>>>             10 pgs stuck degraded
>>>             263 pgs stuck inactive
>>>             54 pgs stuck stale
>>>             289 pgs stuck unclean
>>>             10 pgs stuck undersized
>>>             10 pgs undersized
>>>             4 requests are blocked > 32 sec
>>>             recovery 27139/10407227 objects degraded (0.261%)
>>>             recovery 168597/10407227 objects misplaced (1.620%)
>>>             4 near full osd(s)
>>>             too many PGs per OSD (312 > max 300)
>>>             mds cluster is degraded
>>>      monmap e6: 6 mons at
>>> {0=x.x.70.1:6789/0,0m=x.x.71.1:6789/0,1=x.x.70.2:6789/0,1m=x.x.71.2:6789/0,2=x.x.70.3:6789/0,2m=x.x.71.3:6789/0}
>>>             election epoch 2958, quorum 0,1,2,3,4,5 0,1,2,0m,1m,2m
>>>      mdsmap e1236: 1/1/1 up {0=2=up:replay}, 2 up:standby
>>>      osdmap e83705: 16 osds: 16 up, 16 in; 26 remapped pgs
>>>       pgmap v40869228: 2496 pgs, 3 pools, 16952 GB data, 5046 kobjects
>>>             32825 GB used, 11698 GB / 44524 GB avail
>>>             27139/10407227 objects degraded (0.261%)
>>>             168597/10407227 objects misplaced (1.620%)
>>>                 2153 active+clean
>>>                  137 incomplete
>>>                  126 down+incomplete
>>>                   54 stale+active+clean
>>>                   15 active+remapped+backfill_toofull
>>>                   10 active+undersized+degraded+remapped+backfill_toofull
>>>                    1 active+remapped
>>> [root@cnode0 ceph]#
>>>
>>> I wasn't able to find any solution on the Internet, and I worry I will
>>> make things even worse if I continue to troubleshoot this on my own.
>>> I'm stuck. Could you please help?
>>>
>>> Thanks.
>>> Andrzej
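
For reference, the commands behind the suggestions in this thread look
roughly like the following. This is only a sketch: the PG id used below is
a placeholder (not a value from this cluster), and exact subcommand names
and output vary between Ceph releases.

To see which PGs are incomplete/down and why a specific PG is stuck:

    # list PGs stuck in a non-active state
    ceph health detail
    ceph pg dump_stuck inactive

    # query one incomplete PG for its acting set and blocking state
    # (1.2f is a placeholder PG id)
    ceph pg 1.2f query

The cephfs-journal-tool workflow Greg alludes to, roughly as documented
for hammer and later, looks like this. Note that none of it will make
progress while the journal objects live in incomplete PGs, which is
consistent with the tool hanging here:

    # back up the MDS journal before touching it
    cephfs-journal-tool journal export backup.bin

    # check the journal for damage
    cephfs-journal-tool journal inspect

    # salvage directory entries from the journal into the metadata pool,
    # then reset the journal so the MDS can get past replay
    cephfs-journal-tool event recover_dentries summary
    cephfs-journal-tool journal reset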