Most of the data is still there, but you won't be able to just "mount" it if it's inconsistent. I don't use CephFS, so someone else will have to tell you whether it can repair the filesystem with some parts missing.
You lost the part of the data whose copies were only on the one failed disk in one node and on either of the two failed disks in the other node, since no other copy exists. Exactly how much data you lost I don't know, but since you only have 16 OSDs I'm afraid it will probably be on the order of ~3%. How many "files" are intact is a different question - it could be that every file is missing 3% of its contents, which would make the loss total.
Guys? I have no idea how files map to PGs and objects in CephFS...
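If you want to gauge the damage yourself, you could start with something like this (the PG id 1.28 is only a placeholder, and "data" is a guess at your CephFS data pool name - adjust both):

    # list the PGs that are stuck inactive/incomplete
    ceph pg dump_stuck inactive
    # ask one incomplete PG what it knows about its former copies
    ceph pg 1.28 query
    # sample which objects are still readable in the data pool
    rados -p data ls | head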
Jan
Thank you for your answer. I lost 2 disks on the 1st node and 1 disk on the 2nd. Do I understand correctly that it is not possible to recover the data even partially? Unfortunately those disks are lost forever.
Andrzej
On 2015-08-26 at 12:26, Jan Schermer wrote:
If you lost 3 disks with size 2, and at least 2 of those disks were in different hosts, that means you lost data with the default CRUSH rules.
There's nothing you can do but either get those
disks back in or recover from backup.
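You can verify the replication and failure domain yourself, e.g. (again assuming your pool is called "data"):

    # how many replicas the pool keeps, and how many it needs to serve I/O
    ceph osd pool get data size
    ceph osd pool get data min_size
    # dump the CRUSH rules; the default places replicas on different hosts
    ceph osd crush rule dump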
Jan
Hi,
We have a Ceph cluster (version 0.94.2) which consists of four nodes with four disks in each node. Ceph is configured to keep two replicas (size 2). We use this cluster for the Ceph filesystem. A few days ago we had a power outage, after which I had to replace three of our cluster's OSD disks. All OSD disks are now online, but I'm unable to mount the filesystem and constantly receive 'mount error 5 = Input/output error'. Ceph status shows many 'incomplete' PGs and that the 'mds cluster is degraded'. According to 'ceph health detail', the MDS is replaying its journal.
[root@cnode0 ceph]# ceph -s
cluster 39c717a3-5e15-4e5e-bc54-7e7f1fd0ee24
health HEALTH_WARN
25 pgs backfill_toofull
10 pgs degraded
126 pgs down
263 pgs incomplete
54 pgs stale
10 pgs stuck degraded
263 pgs stuck inactive
54 pgs stuck stale
289 pgs stuck unclean
10 pgs stuck undersized
10 pgs undersized
4 requests are blocked > 32 sec
recovery 27139/10407227 objects degraded (0.261%)
recovery 168597/10407227 objects misplaced (1.620%)
4 near full osd(s)
too many PGs per OSD (312 > max 300)
mds cluster is degraded
monmap e6: 6 mons at {0=x.x.70.1:6789/0,0m=x.x.71.1:6789/0,1=x.x.70.2:6789/0,1m=x.x.71.2:6789/0,2=x.x.70.3:6789/0,2m=x.x.71.3:6789/0}
election epoch 2958, quorum 0,1,2,3,4,5 0,1,2,0m,1m,2m
mdsmap e1236: 1/1/1 up {0=2=up:replay}, 2 up:standby
osdmap e83705: 16 osds: 16 up, 16 in; 26 remapped pgs
pgmap v40869228: 2496 pgs, 3 pools, 16952 GB data, 5046 kobjects
32825 GB used, 11698 GB / 44524 GB avail
27139/10407227 objects degraded (0.261%)
168597/10407227 objects misplaced (1.620%)
2153 active+clean
137 incomplete
126 down+incomplete
54 stale+active+clean
15 active+remapped+backfill_toofull
10 active+undersized+degraded+remapped+backfill_toofull
1 active+remapped
[root@cnode0 ceph]#
I wasn't able to find any solution on the Internet, and I worry I will make things even worse if I continue to troubleshoot this on my own. I'm stuck. Could you please help?
Thanks.
Andrzej