Hi,
I ran 'stop ceph-all' on my ceph-admin host and then did a kernel upgrade from 3.9 to 3.11 on all three of my nodes (Ubuntu 13.04, Ceph 0.68). The kernel upgrade required a reboot. Now, after rebooting, I get the following errors:

root@bd-a:~# ceph -s
  cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
   health HEALTH_WARN 133 pgs peering; 272 pgs stale; 265 pgs stuck unclean; 2 requests are blocked > 32 sec; mds cluster is degraded
   monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
   mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
   osdmap e464358: 3 osds: 3 up, 3 in
    pgmap v1343477: 792 pgs, 9 pools, 15145 MB data, 4986 objects
          30927 MB used, 61372 GB / 61408 GB avail
               387 active+clean
               122 stale+active
               140 stale+active+clean
               133 peering
                10 stale+active+replay

root@bd-a:~# ceph -s
  cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
   health HEALTH_WARN 6 pgs down; 377 pgs peering; 296 pgs stuck unclean; mds cluster is degraded
   monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
   mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
   osdmap e464400: 3 osds: 3 up, 3 in
    pgmap v1343586: 792 pgs, 9 pools, 15145 MB data, 4986 objects
          31046 MB used, 61372 GB / 61408 GB avail
               142 active
               270 active+clean
                 3 active+replay
               371 peering
                 6 down+peering

root@bd-a:~# ceph -s
  cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
   health HEALTH_WARN 257 pgs peering; 359 pgs stuck unclean; 1 requests are blocked > 32 sec; mds cluster is degraded
   monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
   mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
   osdmap e464403: 3 osds: 3 up, 3 in
    pgmap v1343594: 792 pgs, 9 pools, 15145 MB data, 4986 objects
          31103 MB used, 61372 GB / 61408 GB avail
               373 active
               157 active+clean
                 5 active+replay
               257 peering

root@bd-a:~#

As you can see above, the reported errors keep changing, so perhaps some self-repair is running in the background. But it has been like this for 12 hours now. What should I do?

Thank you,
Markus

On 09.09.2013 13:52, Yan, Zheng wrote:
> The bug has been fixed in the 3.11 kernel by commit ccca4e37b1 (libceph: fix truncate size calculation). We don't backport CephFS bug fixes to old kernels. Please update the kernel or use ceph-fuse.
>
> Regards,
> Yan, Zheng

--
Best regards,
Markus Goldberg

------------------------------------------------------------------------
Markus Goldberg       | Universität Hildesheim | Rechenzentrum
Tel +49 5121 883212   | Marienburger Platz 22, D-31141 Hildesheim, Germany
Fax +49 5121 883205   | email goldberg@xxxxxxxxxxxxxxxxx
------------------------------------------------------------------------
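While waiting for peering to settle, a few read-only queries can show which PGs are stuck and why. A minimal sketch using the standard ceph CLI; the PG id 2.1f below is only a placeholder, substitute one taken from the dump output:

  # Full health breakdown instead of the one-line summary
  ceph health detail

  # List the PGs stuck in the unclean / stale states
  ceph pg dump_stuck unclean
  ceph pg dump_stuck stale

  # Ask a single PG why it is not active+clean (2.1f is a placeholder id)
  ceph pg 2.1f query

  # Verify all three OSDs really rejoined after the reboot
  ceph osd tree

None of these commands change cluster state, so they are safe to rerun while the cluster recovers.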
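For Yan's ceph-fuse suggestion, a minimal sketch, assuming the default /etc/ceph/ceph.conf is in place and using /mnt/cephfs as a hypothetical mount point; note that as long as the mdsmap shows up:replay and the MDS cluster is degraded, the mount will likely block until recovery finishes:

  # Unmount the kernel client first, if it is still mounted (path is an example)
  umount /mnt/cephfs

  # Mount CephFS via the userspace FUSE client instead of the kernel driver;
  # -m takes any monitor address, e.g. bd-0 from the monmap above
  ceph-fuse -m xxx.xxx.xxx.20:6789 /mnt/cephfs

Because ceph-fuse runs entirely in userspace, it does not depend on the kernel version, which sidesteps the truncate-size bug Yan mentioned.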