Obviously your OSDs aren't getting all the PGs up and running. Have you
followed the troubleshooting steps?
(http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Mon, Sep 16, 2013 at 6:35 AM, Markus Goldberg <goldberg@xxxxxxxxxxxxxxxxx> wrote:
> Hi,
> I have to ask once again: is there really no help for this problem?
> The errors remain, and I can't mount the cluster anymore. All my data is gone.
> The error messages are still changing every few seconds.
>
> What can I do?
>
> Please help,
> Markus
>
> On 11.09.2013 08:39, Markus Goldberg wrote:
> Does no one have an idea?
> I can't mount the cluster anymore.
>
> Thank you,
> Markus
>
> On 10.09.2013 09:43, Markus Goldberg wrote:
> Hi,
> I ran 'stop ceph-all' on my ceph admin host and then upgraded the kernel
> from 3.9 to 3.11 on all 3 of my nodes.
> Ubuntu 13.04, Ceph 0.68.
> The kernel upgrade required a reboot.
> Now, after rebooting, I get the following errors:
>
> root@bd-a:~# ceph -s
>   cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
>    health HEALTH_WARN 133 pgs peering; 272 pgs stale; 265 pgs stuck unclean; 2 requests are blocked > 32 sec; mds cluster is degraded
>    monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
>    mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
>    osdmap e464358: 3 osds: 3 up, 3 in
>     pgmap v1343477: 792 pgs, 9 pools, 15145 MB data, 4986 objects
>           30927 MB used, 61372 GB / 61408 GB avail
>                387 active+clean
>                122 stale+active
>                140 stale+active+clean
>                133 peering
>                 10 stale+active+replay
>
> root@bd-a:~# ceph -s
>   cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
>    health HEALTH_WARN 6 pgs down; 377 pgs peering; 296 pgs stuck unclean; mds cluster is degraded
>    monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
>    mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
>    osdmap e464400: 3 osds: 3 up, 3 in
>     pgmap v1343586: 792 pgs, 9 pools, 15145 MB data, 4986 objects
>           31046 MB used, 61372 GB / 61408 GB avail
>                142 active
>                270 active+clean
>                  3 active+replay
>                371 peering
>                  6 down+peering
>
> root@bd-a:~# ceph -s
>   cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
>    health HEALTH_WARN 257 pgs peering; 359 pgs stuck unclean; 1 requests are blocked > 32 sec; mds cluster is degraded
>    monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
>    mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
>    osdmap e464403: 3 osds: 3 up, 3 in
>     pgmap v1343594: 792 pgs, 9 pools, 15145 MB data, 4986 objects
>           31103 MB used, 61372 GB / 61408 GB avail
>                373 active
>                157 active+clean
>                  5 active+replay
>                257 peering
>
> root@bd-a:~#
>
> As you can see above, the errors keep changing, so perhaps some self-repair
> is running in the background. But it has been like this for 12 hours now.
> What should I do?
>
> Thank you,
> Markus
>
> On 09.09.2013 13:52, Yan, Zheng wrote:
> The bug has been fixed in the 3.11 kernel by commit ccca4e37b1 (libceph:
> fix truncate size calculation). We don't backport cephfs bug fixes to old
> kernels. Please update the kernel or use ceph-fuse.
>
> Regards,
> Yan, Zheng
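
For reference, a rough sketch of the first checks from the troubleshooting-pg
page linked above, plus a ceph-fuse mount as Yan suggests. The PG id 2.5 and
the mount point /mnt/cephfs are placeholders; the monitor address is taken
from the monmap shown above. Substitute the values your own cluster reports:

root@bd-a:~# ceph health detail            # list the individual stuck/peering PGs
root@bd-a:~# ceph pg dump_stuck unclean    # show which OSDs each stuck PG maps to
root@bd-a:~# ceph pg dump_stuck stale      # same for the stale PGs
root@bd-a:~# ceph pg 2.5 query             # peering details for one PG (use an id from the dumps)
root@bd-a:~# ceph osd tree                 # confirm all three OSDs really are up and in

root@bd-a:~# mkdir -p /mnt/cephfs
root@bd-a:~# ceph-fuse -m xxx.xxx.xxx.20:6789 /mnt/cephfs   # userspace client instead of the kernel mount

Note that a CephFS mount (kernel or fuse) can only complete once the MDS has
left up:replay, which in turn needs the PGs to finish peering.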
> Best regards,
> Tobi
>
> --
> Regards,
> Markus Goldberg
>
> ------------------------------------------------------------------------
> Markus Goldberg     | Universität Hildesheim
>                     | Rechenzentrum
> Tel +49 5121 883212 | Marienburger Platz 22, D-31141 Hildesheim, Germany
> Fax +49 5121 883205 | email goldberg@xxxxxxxxxxxxxxxxx
> ------------------------------------------------------------------------

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com