Hi,
I ran 'stop ceph-all' on my ceph-admin host and then did a kernel upgrade from 3.9 to 3.11 on all three of my nodes (Ubuntu 13.04, Ceph 0.68). The kernel upgrade required a reboot. Now, after rebooting, I get the following errors:

root@bd-a:~# ceph -s
  cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
   health HEALTH_WARN 133 pgs peering; 272 pgs stale; 265 pgs stuck unclean; 2 requests are blocked > 32 sec; mds cluster is degraded
   monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
   mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
   osdmap e464358: 3 osds: 3 up, 3 in
    pgmap v1343477: 792 pgs, 9 pools, 15145 MB data, 4986 objects
          30927 MB used, 61372 GB / 61408 GB avail
               387 active+clean
               122 stale+active
               140 stale+active+clean
               133 peering
                10 stale+active+replay

root@bd-a:~# ceph -s
  cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
   health HEALTH_WARN 6 pgs down; 377 pgs peering; 296 pgs stuck unclean; mds cluster is degraded
   monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
   mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
   osdmap e464400: 3 osds: 3 up, 3 in
    pgmap v1343586: 792 pgs, 9 pools, 15145 MB data, 4986 objects
          31046 MB used, 61372 GB / 61408 GB avail
               142 active
               270 active+clean
                 3 active+replay
               371 peering
                 6 down+peering

root@bd-a:~# ceph -s
  cluster e0dbf70d-af59-42a5-b834-7ad739a7f89b
   health HEALTH_WARN 257 pgs peering; 359 pgs stuck unclean; 1 requests are blocked > 32 sec; mds cluster is degraded
   monmap e1: 3 mons at {bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0}, election epoch 782, quorum 0,1,2 bd-0,bd-1,bd-2
   mdsmap e451467: 1/1/1 up {0=bd-0=up:replay}, 2 up:standby
   osdmap e464403: 3 osds: 3 up, 3 in
    pgmap v1343594: 792 pgs, 9 pools, 15145 MB data, 4986 objects
          31103 MB used, 61372 GB / 61408 GB avail
               373 active
               157 active+clean
                 5 active+replay
               257 peering

root@bd-a:~#

As you can see above, the reported errors keep changing, so perhaps some self-repair is running in the background. But it has been like this for 12 hours now. What should I do?

Thank you,
Markus

On 09.09.2013 13:52, Yan, Zheng wrote:
> The bug has been fixed in the 3.11 kernel by commit ccca4e37b1 (libceph: fix truncate size calculation). We don't backport CephFS bug fixes to old kernels. Please update the kernel or use ceph-fuse.
>
> Regards,
> Yan, Zheng

--
Best regards,
Markus Goldberg

------------------------------------------------------------------------
Markus Goldberg       | Universität Hildesheim | Rechenzentrum
Tel +49 5121 883212   | Marienburger Platz 22, D-31141 Hildesheim, Germany
Fax +49 5121 883205   | email goldberg@xxxxxxxxxxxxxxxxx
------------------------------------------------------------------------
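While waiting for peering to settle, a few read-only queries can show which PGs are stuck and why. A minimal sketch using the standard ceph CLI; the PG id 2.1f below is only a placeholder, substitute one taken from the dump output:

  # Full health breakdown instead of the one-line summary
  ceph health detail

  # List the PGs stuck in the unclean / stale states
  ceph pg dump_stuck unclean
  ceph pg dump_stuck stale

  # Ask a single PG why it is not active+clean (2.1f is a placeholder id)
  ceph pg 2.1f query

  # Verify all three OSDs really rejoined after the reboot
  ceph osd tree

None of these commands change cluster state, so they are safe to rerun while the cluster recovers.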
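For Yan's ceph-fuse suggestion, a minimal sketch, assuming the default /etc/ceph/ceph.conf is in place and using /mnt/cephfs as a hypothetical mount point; note that as long as the mdsmap shows up:replay and the MDS cluster is degraded, the mount will likely block until recovery finishes:

  # Unmount the kernel client first, if it is still mounted (path is an example)
  umount /mnt/cephfs

  # Mount CephFS via the userspace FUSE client instead of the kernel driver;
  # -m takes any monitor address, e.g. bd-0 from the monmap above
  ceph-fuse -m xxx.xxx.xxx.20:6789 /mnt/cephfs

Because ceph-fuse runs entirely in userspace, it does not depend on the kernel version, which sidesteps the truncate-size bug Yan mentioned.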