Re: Incorrect crush map

CentOS 7.2.

...and I think I just figured it out. One node had leftover directories from former OSDs in /var/lib/ceph/osd. When the other OSDs on this host were restarted, Ceph apparently added those stale OSDs to the crush map, too.

[root@sm-cld-mtl-013 osd]# ls -la /var/lib/ceph/osd/
total 128
drwxr-x--- 8 ceph ceph  90 Feb 24 14:44 .
drwxr-x--- 9 ceph ceph 106 Feb 24 14:44 ..
drwxr-xr-x 2 root root   6 Jul  2  2015 ceph-42
drwxr-xr-x 2 root root   6 Jul  2  2015 ceph-43
drwxr-xr-x 1 root root 278 May  4 22:21 ceph-44
drwxr-xr-x 1 root root 278 May  4 22:21 ceph-45
drwxr-xr-x 1 root root 278 May  4 22:25 ceph-67
drwxr-xr-x 1 root root 304 May  4 22:25 ceph-86
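
(For anyone hitting the same symptom: 'ceph osd tree' shows where the crush map currently places every OSD, and 'ceph osd find <id>' prints the crush location of a single one, so either will confirm a mismatch like this. A minimal sketch, run from any node with an admin keyring:)

# full crush hierarchy as the cluster currently sees it
ceph osd tree

# address and crush location of one suspect OSD
ceph osd find 42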


osd.42 and osd.43 actually live on a different host, yet when 'systemctl start ceph.target' is run, the OSD preflight adds them to the crush map here anyway:


May  4 22:13:26 sm-cld-mtl-013 ceph-osd: starting osd.67 at :/0 osd_data /var/lib/ceph/osd/ceph-67 /var/lib/ceph/osd/ceph-67/journal
May  4 22:13:26 sm-cld-mtl-013 ceph-osd: starting osd.45 at :/0 osd_data /var/lib/ceph/osd/ceph-45 /var/lib/ceph/osd/ceph-45/journal
May  4 22:13:26 sm-cld-mtl-013 ceph-osd: WARNING: will not setuid/gid: /var/lib/ceph/osd/ceph-42 owned by 0:0 and not requested 167:167
May  4 22:13:26 sm-cld-mtl-013 ceph-osd: 2016-05-04 22:13:26.529176 7f00cca7c900 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-43: (2) No such file or directory
May  4 22:13:26 sm-cld-mtl-013 ceph-osd: 2016-05-04 22:13:26.534657 7fb55c17e900 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-42: (2) No such file or directory
May  4 22:13:26 sm-cld-mtl-013 systemd: ceph-osd@43.service: main process exited, code=exited, status=1/FAILURE
May  4 22:13:26 sm-cld-mtl-013 systemd: Unit ceph-osd@43.service entered failed state.
May  4 22:13:26 sm-cld-mtl-013 systemd: ceph-osd@43.service failed.
May  4 22:13:26 sm-cld-mtl-013 systemd: ceph-osd@42.service: main process exited, code=exited, status=1/FAILURE
May  4 22:13:26 sm-cld-mtl-013 systemd: Unit ceph-osd@42.service entered failed state.
May  4 22:13:26 sm-cld-mtl-013 systemd: ceph-osd@42.service failed.
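
For the record, the "preflight" seems to be /usr/lib/ceph/ceph-osd-prestart.sh (the ExecStartPre of ceph-osd@.service), which runs 'ceph osd crush create-or-move' before the daemon itself starts. If that's right, it explains how an OSD gets moved under this host in the crush map even though the daemon then dies on the missing superblock. I haven't traced it line by line, so treat that as a best guess.

Cleanup sketch, assuming (as in my case) that osd.42/43 are alive on their real host and these directories are just empty leftovers:

# keep systemd from trying to start the stale instances on this host again
systemctl disable ceph-osd@42 ceph-osd@43

# remove the leftover data directories so nothing enumerates them anymore
rm -rf /var/lib/ceph/osd/ceph-42 /var/lib/ceph/osd/ceph-43

# then, on the host osd.42/43 actually live on, restart them so their own
# preflight runs create-or-move and puts them back in the right place:
#   systemctl restart ceph-osd@42 ceph-osd@43

# finally, verify the crush map
ceph osd tree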



-Ben

On Tue, May 3, 2016 at 7:16 PM, Wade Holler <wade.holler@xxxxxxxxx> wrote:
Hi Ben, 

What OS and version?

Best Regards,
Wade


On Tue, May 3, 2016 at 2:44 PM Ben Hines <bhines@xxxxxxxxx> wrote:
My crush map keeps putting some OSDs on the wrong node. Restarting them fixes it temporarily, but they eventually hop back to the other node that they aren't really on. 

Is there anything that could cause this that I should look for?

Ceph 9.2.1

-Ben
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

