Multipath devices with infernalis

All,

 

I have a set of hardware with a few systems connected via IB along with a DDN SFA12K.

There are 4 IB/SRP paths to each block device; they show up as /dev/mapper/mpath[b-d].
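
For reference, the path state can be checked with something like the following (mpathc is just one of the map names here, and the exact output will differ):

    multipath -ll mpathc        # should list the 4 active SRP paths behind the one map
    ls -l /dev/mapper/mpathc*   # the map itself, plus mpathc1/mpathc2 once partitioned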

 

I am trying to do an initial install/setup of Ceph on 3 nodes. Each will be a monitor and will also host a single OSD.

 

I am using ceph-deploy to do most of the heavy lifting (on CentOS 7.2.1511).
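
Roughly, the sequence is the standard ceph-deploy workflow, sketched below rather than copied verbatim (the device name for the first OSD is illustrative):

    ceph-deploy new ceph-1-35a ceph-1-36a ceph-1-35b
    ceph-deploy install ceph-1-35a ceph-1-36a ceph-1-35b
    ceph-deploy mon create-initial
    ceph-deploy osd create ceph-1-35a:/dev/mapper/mpathb    # first OSD; device name illustrative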

 

Installing the monitors, and even the first OSD, goes quite smoothly.

 

ceph status shows:

    cluster 0d9e68e4-176d-4229-866b-d408f8055e5b

     health HEALTH_OK

     monmap e1: 3 mons at {ceph-1-35a=10.100.1.35:6789/0,ceph-1-35b=10.100.1.85:6789/0,ceph-1-36a=10.100.1.36:6789/0}

            election epoch 8, quorum 0,1,2 ceph-1-35a,ceph-1-36a,ceph-1-35b

     osdmap e5: 1 osds: 1 up, 1 in

            flags sortbitwise

      pgmap v8: 64 pgs, 1 pools, 0 bytes data, 0 objects

            40112 kB used, 43888 GB / 43889 GB avail

                  64 active+clean

 

But as soon as I try to add the next OSD on the next system using

ceph-deploy osd create ceph-1-35b:/dev/mapper/mpathc

things start acting up.

The last bit from the output seems ok:
[ceph-1-35b][INFO  ] checking OSD status...

[ceph-1-35b][INFO  ] Running command: ceph --cluster=ceph osd stat --format=json

[ceph-1-35b][WARNIN] there is 1 OSD down

[ceph-1-35b][WARNIN] there is 1 OSD out

[ceph_deploy.osd][DEBUG ] Host ceph-1-35b is now ready for osd use.
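
(As I understand it, osd create is roughly equivalent to running prepare and then activate separately, i.e. something like:

    ceph-deploy osd prepare ceph-1-35b:/dev/mapper/mpathc
    ceph-deploy osd activate ceph-1-35b:/dev/mapper/mpathc1    # data partition

so the "ready for osd use" message suggests the prepare half worked.)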

 

But ceph status is now:

    cluster 0d9e68e4-176d-4229-866b-d408f8055e5b

     health HEALTH_OK

     monmap e1: 3 mons at {ceph-1-35a=10.100.1.35:6789/0,ceph-1-35b=10.100.1.85:6789/0,ceph-1-36a=10.100.1.36:6789/0}

            election epoch 8, quorum 0,1,2 ceph-1-35a,ceph-1-36a,ceph-1-35b

     osdmap e6: 2 osds: 1 up, 1 in

            flags sortbitwise

      pgmap v10: 64 pgs, 1 pools, 0 bytes data, 0 objects

            40120 kB used, 43888 GB / 43889 GB avail

                  64 active+clean

 

And ceph osd tree:

ID WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 42.86040 root default

-2 42.86040     host ceph-1-35a

0 42.86040         osd.0            up  1.00000          1.00000

1        0 osd.1                  down        0          1.00000

 

I don’t understand why ceph-deploy didn’t activate this OSD when it did for the first one. The OSD is not mounted on the other box.
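
That can be checked on ceph-1-35b with something like the following (a sketch only, using the stock infernalis ceph-disk tooling):

    ceph-disk list                   # lists partitions and whether the ceph data is prepared or active
    mount | grep /var/lib/ceph/osd   # look for an osd.1 mount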

I can try to activate the down OSD (ceph-deploy disk activate ceph-1-35b:/dev/mapper/mpathc1:/dev/mapper/mpathc2).

Things look good for a bit:

    cluster 0d9e68e4-176d-4229-866b-d408f8055e5b

     health HEALTH_OK

     monmap e1: 3 mons at {ceph-1-35a=10.100.1.35:6789/0,ceph-1-35b=10.100.1.85:6789/0,ceph-1-36a=10.100.1.36:6789/0}

            election epoch 8, quorum 0,1,2 ceph-1-35a,ceph-1-36a,ceph-1-35b

     osdmap e8: 2 osds: 2 up, 2 in

            flags sortbitwise

      pgmap v14: 64 pgs, 1 pools, 0 bytes data, 0 objects

            74804 kB used, 87777 GB / 87778 GB avail

                  64 active+clean

 

But after about 1 minute, it goes down:

ceph osd tree

ID WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 85.72079 root default

-2 42.86040     host ceph-1-35a

0 42.86040         osd.0            up  1.00000          1.00000

-3 42.86040     host ceph-1-35b

1 42.86040         osd.1          down  1.00000          1.00000

 

ceph status

    cluster 0d9e68e4-176d-4229-866b-d408f8055e5b

    health HEALTH_WARN

            1/2 in osds are down

     monmap e1: 3 mons at {ceph-1-35a=10.100.1.35:6789/0,ceph-1-35b=10.100.1.85:6789/0,ceph-1-36a=10.100.1.36:6789/0}

            election epoch 8, quorum 0,1,2 ceph-1-35a,ceph-1-36a,ceph-1-35b

     osdmap e9: 2 osds: 1 up, 2 in

            flags sortbitwise

      pgmap v15: 64 pgs, 1 pools, 0 bytes data, 0 objects

            74804 kB used, 87777 GB / 87778 GB avail

                  64 active+clean

 

Has anyone played with getting multipath devices to work as OSDs?
Of course, it could be something completely different, and I need to step back and see which step is failing. Any insight into where to dig would be appreciated.
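
In case it points anyone at the right spot, the places I plan to dig are roughly these (assuming default log locations and the infernalis systemd units):

    # on ceph-1-35b
    systemctl status ceph-osd@1                        # did the daemon start and then die?
    journalctl -u ceph-osd@1 --no-pager | tail -n 50
    tail -n 50 /var/log/ceph/ceph-osd.1.log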

 

Thanks in advance,

Brian Andrus

ITACS/Research Computing

Naval Postgraduate School

Monterey, California

voice: 831-656-6238

 

