All,

I have found an issue with Ceph OSDs that are on a SAN and multipathed. It may not matter that they are multipathed, but that is how the setup where I found the issue is built. Our setup has an InfiniBand network which uses SRP to present block devices on a DDN. Every LUN can be seen by every node that loads the SRP drivers; those would be my OSSes.

I can create OSDs such that each node gets one OSD from what is available:

    ceph-deploy osd create ceph-1-35a:/dev/mapper/mpathb:/dev/sda5 \
                           ceph-1-35b:/dev/mapper/mpathc:/dev/sda5 \
                           ceph-1-36a:/dev/mapper/mpathd:/dev/sda5 \
                           ceph-1-36b:/dev/mapper/mpathe:/dev/sda5

This creates the OSDs and puts each journal on partition 5 of a local SSD on each node. After a moment, everything is happy:

        cluster b04e16d1-95d4-4f5f-8b32-318e7abbec56
         health HEALTH_OK
         monmap e1: 3 mons at {gnas-1-35a=10.100.1.35:6789/0,gnas-1-35b=10.100.1.85:6789/0,gnas-1-36a=10.100.1.36:6789/0}
                election epoch 4, quorum 0,1,2 gnas-1-35a,gnas-1-36a,gnas-1-35b
         osdmap e19: 4 osds: 4 up, 4 in
                flags sortbitwise
          pgmap v39: 64 pgs, 1 pools, 0 bytes data, 0 objects
                158 MB used, 171 TB / 171 TB avail
                      64 active+clean

Now the problem: when the system probes the devices, Ceph automatically mounts ALL the OSDs it sees:

    # df
    Filesystem                 1K-blocks      Used    Available Use% Mounted on
    /dev/mapper/VG1-root        20834304   1313172     19521132   7% /
    devtmpfs                   132011116         0    132011116   0% /dev
    tmpfs                      132023232         0    132023232   0% /dev/shm
    tmpfs                      132023232     19040    132004192   1% /run
    tmpfs                      132023232         0    132023232   0% /sys/fs/cgroup
    /dev/sda2                     300780    126376       174404  43% /boot
    /dev/sda1                     307016      9680       297336   4% /boot/efi
    /dev/mapper/VG1-tmp         16766976     33052     16733924   1% /tmp
    /dev/mapper/VG1-var         50307072    363196     49943876   1% /var
    /dev/mapper/VG1-log         50307072     37120     50269952   1% /var/log
    /dev/mapper/VG1-auditlog    16766976     33412     16733564   1% /var/log/audit
    tmpfs                       26404648         0     26404648   0% /run/user/0
    /dev/mapper/mpathb1      46026204140     41592  46026162548   1% /var/lib/ceph/osd/ceph-0

    # partprobe /dev/mapper/mpathc
    # partprobe /dev/mapper/mpathd
    # partprobe /dev/mapper/mpathe

    # df
    Filesystem                 1K-blocks      Used    Available Use% Mounted on
    /dev/mapper/VG1-root        20834304   1313172     19521132   7% /
    devtmpfs                   132011116         0    132011116   0% /dev
    tmpfs                      132023232         0    132023232   0% /dev/shm
    tmpfs                      132023232     19040    132004192   1% /run
    tmpfs                      132023232         0    132023232   0% /sys/fs/cgroup
    /dev/sda2                     300780    126376       174404  43% /boot
    /dev/sda1                     307016      9680       297336   4% /boot/efi
    /dev/mapper/VG1-tmp         16766976     33052     16733924   1% /tmp
    /dev/mapper/VG1-var         50307072    363196     49943876   1% /var
    /dev/mapper/VG1-log         50307072     37120     50269952   1% /var/log
    /dev/mapper/VG1-auditlog    16766976     33412     16733564   1% /var/log/audit
    tmpfs                       26404648         0     26404648   0% /run/user/0
    /dev/mapper/mpathb1      46026204140     41592  46026162548   1% /var/lib/ceph/osd/ceph-0
    /dev/mapper/mpathc1      46026204140     39912  46026164228   1% /var/lib/ceph/osd/ceph-1
    /dev/mapper/mpathd1      46026204140     39992  46026164148   1% /var/lib/ceph/osd/ceph-2
    /dev/mapper/mpathe1      46026204140     39964  46026164176   1% /var/lib/ceph/osd/ceph-3

Well, that causes great grief and lockups... Is there a way within Ceph to tell a particular OSS to ignore OSDs that aren't meant for it? It's odd to me that a mere partprobe even causes the OSD to mount. (A rough sketch of what I think is going on, and a possible workaround, is below my sig.)

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238
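My guess is that this is Ceph's udev machinery rather than the OSD daemons themselves: the Ceph packages ship udev rules keyed on the Ceph partition-type GUIDs, so as soon as partprobe re-reads a partition table, udev fires ceph-disk and the OSD gets activated and mounted. Below is a rough sketch of how I would check that, and of one possible per-host workaround; the rule path and the suppress-activate subcommand are guesses for a Jewel-era ceph-disk/udev setup, not something I have verified on these nodes:

    # Guesswork: paths and the suppress-activate subcommand are assumptions
    # about a Jewel-era ceph-disk + udev setup, not verified here.

    # 1) Check whether udev is what fires ceph-disk when partprobe re-reads
    #    the partition table (Ceph ships rules matching its partition GUIDs):
    cat /lib/udev/rules.d/95-ceph-osd.rules
    udevadm test "$(udevadm info -q path -n /dev/mapper/mpathc1)" 2>&1 | grep -i ceph

    # 2) On each OSS, mark the LUNs that belong to the *other* hosts so
    #    ceph-disk skips them when activation is triggered (this just drops
    #    a marker file under /var/lib/ceph/tmp):
    ceph-disk suppress-activate /dev/mapper/mpathc
    ceph-disk suppress-activate /dev/mapper/mpathd
    ceph-disk suppress-activate /dev/mapper/mpathe

    # and undo it on the host that actually owns the device:
    ceph-disk unsuppress-activate /dev/mapper/mpathc

If that subcommand isn't available in this release, overriding the Ceph udev rule from /etc/udev/rules.d on each host would be the blunter alternative, but per-device suppression seems like the cleaner fit for a shared-SAN layout like this.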
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com