Hello,

On Fri, 20 May 2016 21:47:43 +0000 Andrus, Brian Contractor wrote:

> All,
> I have found an issue with ceph OSDs that are on a SAN and multipathed.
> It may not matter that they are multipathed, but that is how our setup
> is where I found the issue.
>
Your problem/issue is that Ceph is trying to be too smart and helpful
for its own good.

What you're seeing is the udev magic that came with ceph-deploy.
ceph-deploy creates partitions with special GPT type UUIDs, and when the
system sees those during boot or a partprobe (a "ceph-disk list" will do
the trick, too), it activates them.

Your solution is to not use ceph-deploy and to mount things manually.
Your best bet for your existing setup is probably to change the IDs,
because neutering the udev rule is likely not permanent (I bet it will
get re-installed at the next upgrade).
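To sketch the "change the IDs" approach (untested here, and multipath
may need an extra nudge, so treat this as an illustration, not a
recipe): the udev rule keys off the GPT partition type GUID that
ceph-disk stamps on OSD data partitions. Retype the partition as a
plain "Linux filesystem" and udev loses interest; after that you mount
and start the OSD yourself. The device and OSD numbers below are lifted
from your df output:

#sgdisk --info=1 /dev/mapper/mpathc
  (reports "Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D",
   the ceph-disk "OSD data" type that udev matches on)

# retype partition 1 to the generic "Linux filesystem" type GUID
#sgdisk --typecode=1:0fc63daf-8483-4772-8e79-3d69d8477de4 /dev/mapper/mpathc
#partprobe /dev/mapper/mpathc

# nothing auto-activates any more, so on the one node that should own
# this OSD, mount it and start the daemon yourself (fstab/init script)
#mount /dev/mapper/mpathc1 /var/lib/ceph/osd/ceph-1
#systemctl start ceph-osd@1

If the kernel holds on to stale partition mappings after the retype, a
"kpartx -u /dev/mapper/mpathc" should refresh them.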
Personally I have both manually deployed clusters (where GPT wasn't an
option at the time, or where I wanted to use whole devices w/o
partitions) and one very "classic" one done with ceph-deploy. The latter
is easier to set up (when it works, that is), but simply not flexible
enough for every scenario.

Christian

> Our setup has an infiniband network which uses SRP to announce block
> devices on a DDN. Every LUN can be seen by every node that loads the
> SRP drivers. That would be my OSSes. I can create OSDs such that each
> node will have one OSD from what is available:
>
> ceph-deploy osd create ceph-1-35a:/dev/mapper/mpathb:/dev/sda5 \
>                        ceph-1-35b:/dev/mapper/mpathc:/dev/sda5 \
>                        ceph-1-36a:/dev/mapper/mpathd:/dev/sda5 \
>                        ceph-1-36b:/dev/mapper/mpathe:/dev/sda5
>
> This creates the OSDs and puts each journal on partition 5 of a local
> SSD on each node. After a moment, everything is happy:
>
>     cluster b04e16d1-95d4-4f5f-8b32-318e7abbec56
>      health HEALTH_OK
>      monmap e1: 3 mons at {gnas-1-35a=10.100.1.35:6789/0,gnas-1-35b=10.100.1.85:6789/0,gnas-1-36a=10.100.1.36:6789/0}
>             election epoch 4, quorum 0,1,2 gnas-1-35a,gnas-1-36a,gnas-1-35b
>      osdmap e19: 4 osds: 4 up, 4 in
>             flags sortbitwise
>      pgmap v39: 64 pgs, 1 pools, 0 bytes data, 0 objects
>            158 MB used, 171 TB / 171 TB avail
>                  64 active+clean
>
> Now the problem is that when the system probes the devices, ceph
> automatically mounts ALL OSDs it sees:
>
> #df
> Filesystem                1K-blocks       Used   Available Use% Mounted on
> /dev/mapper/VG1-root       20834304    1313172    19521132   7% /
> devtmpfs                  132011116          0   132011116   0% /dev
> tmpfs                     132023232          0   132023232   0% /dev/shm
> tmpfs                     132023232      19040   132004192   1% /run
> tmpfs                     132023232          0   132023232   0% /sys/fs/cgroup
> /dev/sda2                    300780     126376      174404  43% /boot
> /dev/sda1                    307016       9680      297336   4% /boot/efi
> /dev/mapper/VG1-tmp        16766976      33052    16733924   1% /tmp
> /dev/mapper/VG1-var        50307072     363196    49943876   1% /var
> /dev/mapper/VG1-log        50307072      37120    50269952   1% /var/log
> /dev/mapper/VG1-auditlog   16766976      33412    16733564   1% /var/log/audit
> tmpfs                      26404648          0    26404648   0% /run/user/0
> /dev/mapper/mpathb1     46026204140      41592 46026162548   1% /var/lib/ceph/osd/ceph-0
>
> #partprobe /dev/mapper/mpathc
> #partprobe /dev/mapper/mpathd
> #partprobe /dev/mapper/mpathe
> #df
> Filesystem                1K-blocks       Used   Available Use% Mounted on
> /dev/mapper/VG1-root       20834304    1313172    19521132   7% /
> devtmpfs                  132011116          0   132011116   0% /dev
> tmpfs                     132023232          0   132023232   0% /dev/shm
> tmpfs                     132023232      19040   132004192   1% /run
> tmpfs                     132023232          0   132023232   0% /sys/fs/cgroup
> /dev/sda2                    300780     126376      174404  43% /boot
> /dev/sda1                    307016       9680      297336   4% /boot/efi
> /dev/mapper/VG1-tmp        16766976      33052    16733924   1% /tmp
> /dev/mapper/VG1-var        50307072     363196    49943876   1% /var
> /dev/mapper/VG1-log        50307072      37120    50269952   1% /var/log
> /dev/mapper/VG1-auditlog   16766976      33412    16733564   1% /var/log/audit
> tmpfs                      26404648          0    26404648   0% /run/user/0
> /dev/mapper/mpathb1     46026204140      41592 46026162548   1% /var/lib/ceph/osd/ceph-0
> /dev/mapper/mpathc1     46026204140      39912 46026164228   1% /var/lib/ceph/osd/ceph-1
> /dev/mapper/mpathd1     46026204140      39992 46026164148   1% /var/lib/ceph/osd/ceph-2
> /dev/mapper/mpathe1     46026204140      39964 46026164176   1% /var/lib/ceph/osd/ceph-3
>
> Well, that causes great grief and lockups...
> Is there a way within Ceph to tell a particular OSS to ignore OSDs
> that aren't meant for it? It's odd to me that a mere partprobe even
> causes the OSD to mount.
>
> Brian Andrus
> ITACS/Research Computing
> Naval Postgraduate School
> Monterey, California
> voice: 831-656-6238
>

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com