Re: OSDs automount all devices on a SAN


Hello,

On Fri, 20 May 2016 21:47:43 +0000 Andrus, Brian Contractor wrote:

> All,
> I have found an issue with ceph OSDs that are on a SAN and Multipathed.
> It may not matter that they are multipathed, but that is how our setup
> is where I found the issue.
> 

Your problem/issue is that Ceph is trying to be too smart and helpful for
its own good.

What you're seeing is the udev magic that came with ceph-deploy.
ceph-deploy creates partitions with special GPT type UUIDs, and when the
system sees them during boot or a partprobe (a ceph-disk list will do the
trick, too), it activates and mounts them.
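
If you want to see what that magic keys on, something like the below
should show it (a rough sketch; partition 1 on mpathb is just an example
taken from your df output, and the udev rules file name/path may vary by
version and distro):

  sgdisk --info=1 /dev/mapper/mpathb
  grep -i 4fbd7e29 /lib/udev/rules.d/95-ceph-osd.rules

The "Partition GUID code" sgdisk reports should be
4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D ("ceph data"), which is what the
udev rule matches on before handing the device to ceph-disk for
activation.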

Your solution is to not use ceph-deploy, and mount things manually. 
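
Done by hand that boils down to an fstab entry (or a mount command) plus
starting the OSD yourself, roughly like this (IDs and paths are only
illustrations based on your ceph-0 above):

  mount /dev/mapper/mpathb1 /var/lib/ceph/osd/ceph-0
  systemctl start ceph-osd@0    (or "service ceph start osd.0" pre-systemd)

That way nothing gets mounted on a partprobe unless you asked for it.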

Your best bet for your existing setup is probably to change those partition
type IDs, because neutering the udev rule is likely not permanent (I bet it
will get reinstalled at the next upgrade).
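
Changing the type code on the data partition to the generic "Linux
filesystem" GUID would look roughly like this (again only a sketch,
partition 1 on mpathc is an example, and you'd want the OSD stopped and
unmounted first):

  sgdisk --typecode=1:0FC63DAF-8483-4772-8E79-3D69D8477DE4 /dev/mapper/mpathc
  partprobe /dev/mapper/mpathc

With a generic type code the udev rule no longer matches, so the partition
stays untouched until you mount it yourself.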

Personally I have both manually deployed clusters (where GPT wasn't an
option at the time, or where I wanted to use whole devices without
partitions) and one very "classic" cluster done with ceph-deploy.

The ceph-deploy route is easier to set up (when it works, that is), but
simply not flexible enough for every scenario.

Christian

> Our setup has an InfiniBand network which uses SRP to announce block
> devices on a DDN. Every LUN can be seen by every node that loads the SRP
> drivers. Those would be my OSSes. I can create OSDs such that each node
> will have one OSD from what is available:
> 
> ceph-deploy osd create ceph-1-35a:/dev/mapper/mpathb:/dev/sda5 \
> ceph-1-35b:/dev/mapper/mpathc:/dev/sda5 \
> ceph-1-36a:/dev/mapper/mpathd:/dev/sda5 \
> ceph-1-36b:/dev/mapper/mpathe:/dev/sda5
> 
> This creates the OSDs and puts the journal as partition 5 on a local SSD
> on each node. After a moment, everything is happy:
> 
>     cluster b04e16d1-95d4-4f5f-8b32-318e7abbec56
>      health HEALTH_OK
>      monmap e1: 3 mons at {gnas-1-35a=10.100.1.35:6789/0,gnas-1-35b=10.100.1.85:6789/0,gnas-1-36a=10.100.1.36:6789/0}
>             election epoch 4, quorum 0,1,2 gnas-1-35a,gnas-1-36a,gnas-1-35b
>      osdmap e19: 4 osds: 4 up, 4 in
>             flags sortbitwise
>       pgmap v39: 64 pgs, 1 pools, 0 bytes data, 0 objects
>             158 MB used, 171 TB / 171 TB avail
>                   64 active+clean
> 
> Now the problem is that when the system probes the devices, ceph
> automatically mounts ALL OSDs it sees:
> 
> #df
> Filesystem                  1K-blocks     Used   Available Use% Mounted on
> /dev/mapper/VG1-root         20834304  1313172    19521132   7% /
> devtmpfs                    132011116        0   132011116   0% /dev
> tmpfs                       132023232        0   132023232   0% /dev/shm
> tmpfs                       132023232    19040   132004192   1% /run
> tmpfs                       132023232        0   132023232   0% /sys/fs/cgroup
> /dev/sda2                      300780   126376      174404  43% /boot
> /dev/sda1                      307016     9680      297336   4% /boot/efi
> /dev/mapper/VG1-tmp          16766976    33052    16733924   1% /tmp
> /dev/mapper/VG1-var          50307072   363196    49943876   1% /var
> /dev/mapper/VG1-log          50307072    37120    50269952   1% /var/log
> /dev/mapper/VG1-auditlog     16766976    33412    16733564   1% /var/log/audit
> tmpfs                        26404648        0    26404648   0% /run/user/0
> /dev/mapper/mpathb1       46026204140    41592 46026162548   1% /var/lib/ceph/osd/ceph-0
> 
> #partprobe /dev/mapper/mpathc
> #partprobe /dev/mapper/mpathd
> #partprobe /dev/mapper/mpathe
> #df
> Filesystem                  1K-blocks     Used   Available Use% Mounted on
> /dev/mapper/VG1-root         20834304  1313172    19521132   7% /
> devtmpfs                    132011116        0   132011116   0% /dev
> tmpfs                       132023232        0   132023232   0% /dev/shm
> tmpfs                       132023232    19040   132004192   1% /run
> tmpfs                       132023232        0   132023232   0% /sys/fs/cgroup
> /dev/sda2                      300780   126376      174404  43% /boot
> /dev/sda1                      307016     9680      297336   4% /boot/efi
> /dev/mapper/VG1-tmp          16766976    33052    16733924   1% /tmp
> /dev/mapper/VG1-var          50307072   363196    49943876   1% /var
> /dev/mapper/VG1-log          50307072    37120    50269952   1% /var/log
> /dev/mapper/VG1-auditlog     16766976    33412    16733564   1% /var/log/audit
> tmpfs                        26404648        0    26404648   0% /run/user/0
> /dev/mapper/mpathb1       46026204140    41592 46026162548   1% /var/lib/ceph/osd/ceph-0
> /dev/mapper/mpathc1       46026204140    39912 46026164228   1% /var/lib/ceph/osd/ceph-1
> /dev/mapper/mpathd1       46026204140    39992 46026164148   1% /var/lib/ceph/osd/ceph-2
> /dev/mapper/mpathe1       46026204140    39964 46026164176   1% /var/lib/ceph/osd/ceph-3
> 
> Well, that causes great grief and lockups...
> Is there a way within ceph to tell a particular OSS to ignore OSDs that
> aren't meant for it? It's odd to me that a mere partprobe even causes the
> OSD to mount.
> 
> 
> Brian Andrus
> ITACS/Research Computing
> Naval Postgraduate School
> Monterey, California
> voice: 831-656-6238
> 
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


