If you use ceph-disk (and I believe ceph-deploy) to create your OSDs, or you go through the manual steps to set up the partition UUIDs, then yes, udev and the init script will do all the magic. Your disks can be moved to another box without problems. I've moved disks to different ports on controllers and it all worked just fine. I will be swapping the disks between two boxes today to try to get to the bottom of some problems we have been having; if it doesn't work I'll let you know.
The automagic of Ceph OSDs has been refreshing for me, because I was worried about having to manage so many disks and mount points, but it turned out to be much easier than I anticipated once I used ceph-disk.
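In case it helps, the "manual steps" are basically just tagging the GPT partitions with the Ceph type GUIDs that the udev rules match on. A rough sketch (device names are placeholders; double-check the GUIDs against the ceph-disk version you have installed):

# mark partition 1 of /dev/sdX as a Ceph OSD data partition
sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/sdX
# mark partition 1 of /dev/sdY as a Ceph journal partition
sgdisk --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdY
# confirm the type GUID that udev will see
sgdisk -i 1 /dev/sdX

Once the partitions carry those GUIDs, udev hands them to ceph-disk on boot or hotplug, which is why the /dev/sdX names stop mattering.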
Robert LeBlanc
Sent from a mobile device please excuse any typos.
Hi Robert,
just to make sure I got this right:
Do you mean that the /etc/mtab entries are completely ignored, and that no matter what order
the /dev/sdX devices come up in, Ceph will still mount the correct osd/ceph-X by default?
In addition, assuming that an OSD node fails for a reason other than a disk problem (e.g. mobo/RAM),
if I put its disks into another OSD node (all disks have their journals with them), will Ceph be able to mount
them correctly and continue operating?
Regards,
George
I have not used ceph-deploy, but it should use ceph-disk for the OSD
preparation. Ceph-disk creates GPT partitions with specific
partition UUIDs for data and journals. When udev or the init script starts the
OSD, it mounts the partition to a temp location, reads the whoami file and the
journal, then remounts it in the correct location. There is no need
for fstab entries or the like. This allows you to easily move OSD
disks between servers (if you take the journals with them). It's magic!
But I think I just gave away the secret.
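If you ever want to follow the same flow by hand after moving a disk, something along these lines should show it (device name is just an example):

# see what ceph-disk thinks each partition is
ceph-disk list
# peek at which OSD a data partition belongs to
mount /dev/sdj1 /mnt
cat /mnt/whoami        # prints the OSD id, e.g. 8
umount /mnt
# or let ceph-disk do the mount, journal lookup and start for you
ceph-disk activate /dev/sdj1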
Robert LeBlanc
Sent from a mobile device please excuse any typos.
On May 7, 2015 5:16 AM, "Georgios Dimitrakakis" wrote:
Indeed it is not necessary to have any OSD entries in the ceph.conf
file, but what happens in the event of a disk failure that results in
the mount device changing?
From what I can see, the OSDs are mounted from entries in
/etc/mtab (I am on CentOS 6.6),
like this:
/dev/sdj1 /var/lib/ceph/osd/ceph-8 xfs rw,noatime,inode64 0 0
/dev/sdh1 /var/lib/ceph/osd/ceph-6 xfs rw,noatime,inode64 0 0
/dev/sdg1 /var/lib/ceph/osd/ceph-5 xfs rw,noatime,inode64 0 0
/dev/sde1 /var/lib/ceph/osd/ceph-3 xfs rw,noatime,inode64 0 0
/dev/sdi1 /var/lib/ceph/osd/ceph-7 xfs rw,noatime,inode64 0 0
/dev/sdf1 /var/lib/ceph/osd/ceph-4 xfs rw,noatime,inode64 0 0
/dev/sdd1 /var/lib/ceph/osd/ceph-2 xfs rw,noatime,inode64 0 0
/dev/sdk1 /var/lib/ceph/osd/ceph-9 xfs rw,noatime,inode64 0 0
/dev/sdb1 /var/lib/ceph/osd/ceph-0 xfs rw,noatime,inode64 0 0
/dev/sdc1 /var/lib/ceph/osd/ceph-1 xfs rw,noatime,inode64 0 0
So in the event of a disk failure (e.g. disk sdh fails), the next one
in order will take its place, meaning that
sdi will be seen as sdh upon the next reboot and will therefore be mounted as
ceph-6 instead of ceph-7, and so on... resulting in a problematic
configuration (I guess lots of data will start moving
around, PGs will be misplaced, etc.).
Correct me if I am wrong, but the proper way to mount them would be
by using the UUID of the partition.
Is it OK if I change the entries in /etc/mtab using UUID=xxxxxx
instead of /dev/sdX1?
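(By UUID I mean the filesystem UUID that blkid reports, e.g. something like:

blkid /dev/sdh1
# /dev/sdh1: UUID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="xfs"

so the mount would follow the filesystem rather than the device name.)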
Does Ceph try to mount them using a different config file and
perhaps export the entries to /etc/mtab at boot (in which case
no modification of /etc/mtab would be taken into account)?
I have deployed the Ceph cluster using only the "ceph-deploy"
command. Is there a parameter that I've missed that must be used
during deployment in order to specify the mount points using the
UUIDs instead of the device names?
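For reference, the commands I used were more or less the following (hostname and devices are just examples; I may be misremembering the exact journal syntax):

ceph-deploy osd prepare osdnode1:sdb:/dev/sdk
ceph-deploy osd activate osdnode1:/dev/sdb1

so nowhere did I pass UUIDs or explicit mount points.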
Regards,
George
On Wed, 6 May 2015 22:36:14 -0600, Robert LeBlanc wrote:
We don't have OSD entries in our Ceph config. They are not needed
if
you don't have specific configs for different OSDs.
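To give an idea, a stripped-down conf without any per-OSD sections can look roughly like this (fsid and addresses obviously made up):

[global]
fsid = 00000000-0000-0000-0000-000000000000
mon_initial_members = mon1, mon2, mon3
mon_host = 192.0.2.11,192.0.2.12,192.0.2.13
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

[osd]
osd_journal_size = 10240

No [osd.N] sections are needed; the OSDs register themselves when they start.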
Robert LeBlanc
Sent from a mobile device please excuse any typos.
On May 6, 2015 7:18 PM, "Florent MONTHEL" wrote:
Hi team,
Is it necessary to list in ceph.conf all the OSDs that we have
in the
cluster?
We rebooted a cluster today (5 nodes, RHEL 6.5) and some OSDs
seem
to have changed IDs, so the CRUSH map no longer matches reality.
Thanks
FLORENT MONTHEL