When you reformat the drive, it generates a new UUID, so to Ceph it is as if it were a brand new drive. This does seem heavy-handed, but Ceph was designed for things to fail and it is not unusual to do things this way. Ceph is not RAID, so you usually have to do some unlearning.
You could probably keep the UUID and the auth key between reformats, but in my experience it is so easy to just have Ceph regenerate them that it's not worth the hassle of trying to keep track of it all.
In our testing we reformatted the cluster over a dozen times without losing data. Because there wasn't much data on it, we were able to reformat 40 OSDs in under 30 minutes (we reformatted a whole host at a time because we knew that was safe) with a few small online scripts.
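Per OSD, the sort of sequence such a script runs through is the usual remove-and-recreate dance. A rough sketch only -- the id, device, CRUSH weight, auth caps and service commands below are placeholders and vary by release and init system:

  ceph osd out 12                          # example id
  service ceph stop osd.12                 # or: stop ceph-osd id=12
  umount /var/lib/ceph/osd/ceph-12
  ceph osd crush remove osd.12             # forget the old OSD entirely
  ceph auth del osd.12
  ceph osd rm 12
  mkfs.btrfs -f /dev/sdb                   # or mkfs.xfs -f
  mount /dev/sdb /var/lib/ceph/osd/ceph-12
  ceph osd create                          # usually hands back the id just freed
  ceph-osd -i 12 --mkfs --mkkey            # new UUID and key get generated here
  ceph auth add osd.12 osd 'allow *' mon 'allow rwx' \
      -i /var/lib/ceph/osd/ceph-12/keyring
  ceph osd crush add osd.12 1.0 host=$(hostname -s)
  service ceph start osd.12

Wrap that in a loop over the OSD ids on one host and you have the whole-host reformat described above.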
Short answer is don't be afraid to do it this way.
Robert LeBlanc
Sent from a mobile device, please excuse any typos.
I have been experimenting with Ceph, and have some OSDs with drives
containing XFS filesystems which I want to change to BTRFS.
(I started with BTRFS, then started again from scratch with XFS
[currently recommended] in order to eliminate that as a potential cause
of some issues; now, with further experience, I want to go back to
BTRFS, but I have data in my cluster and I don't want to scrap it.)
This is exactly equivalent to the case in which I have an OSD with a
drive that I can see is starting to fail. In that case I would need to
replace the drive and recreate the Ceph structures on it.
So, I mark the OSD out, and the cluster automatically eliminates its
notion of data stored on the OSD and creates copies of the affected PGs
elsewhere to make the cluster healthy again.
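Concretely, I mean something like this, with 12 standing in for whichever OSD id is affected:

  ceph osd out 12        # 12 is just an example id
  ceph -w                # watch the PGs recover until the cluster is healthy again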
All of the disk replacement instructions that I see then tell me to
follow an OSD removal process:
"This procedure removes an OSD from a cluster map, removes its
authentication key, removes the OSD from the OSD map, and removes the
OSD from the ceph.conf file".
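In command form, that documented procedure is roughly the following (again with 12 as a stand-in id):

  ceph osd crush remove osd.12    # remove it from the CRUSH map
  ceph auth del osd.12            # remove its authentication key
  ceph osd rm 12                  # remove it from the OSD map
  # ...and then delete any [osd.12] section from ceph.conf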
This seems to me to be too heavy-handed. I'm worried about doing this
and then effectively adding a new OSD that ends up with the same id
number as the OSD that I apparently removed unnecessarily.
I don't actually want to remove the OSD. The OSD is fine, I just want
to replace the disk drive that it uses.
This suggests that I really want to take the OSD out, allow the cluster
to get healthy again, then (replace the disk if this is due to a
failure,) create a new BTRFS/XFS filesystem, remount the drive, and then
recreate the Ceph structures on the disk so that they are compatible
with the original OSD that the old disk was attached to.
The OSD then gets marked back in, and the cluster says "hello again, we
missed you, but it's good to see you back, here are some PGs ...".
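In concrete terms, the sequence I am imagining is something like the one below (the id and device are just examples, and whether the monitors will actually accept the rebuilt data directory under the old id is exactly what I am unsure about):

  ceph osd out 12
  # ...wait for the cluster to return to HEALTH_OK,
  # and physically replace the drive if it has failed...
  service ceph stop osd.12                        # or your init system's equivalent
  ceph auth get osd.12 -o /root/osd.12.keyring    # keep the existing auth key
  umount /var/lib/ceph/osd/ceph-12
  mkfs.btrfs -f /dev/sdb                          # the new filesystem (or mkfs.xfs)
  mount /dev/sdb /var/lib/ceph/osd/ceph-12
  ceph-osd -i 12 --mkfs                           # recreate the OSD's on-disk structures
  cp /root/osd.12.keyring /var/lib/ceph/osd/ceph-12/keyring   # restore the old key
  service ceph start osd.12
  ceph osd in 12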
What I'm saying is that I really don't want to destroy the OSD, I want
to refresh it with a new disk/filesystem and put it back to work.
Is there some fundamental reason why this can't be done? If not, how
should I do it?
Best regards,
David