Upgrading a "conservative" [tm] cluster from Hammer to Jewel, a nightmare in the making

Hello,

This is going to have some level of ranting, bear with me as the points
are all valid and poignant.

Backstory:

I currently run 3 Ceph clusters, all on Debian Jessie but with SysV init,
as they all predate any systemd-supporting Ceph packages.
- A crappy test one running Hammer, manually deployed (OSDs mounted via
  fstab, a mix of Ext4 and XFS), MBR/DOS partitions.
- Our main production cluster, also with fstab mounted OSDs, all Ext4.
  Extra bonus points for the OSDs being entire, unpartitioned disks and the
  journal-holding SSDs being MBR/DOS partitioned.
- Non critical production cluster installed with ceph-deploy and GPT OSDs.

The latter obviously is going to be the least problematic, though I'm sure
there will be enough entertainment.

Now we just finally got our real staging/testing cluster that actually
resembles the production ones so I was going to give it a few spins before
installing something that equals the production cluster.

First try was Hammer using ceph-deploy. 
Complete fail, due to lack of systemd unit files/targets:
---
[ceph-01][INFO  ] Running command: systemctl enable ceph.target
[ceph-01][WARNIN] Failed to execute operation: No such file or directory
---

Allrite, let's try with Jewel.
This blew up in my face when trying to use previously created partitions
(GPT, mind ya), as documented here:
---
http://tracker.ceph.com/issues/13833
---
Incidentally, ceph-deploy once again is trying to be too helpful: when I
gave it a "/dev/sda4" as journal target with the wrong GUID but a chown'ed
dev file, it created things and linked to the partition as stated.

With partitions that have the correct GUID it will make a smarty-pants
link to /dev/disk/by-partuuid/ (good intention!), even when given a
"/dev/disk/by-id/" input. Oh well.
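
For reference, a minimal sketch of what "the correct GUID" means here. It
assumes the two GPT partition type GUIDs that ceph-disk uses for journal
and OSD data partitions; the function name is made up for illustration
and is not part of any Ceph tool.

```python
# GPT partition type GUIDs as used by ceph-disk (assumption: taken from
# its source, not from anything stated in this thread).
CEPH_PARTITION_TYPES = {
    "45b0969e-9b03-4f30-b4c6-b4b80ceff106": "journal",
    "4fbd7e29-9d25-41b8-afd0-062c0ceff05d": "osd-data",
}


def classify_partition_type(type_guid):
    """Return 'journal', 'osd-data', or None for a partition type GUID.

    Hypothetical helper: it only compares strings and never touches a disk.
    """
    return CEPH_PARTITION_TYPES.get(type_guid.strip().lower())
```

The type GUID itself has to come from your partitioning tool (something
like "sgdisk -i 4 /dev/sda" should report it as the partition GUID code).
A partition whose type is not one of the two above is presumably what the
udev rules ignore.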

And ceph-deploy of course still activates new OSDs only half of the time;
the other half it actually needs the activate step, thanks to udev I'm sure.

Now I'd like to repeat the question in the issue above: at what point
did GPT partitions (and udev magic) become mandatory?
And if it is indeed mandatory, where are the painless and data-safe
transition tools?

I'll re-create the main production cluster (that is Hammer, SysV init,
fstab mounted OSDs) on the staging system next and see what blows up
and how violently when trying a Jewel upgrade.

My guess is that it won't be systemd (as Jewel actually has the targets
now), but the inability to deal with a manually deployed environment like
mine.

Expect news about that next week at the latest.

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


