On Sat, Oct 14, 2023 at 5:52 PM Tyler Stachecki <stachecki.tyler@xxxxxxxxx> wrote:
>
> On Sat, Oct 14, 2023 at 5:14 PM Dave Hall <kdhall@xxxxxxxxxxxxxx> wrote:
> >
> > Hello.
> >
> > It's been a while. For the past couple of years I've had a cluster running Nautilus on Debian 10, using the Debian Ceph packages and deployed with ceph-ansible. It's not a huge cluster - 10 OSD nodes with 80 x 12TB HDD OSDs, plus 3 management nodes, and about 40% full at the moment - but it is a critical resource for one of our researchers.
> >
> > Back then I had some misgivings about non-Debian packages and also about containerized Ceph. I don't know if my feelings about these things have changed that much, but it's time to upgrade, and with the advent of cephadm it looks like it's just better to stay mainstream.
>
> I know it's probably the last thing you want to hear while looking down the barrel of a gun with a research cluster running an outdated version of Ceph -- but Debian 10 is *old*. You may be at the end of the road on Nautilus until you upgrade.
>
> You can see this in that, e.g., download.ceph.com provides buster builds for Nautilus but not Pacific:
> https://download.ceph.com/debian-nautilus/dists/
> https://download.ceph.com/debian-pacific/dists/

Caught myself here - it's Octopus that is the end of the road for buster. The ceph.com builds for Quincy require bullseye/Debian 11.

> Containerization would let you stay on Debian 10 by running Debian 11/12 containers on top of it. That being said, I'm sure you know that security updates for Debian 10 ended 06/22... not to mention that the Debian 10 kernel is getting long in the tooth, and containerization won't help you there.
>
> If you do proceed with containerization, I would recommend containerizing your existing Nautilus cluster before trying to upgrade. Trying to containerize an existing cluster while also upgrading it is just asking for trouble. Ceph upgrades without anything else moving around can be hard enough as it is!
>
> > So I'm looking for advice on how to get from where I'm at to at least Pacific or Quincy.
>
> Be aware that you should not upgrade across more than 2 releases at a time, per the official documentation. If you're on Nautilus, you're looking at Pacific tops unless you want to gamble a bit:
> https://docs.ceph.com/en/latest/releases/quincy/#upgrading-from-pre-octopus-releases-like-nautilus
>
> > I've read a little in the last couple of days. I've seen various opinions on (not) skipping releases and on when to switch to cephadm. I'm also concerned about cleaning up those old Debian packages - will there be a point where I can 'apt-get purge' them without harming the cluster?
> >
> > One particular thing: The upgrade instructions in various places on docs.ceph.com say something like
> >
> > Upgrade monitors by installing the new packages and restarting the monitor daemons.
>
> So the wrinkle here is that you have to upgrade Ceph components in a precise order - all mons first, then all mgrs, etc. I imagine the disclaimer you're reading is oriented towards the fact that lots of folks run both mons and mgrs on the same host. If you apt-get dist-upgrade one "mon/mgr" host, you would effectively have a mgr of release N+1 running before all of the mons have been upgraded to N+1. The *best-case* scenario is that the mgr simply will not start until the rest of the mons are upgraded.
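While I'm replying to myself anyway: for what it's worth, here is roughly what that ordering looks like on a package-based (non-cephadm) cluster. Treat it as a sketch of the flow rather than the full documented procedure -- the package names and the 'pacific' target at the end are assumptions about where you end up landing:

    # keep OSDs in the map while daemons restart during the upgrade
    ceph osd set noout

    # 1) mons, one host at a time: upgrade that host's mon packages,
    #    restart only the mon, and confirm before touching the next host
    apt-get install --only-upgrade ceph-mon ceph-common
    systemctl restart ceph-mon.target
    ceph versions    # wait until all mons report the new release

    # 2) then mgrs, 3) then OSDs host by host, 4) then MDS/RGW --
    #    same pattern: upgrade that host's packages, then restart
    systemctl restart ceph-mgr.target
    systemctl restart ceph-osd.target

    # once every daemon reports the new release:
    ceph osd require-osd-release pacific
    ceph osd unset noout

The point being: "installing the new packages" and "restarting the daemons" are two separate steps, and the restart order is the part you actually have to be careful about.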
> > To me this is kind of vague. Perhaps there is a different concept of 'packages' within the cephadm environment. I could really use some clarification on this.
> >
> > I'd also consider decommissioning a few nodes, setting up a new cluster on fresh Debian installs, and migrating the data and remaining nodes. This would be a long and painful process - decommission a node, move it, move some data, decommission another node - and I don't know what effect it would have on external references to our object store.
>
> I think you may be overlooking something here -- there's no need to move data to do that. You just set noout, rebuild a host, and then run `ceph-volume lvm activate --all` once you're back up on Debian 11. That command will scan the host's LVM volumes for OSDs and prop everything back up for you -- systemd units and all. There will be some recovery activity to restore degraded objects that received writes while the host was being rebuilt, but it should be fairly minimal.
>
> > Please advise.
>
> I am more than willing to extend my expertise to my alma mater here -- let me know if there's any way I can help. I have a heap of experience in upgrading Ubuntu/Ceph clusters with no downtime.
>
> > Thanks.
> >
> > -Dave
> >
> > --
> > Dave Hall
> > Binghamton University
> > kdhall@xxxxxxxxxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx