Re: Proxmox/ceph upgrade and addition of a new node/OSDs

mj <lists@xxxxxxxxxxxxx> · Fri, 21 Sep 2018 09:25:20 +0200

Hi Hervé!

Thanks for the detailed summary, much appreciated!

Best,
MJ

On 09/21/2018 09:03 AM, Hervé Ballans wrote:
Hi MJ (and all),

So we upgraded our Proxmox/Ceph cluster, and to summarize the operation 
in a few words : overall, everything went well :)
The most critical operation of all is 'osd crush tunables optimal', 
which I discuss in more detail below.

The Proxmox documentation is really well written and accurate and, 
normally, following the documentation step by step is almost sufficient !

* first step : upgrade Ceph Jewel to Luminous : 
https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous
(Note here : OSDs remain in FileStore backend, no BlueStore migration)

* second step : upgrade Proxmox version 4 to 5 : 
https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0

Just some numbers, observations and tips (based on our experience, I'm 
not an expert !) :

* Before migration, make sure you are on the latest version of Proxmox 
4 (4.4-24) and Ceph Jewel (10.2.11)

* We don't use the pve repository for ceph packages but the official one 
(download.ceph.com). Thus, during the upgrade of Proxmox PVE, we didn't 
replace the ceph.com repository with the proxmox.com Ceph repository...

* When you upgrade Ceph to Luminous (without tunables optimal), there is 
no impact on Proxmox 4. VMs keep running normally.
The only side effect (which does not affect the functioning of VMs) is in 
the GUI, in the Ceph menu : it can't report the status of the ceph 
cluster because of a JSON formatting error (indeed the output of the 
command 'ceph -s' is completely different, and much more readable, on 
Luminous)

* A small step is missing in section 8 "Create Manager instances" of the 
Ceph upgrade documentation. As the Ceph manager daemon is new in 
Luminous, the package doesn't exist on Jewel. So you have to install the 
ceph-mgr package on each node before running 'pveceph createmgr'.
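A minimal sketch of that extra manager step, to be run on each node. It 
assumes the package is installed with apt-get (Proxmox is Debian-based); 
the run helper and the DRY_RUN variable are hypothetical conveniences so 
the commands can be printed before anything is executed.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints the commands
# instead of running them; set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_mgr() {
    run apt-get install -y ceph-mgr   # package is new in Luminous
    run pveceph createmgr             # then create the manager instance
}

install_mgr
```

Reviewing the printed commands first and then rerunning with DRY_RUN=0 
keeps the step explicit on every node.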

* The 'osd crush tunables optimal' operation is time consuming ! In our 
case : 5 nodes (PE R730xd), 58 OSDs, replicated (3/2) rbd pool with 2048 
pgs and 2 million objects, 22 TB used. The tunables operation took a 
little more than 24 hours !

* Really choose the right moment to run 'tunables optimal' !

We encountered some stuck pgs and blocked requests during this 
operation. In our case, the OSDs involved were those with a high number 
of pgs (as they are high capacity disks).
The consequences can be critical since it can freeze some VMs (I guess 
those whose replicas are stored on the stuck pgs ?).
The stuck states were corrected by rebooting the OSDs involved.
If you can, move the disks of your critical VMs to another storage, so 
that these VMs are not impacted by the recovery (we moved some disks to 
another Ceph cluster and kept the conf in the Proxmox cluster being 
upgraded, and there was no impact)
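A sketch of how stuck pgs might be inspected and an OSD restarted. It 
assumes that "rebooting" an OSD means restarting its daemon through 
systemd, and the OSD id 12 is purely hypothetical; the run helper and 
DRY_RUN variable are likewise illustrative.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints the commands
# instead of running them; set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

OSD_ID="${OSD_ID:-12}"   # hypothetical id of an OSD holding stuck pgs

inspect_and_restart() {
    run ceph health detail               # lists blocked requests and stuck pgs
    run ceph pg dump_stuck unclean       # shows which OSDs the stuck pgs map to
    run systemctl restart "ceph-osd@$1"  # restart one OSD daemon at a time
}

inspect_and_restart "$OSD_ID"
```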

Otherwise :
- verify that all your VMs have recently been backed up to an external 
storage (in case you need your disaster recovery plan !)
- if you can, stop all your non-critical VMs (in order to limit client 
io operations)
- if any backups are running, wait for them to finish, then disable the 
datacenter backup (in order to limit client io operations). !! do not 
forget to re-enable it when all is over !!
- if you have snapshots that are no longer needed, delete them, it 
removes many useless objects !
- start the tunables operation outside of major activity periods (night, 
week-end, ...) and take into account that it can be very slow...
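Once the checklist is done, the tunables change itself is a single 
command. The sketch below also records the current profile beforehand 
and checks the cluster status afterwards; the run helper and DRY_RUN 
variable are hypothetical, added so the commands can be reviewed first.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints the commands
# instead of running them; set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_tunables() {
    run ceph osd crush show-tunables      # record the current profile first
    run ceph osd crush tunables optimal   # triggers the long rebalance
    run ceph -s                           # re-run periodically to follow recovery
}

apply_tunables
```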

There are probably some options to configure in ceph to avoid 'stuck 
pgs' states, but on our side, as we had previously moved our critical 
VMs' disks, we didn't look into that !
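Which options were meant here is not stated. One commonly used pair for 
limiting the impact of recovery on client io is osd_max_backfills and 
osd_recovery_max_active; the sketch below lowers both at runtime. The 
value 1 is an assumption, not something tested on this cluster, and the 
run helper and DRY_RUN variable are hypothetical.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints the commands
# instead of running them; set DRY_RUN=0 to execute for real.
# Note: the dry-run output does not show the original quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

throttle_recovery() {
    # Assumed values: 1 concurrent backfill and 1 active recovery per OSD.
    run ceph tell 'osd.*' injectargs \
        '--osd-max-backfills 1 --osd-recovery-max-active 1'
}

throttle_recovery
```

Lower values slow the rebalance further, so this trades a longer 
operation for less pressure on running VMs.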

* Anyway, the upgrade step of Proxmox PVE is done easily and quickly 
(just follow the documentation). Note that you can upgrade Proxmox PVE 
before doing the 'tunables optimal' operation.

Hoping that you will find this information useful, good luck with your 
upcoming migration !

Hervé

On 13/09/2018 at 22:04, mj wrote:
Hi Hervé,

No answer from me, but just to say that I have exactly the same 
upgrade path ahead of me. :-)

Please report here any tips, tricks, or things you encountered doing 
the upgrades. It could potentially save us a lot of time. :-)

Thanks!

MJ

On 09/13/2018 05:23 PM, Hervé Ballans wrote:
Dear list,

I am currently in the process of upgrading Proxmox 4/Jewel to 
Proxmox 5/Luminous.

I also have a new node to add to my Proxmox cluster.

What I plan to do is the following (from 
https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous):

* upgrade Jewel to Luminous

* let the "ceph osd crush tunables optimal " command run

* upgrade my proxmox to v5

* add the new node (already up to date in v5)

* add the new OSDs

* let ceph rebalance the lot

A couple of questions I have :

* would it be a good idea to add the new node+OSDs and run the 
"tunables optimal" command immediately after, which would maybe save 
a little time and avoid two successive pg rebalancings ?

* did I miss anything in this plan?

Regards,
Hervé

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
