Re: Proxmox/ceph upgrade and addition of a new node/OSDs

Hi MJ (and all),

So we upgraded our Proxmox/Ceph cluster, and to summarize the operation in a few words: overall, everything went well :)
The most critical operation of all is 'osd crush tunables optimal'; I discuss it in more detail further down.

The Proxmox documentation is well written and accurate, and following it step by step is normally almost sufficient!

* first step : upgrade Ceph Jewel to Luminous : https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous
(Note here : OSDs remain in FileStore backend, no BlueStore migration)

* second step : upgrade Proxmox version 4 to 5 : https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0

Just some numbers, observations and tips (based on our own experience; I'm not an expert!):

* Before migrating, make sure you are on the latest version of Proxmox 4 (4.4-24) and Ceph Jewel (10.2.11)
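
A quick way to confirm the Ceph version on a node before starting is to parse the output of `ceph --version`. The following is a minimal Python sketch, not something taken from the Proxmox documentation: the helper names are hypothetical, the expected `ceph version X.Y.Z (...)` layout is an assumption about the CLI output, and the 10.2.11 threshold is simply the version quoted above.

```python
import re
import subprocess

# Latest Jewel release mentioned in this post.
MIN_JEWEL = (10, 2, 11)

def parse_ceph_version(text):
    """Extract (major, minor, patch) from `ceph --version` output."""
    match = re.search(r"ceph version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part) for part in match.groups())

def at_least(text, minimum=MIN_JEWEL):
    """True if the reported version is greater than or equal to `minimum`."""
    return parse_ceph_version(text) >= minimum

def local_ceph_version():
    """Run `ceph --version` on the local node (requires the ceph CLI)."""
    return subprocess.check_output(["ceph", "--version"], universal_newlines=True)
```

On a node, `at_least(local_ceph_version())` would then tell you whether that node already runs 10.2.11 or later.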

* We don't use the pve repository for Ceph packages but the official one (download.ceph.com). Thus, during the upgrade of Proxmox PVE, we did not replace the ceph.com repository with the proxmox.com Ceph repository.

* When you upgrade Ceph to Luminous (without tunables optimal), there is no impact on Proxmox 4. VMs keep running normally.
The only side effect (which does not affect the functioning of the VMs) is in the GUI, in the Ceph menu: it can't report the status of the Ceph cluster because of a JSON formatting error (indeed, the output of the 'ceph -s' command is completely different, and much more readable, on Luminous).

* A small step is missing in section 8 "Create Manager instances" of the Ceph upgrade documentation. As the Ceph manager daemon is new in Luminous, the package doesn't exist on Jewel. So you have to install the ceph-mgr package on each node before running 'pveceph createmgr'.
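
The order of operations for the manager can be sketched as a per-node command plan. This is only an illustration in Python: `mgr_setup_commands` is a hypothetical helper, installing the package with `apt-get` is an assumption (the point above only says the ceph-mgr package must be installed first), and the commands are printed rather than executed.

```python
def mgr_setup_commands():
    """Commands to run on ONE node, in order, to get a manager daemon."""
    return [
        # Assumption: the package is installed through apt on a Debian-based PVE node.
        ["apt-get", "install", "-y", "ceph-mgr"],
        # Command named in the post; run it only once the package is present.
        ["pveceph", "createmgr"],
    ]

if __name__ == "__main__":
    for command in mgr_setup_commands():
        print(" ".join(command))
```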

* The 'osd crush tunables optimal' operation is time consuming! In our case: 5 nodes (PE R730xd), 58 OSDs, a replicated (3/2) rbd pool with 2048 pgs and 2 million objects, 22 TB used. The tunables operation took a little more than 24 hours!
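
To put these numbers in perspective, here is a small back-of-the-envelope calculation in Python. It only uses the figures quoted above; the helper name is hypothetical, and the result is a cluster-wide average that assumes every OSD carries the same share, which does not hold when disk sizes differ.

```python
def pg_replicas_per_osd(pg_count, replica_size, osd_count):
    """Average number of PG replicas hosted by each OSD."""
    return pg_count * replica_size / osd_count

# Figures from the post: 2048 pgs, replica size 3, 58 OSDs.
average = pg_replicas_per_osd(2048, 3, 58)
print("about %.1f PG replicas per OSD on average" % average)
```

OSDs on larger disks normally hold more than this average, since placement follows the CRUSH weight of each OSD.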

* Choose the timing of the 'tunables optimal' operation very carefully!

We encountered some stuck pgs and blocked requests during this operation. In our case, the OSDs involved were those with a high number of pgs (as they are high-capacity disks).
The consequences can be critical, since this can freeze some VMs (I guess those whose replicas are stored on the stuck pgs?).
The stuck states were cleared by rebooting the OSDs involved.
If you can, move the disks of your critical VMs to another storage, so that these VMs are not impacted by the recovery (we moved some disks to another Ceph cluster while keeping the VM configuration in the Proxmox cluster being upgraded, and there was no impact).
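
For reference, a hedged Python sketch of how stuck pgs could be listed and the affected OSD daemons restarted. `ceph pg dump_stuck` and the `ceph-osd@<id>` systemd unit exist on Luminous, but the helper names are hypothetical, reading "rebooting the OSDs" as restarting the daemon (rather than the whole host) is an assumption, and the OSD ids in the example are made up.

```python
import shlex

def stuck_pg_command():
    """Command listing the pgs that are currently stuck."""
    return ["ceph", "pg", "dump_stuck"]

def restart_osd_commands(osd_ids):
    """One systemd restart command per distinct OSD id.

    Each command has to be run on the host that owns the given OSD.
    """
    return [["systemctl", "restart", "ceph-osd@%d" % osd_id]
            for osd_id in sorted(set(osd_ids))]

if __name__ == "__main__":
    print(" ".join(shlex.quote(part) for part in stuck_pg_command()))
    for command in restart_osd_commands([12, 7, 12]):  # hypothetical OSD ids
        print(" ".join(shlex.quote(part) for part in command))
```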

Otherwise:
- verify that all your VMs have been recently backed up to an external storage (in case you need to fall back on your disaster recovery plan!)
- if you can, stop all your non-critical VMs (in order to limit client I/O)
- if any backups are running, wait for them to finish, then disable the datacenter backup (in order to limit client I/O). !! do not forget to re-enable it when everything is over !!
- if you have snapshots that are no longer needed, delete them; this removes many useless objects!
- start the tunables operation outside of major activity periods (night, weekend, ...) and take into account that it can be very slow...
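
While the rebalancing runs, progress can be followed by polling the cluster health. A minimal Python sketch, assuming the `ceph` CLI is available on the node: `ceph health` prints a line starting with HEALTH_OK, HEALTH_WARN or HEALTH_ERR, while the function names and the 60-second interval are arbitrary choices made for this illustration.

```python
import subprocess
import time

def health_status(output):
    """Return the leading status token of `ceph health` output."""
    tokens = output.split()
    return tokens[0] if tokens else "UNKNOWN"

def wait_until_healthy(interval_seconds=60):
    """Poll `ceph health` until it reports HEALTH_OK, logging each answer."""
    while True:
        output = subprocess.check_output(["ceph", "health"],
                                         universal_newlines=True)
        print(time.strftime("%H:%M:%S"), output.strip())
        if health_status(output) == "HEALTH_OK":
            return
        time.sleep(interval_seconds)
```

Re-enabling the datacenter backup and restarting the stopped VMs can then wait until `wait_until_healthy()` returns.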

There are probably some Ceph options that can be configured to avoid these 'stuck pgs' states, but on our side, as we had already moved our critical VMs' disks, we didn't look into that!
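
One commonly used approach (not tested during this upgrade) is to throttle backfill and recovery so that client I/O keeps priority. The Python sketch below only builds the corresponding `ceph tell osd.* injectargs` command. `osd_max_backfills` and `osd_recovery_max_active` are existing OSD options, but the helper name is hypothetical, the values shown are assumptions, and whether they would have prevented the stuck pgs described here is unknown.

```python
import shlex

def throttle_recovery_command(max_backfills=1, recovery_max_active=1):
    """Build a command lowering backfill/recovery concurrency on all OSDs."""
    args = "--osd-max-backfills %d --osd-recovery-max-active %d" % (
        max_backfills, recovery_max_active)
    return ["ceph", "tell", "osd.*", "injectargs", args]

if __name__ == "__main__":
    # Print a shell-safe version of the command instead of running it.
    print(" ".join(shlex.quote(part) for part in throttle_recovery_command()))
```

Lower values slow the rebalancing down further, so this trades an even longer operation for less pressure on client requests.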

* Anyway, the Proxmox PVE upgrade step itself is quick and easy (just follow the documentation). Note that you can upgrade Proxmox PVE before doing the 'tunables optimal' operation.

I hope you will find this information useful. Good luck with your upcoming migration!

Hervé

On 13/09/2018 at 22:04, mj wrote:
Hi Hervé,

No answer from me, but just to say that I have exactly the same upgrade path ahead of me. :-)

Please report here any tips, tricks, or things you encountered doing the upgrades. It could potentially save us a lot of time. :-)

Thanks!

MJ

On 09/13/2018 05:23 PM, Hervé Ballans wrote:
Dear list,

I am currently in the process of upgrading Proxmox 4/Jewel to Proxmox 5/Luminous.

I also have a new node to add to my Proxmox cluster.

What I plan to do is the following (from https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous):

* upgrade Jewel to Luminous

* let the "ceph osd crush tunables optimal" command run

* upgrade my proxmox to v5

* add the new node (already up to date in v5)

* add the new OSDs

* let ceph rebalance the lot


A couple of questions I have :

* would it be a good idea to add the new node+OSDs and run the "tunables optimal" command immediately afterwards, which might save a little time and avoid two successive pg rebalancing operations?

* did I miss anything in this plan?


Regards,
Hervé



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
