On Wed, Apr 05, 2023 at 01:18:57AM +0200, Mikael Öhman wrote:
Trying to upgrade a containerized setup from 16.2.10 to 16.2.11 gave us two big surprises, which I wanted to share in case anyone else encounters the same. I don't see any nice solution to this apart from a new release that fixes the performance regression that completely breaks the container setup in cephadm due to timeouts:

After some digging, we found that it was the "ceph-volume" command that kept timing out, and after a ton more digging, found that it does so because of
https://github.com/ceph/ceph/commit/bea9f4b643ce32268ad79c0fc257b25ff2f8333c#diff-29697ff230f01df036802c8b2842648267767b3a7231ea04a402eaf4e1819d29R104
which was introduced in 16.2.11. Unfortunately, the vital fix for this,
https://github.com/ceph/ceph/commit/8d7423c3e75afbe111c91e699ef3cb1c0beee61b
was not included in 16.2.11.

So, in a setup like ours, with *many* devices, a simple "ceph-volume raw list" now takes over 10 minutes to run (instead of 5 seconds in 16.2.10).

"Me too"

https://lore.kernel.org/ceph-devel/ZAgb8KZ5NWEkAWWF@xxxxxxxxxxxx/

Chris
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx