Hi everyone,
From November into January, we experienced a series of outages with the Ceph Community Infrastructure and its services:
Mailing lists
Sepia (testing infrastructure)
VPN to access testing services
Etherpad
Images:
Git mirror
These services are now mostly restored, but we did experience some data loss, notably in our mailing lists. We have restored them from backups, but subscription changes made after July 2021 need to be repeated. If you subscribed or unsubscribed since then, please check your settings for the appropriate list at https://lists.ceph.io. If your posts to our mailing lists now require moderator approval, that is also a sign that you need to re-subscribe to the appropriate lists.
Keep an eye out for emails with subject lines such as “Your message to ceph-users@xxxxxxx awaits moderator approval”.
When the community infrastructure was first created in late 2014, the team chose VM cluster management software that was widely used and familiar to the lab administrators, but that did not support Ceph as a storage backend at the time. As services grew, we relied more and more on its legacy storage solution, which was never migrated to Ceph. Over the last few months, this legacy storage solution suffered several instances of silent data corruption that rendered VMs unbootable, took down various services, and in many cases required restoration from backups.
We are moving these services to a more reliable, mostly container-based infrastructure backed by Ceph, and we are planning longer-term improvements to monitoring, backups, deployment, and other pieces of the project infrastructure.
This event highlights the need to better support the infrastructure. A handful of contributors have stepped up to restore these services, but we need an invested team focused on this work.
If you or your company is looking for a great way to contribute to the Ceph community, this could be your opportunity. Please contact council@xxxxxxx if you can provide time to contribute to the Ceph Community Infrastructure and would like to join the team. You can also join the upstream #sepia Slack channel to participate in these discussions using this link: https://join.slack.com/t/ceph-storage/shared_invite/zt-1n1eh6po5-PF9sokUSooOf1ZkVdqrPUQ
Unfortunately, these events have slowed down our upstream development and releases. We are currently working on publishing the next Pacific point release. The development freeze and release deadline for the Reef release will likely be pushed out, with more discussion to follow in the Ceph Leadership Team meetings.
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx