Hi André,

On Sat, Jan 14, 2023 at 12:14 AM André de Freitas Smaira
<afsmaira@xxxxxxxxx> wrote:
>
> Hello!
>
> Yesterday we found some errors in our cephadm disks, which is making it
> impossible to access our HPC Cluster:
>
> # ceph health detail
> HEALTH_WARN 3 failed cephadm daemon(s); insufficient standby MDS daemons available
> [WRN] CEPHADM_FAILED_DAEMON: 3 failed cephadm daemon(s)
>     daemon mds.cephfs.s1.nvopyf on s1.ceph.infra.ufscar.br is in error state
>     daemon mds.cephfs.s2.qikxmw on s2.ceph.infra.ufscar.br is in error state
>     daemon mds.cftv.s2.anybzk on s2.ceph.infra.ufscar.br is in error state
> [WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
>     have 0; want 1 more

What's in the MDS logs?

>
> Googling, we found that we should remove the failed MDS daemons, but the
> data on these disks is relatively important. We would like to know whether
> we need to remove them or whether they can be fixed, and, if we do have to
> remove them, whether the data will be lost. Please tell me if you need more
> information.
>
> Thanks in advance,
> André de Freitas Smaira
> Federal University of São Carlos - UFSCar
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Cheers,
Venky

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
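
As a pointer for the "what's in the MDS logs" step, below is a minimal sketch of
two common ways to pull the logs for one of the failed daemons. The daemon and
host names are taken from the health output quoted above; <fsid> is a
placeholder for the cluster fsid (as reported by "ceph fsid") and has to be
filled in for your cluster.

# On s1.ceph.infra.ufscar.br, the host running the failed daemon,
# have cephadm show that daemon's log output:
cephadm logs --name mds.cephfs.s1.nvopyf

# Or read the systemd journal directly; cephadm names its units
# ceph-<fsid>@<daemon-name>.service:
journalctl -u ceph-<fsid>@mds.cephfs.s1.nvopyf.service -n 200 --no-pager

# From any node with an admin keyring, list the cephadm-managed daemons
# and their current state:
ceph orch ps

Both of the first two commands read the same journald log (cephadm logs is
essentially a journalctl wrapper), so either should show why the MDS daemons
went into the error state.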