Re: Emergency, I lost 4 monitors but all OSD disks are safe

Hi,

I've just checked with the team and the situation is much more serious than
it seems: the lost disks contained both the MON and OSD databases (5 servers
down out of 8, replica 3).

It seems the team fell victim to a bad batch of Samsung 980 Pros (I'm not a
big fan of this "Pro" range, but that's not the point), which have never come
back up since the incident.

Someone please correct me, but as far as I'm concerned, the cluster is lost.
________________________________________________________

Regards,

*David CASIER*




*Direct line: +33(0) 9 72 61 98 29*
________________________________________________________



On Thu, Nov 2, 2023 at 15:49, Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote:

> This admittedly is the case throughout the docs.
>
> > On Nov 2, 2023, at 07:27, Joachim Kraftmayer - ceph ambassador <
> joachim.kraftmayer@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > Another short note regarding the documentation: the paths assume a
> > package installation.
> >
> > The paths for a container installation look a bit different, e.g.:
> > /var/lib/ceph/<fsid>/osd.y/
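> >
> > As a rough illustration (the fsid and OSD id here are placeholders, not
> > values from this cluster), the same OSD directory would be:
> >
> >   /var/lib/ceph/<fsid>/osd.<id>/   # container (cephadm) install
> >   /var/lib/ceph/osd/ceph-<id>/     # package install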
> >
> > Joachim
> >
> > ___________________________________
> > ceph ambassador DACH
> > ceph consultant since 2012
> >
> > Clyso GmbH - Premier Ceph Foundation Member
> >
> > https://www.clyso.com/
> >
> > On 02.11.23 at 12:02, Robert Sander wrote:
> >> Hi,
> >>
> >> On 11/2/23 11:28, Mohamed LAMDAOUAR wrote:
> >>
> >>> I have 7 machines in the Ceph cluster; the Ceph service runs in a Docker
> >>> container.
> >>> Each machine has 4 data HDDs (available) and 2 NVMe SSDs (bricked).
> >>> During a reboot, the SSDs bricked on 4 machines. The data are still
> >>> available on the HDDs, but the NVMe drives are bricked and the system is
> >>> not available. Is it possible to recover the data of the cluster? (The
> >>> data disks are all available.)
> >>
> >> You can try to recover the MON DB from the OSDs, as they keep a copy of it:
> >>
> >> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures
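> >>
> >> A condensed, hedged sketch of what that page describes (the mon-store
> >> path and keyring path below are placeholders to adapt, and the full
> >> procedure, including carrying the store across all OSD hosts, is in the
> >> link): pull the cluster maps out of every surviving OSD and rebuild a
> >> monitor store from them.
> >>
> >>   ms=/tmp/mon-store
> >>   mkdir -p "$ms"
> >>   # with the OSDs stopped, accumulate their copies of the maps into $ms
> >>   for osd in /var/lib/ceph/osd/ceph-*; do
> >>       ceph-objectstore-tool --data-path "$osd" --no-mon-config \
> >>           --op update-mon-db --mon-store-path "$ms"
> >>   done
> >>   # repeat on each OSD host, copying $ms along (e.g. with rsync), then
> >>   # rebuild the mon store (the keyring is only needed with cephx enabled)
> >>   ceph-monstore-tool "$ms" rebuild -- --keyring /path/to/admin.keyring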
> >>
> >> Regards
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



