Re: Incomplete PGs. Ceph Consultant Wanted

Pablo,

Since some PGs are empty even though all OSDs are enabled, I'm not at all
optimistic about the outcome.

Was the command "ceph osd force-create-pg" run while OSDs were missing?
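
To check which OSDs an incomplete PG is still waiting for, the output of "ceph pg <pgid> query" can be inspected. Below is a minimal Python sketch, assuming the query output has been saved as JSON. The function name and the sample data are hypothetical and used only for illustration; the fields "recovery_state", "down_osds_we_would_probe" and "peering_blocked_by" are the ones shown in the Ceph PG troubleshooting documentation, and their exact layout may differ between releases.

```python
import json


def blocked_peering_osds(pg_query):
    """Collect the OSD ids that a PG reports as blocking peering.

    pg_query is the parsed JSON from `ceph pg <pgid> query`.
    Returns (down_osds_we_would_probe, peering_blocked_by), both sorted.
    """
    would_probe = set()
    blocked_by = set()
    for state in pg_query.get("recovery_state", []):
        # Down OSDs the PG would like to probe for a newer copy.
        would_probe.update(state.get("down_osds_we_would_probe", []))
        # OSDs explicitly listed as blocking peering.
        for entry in state.get("peering_blocked_by", []):
            blocked_by.add(entry["osd"])
    return sorted(would_probe), sorted(blocked_by)


# Hypothetical sample, trimmed to the fields used above (not real cluster output).
sample = json.loads("""
{
  "state": "incomplete",
  "recovery_state": [
    {"name": "Started/Primary/Peering/GetInfo"},
    {"name": "Started/Primary/Peering",
     "probing_osds": [0, 1],
     "down_osds_we_would_probe": [1],
     "peering_blocked_by": [{"osd": 1, "current_lost_at": 0}]}
  ]
}
""")

probe, blocked = blocked_peering_osds(sample)
print("down OSDs the PG would probe:", probe)
print("OSDs blocking peering:", blocked)
```

On a real cluster the input would come from something like "ceph pg <pgid> query > pg.json", loaded with json.load(). If both lists come back empty while the PG is still incomplete, no reachable OSD holds a newer copy, which is the case I am worried about here.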


On Mon, Jun 17, 2024 at 17:26, cellosofia1@xxxxxxxxx <cellosofia1@xxxxxxxxx>
wrote:

> Hi everyone,
>
> Thanks for your kind responses
>
> I know the following is not the best scenario, but sadly I didn't get the
> chance to install this cluster myself.
>
> More information about the problem:
>
> * We use replicated pools
> * Replica 2, min replicas 1.
> * Ceph version 17.2.0 (43e2e60a7559d3f46c9d53f1ca875fd499a1e35e) quincy
> (stable)
> * Virtual Machines setup, 2 MGR Nodes, 2 OSD Nodes, 4 VMs in total.
> * 27 OSDs right now
> * Rook environment: rook: v1.9.5
> * Kubernetes Server Version: v1.24.1
>
> I've attached a .txt file with the output of some diagnostic commands for
> reference.
>
> What do you think?
>
> Regards
> Pablo
>
> On Mon, Jun 17, 2024 at 11:01 AM Matthias Grandl <matthias.grandl@xxxxxxxx>
> wrote:
>
>> Ah, scratch that: my first paragraph about replicated pools is actually
>> incorrect. If it’s a replicated pool and it shows incomplete, it means the
>> most recent copy of the PG is missing. So ideally you would recover the PG
>> from the dead OSDs, if that is at all possible.
>>
>> Matthias Grandl
>> Head Storage Engineer
>> matthias.grandl@xxxxxxxx
>>
>> > On 17. Jun 2024, at 16:56, Matthias Grandl <matthias.grandl@xxxxxxxx> wrote:
>> >
>> > Hi Pablo,
>> >
>> > It depends. If it’s a replicated setup, it might be as easy as marking
>> > dead OSDs as lost to get the PGs to recover. In that case it basically
>> > just means that you are below the pool's min_size.
>> >
>> > If it is an EC setup, it might be quite a bit more painful, depending
>> > on what happened to the dead OSDs and whether they are at all recoverable.
>> >
>> >
>> > Matthias Grandl
>> > Head Storage Engineer
>> > matthias.grandl@xxxxxxxx
>> >
>> >> On 17. Jun 2024, at 16:46, David C. <david.casier@xxxxxxxx> wrote:
>> >>
>> >> Hi Pablo,
>> >>
>> >> Could you tell us a little more about how that happened?
>> >>
>> >> Do you have a min_size >= 2 (or the EC equivalent)?
>> >> ________________________________________________________
>> >>
>> >> Regards,
>> >>
>> >> *David CASIER*
>> >>
>> >> ________________________________________________________
>> >>
>> >>
>> >>
>> >> On Mon, Jun 17, 2024 at 16:26, cellosofia1@xxxxxxxxx
>> >> <cellosofia1@xxxxxxxxx> wrote:
>> >>
>> >>> Hi community!
>> >>>
>> >>> Recently we had a major outage in production and after running the
>> >>> automated ceph recovery, some PGs remain in "incomplete" state, and IO
>> >>> operations are blocked.
>> >>>
>> >>> Searching the documentation, forums, and this mailing list archive, I
>> >>> haven't yet found whether this means the data is recoverable. We don't
>> >>> have any "unknown" objects or PGs, so I believe this is an intermediate
>> >>> stage where we have to tell Ceph which version of the objects to
>> >>> recover from.
>> >>>
>> >>> We are willing to work with a Ceph Consultant Specialist, because the
>> >>> data at stake is very critical, so if you're interested please let me
>> >>> know off-list to discuss the details.
>> >>>
>> >>> Thanks in advance
>> >>>
>> >>> Best Regards
>> >>> Pablo
>> >>> _______________________________________________
>> >>> ceph-users mailing list -- ceph-users@xxxxxxx
>> >>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>> >>>
>>
>



