Thank you for your knowledge. I have a question: which pool is affected when a PG is down, and how can I show it? When a PG is down, is only one pool affected, or can multiple pools be affected?

On Tue, 26 Dec 2023 at 16:15, Janne Johansson <icepic.dz@xxxxxxxxx> wrote:

> On Tue, 26 Dec 2023 at 08:45, Phong Tran Thanh <tranphong079@xxxxxxxxx> wrote:
> >
> > Hi community,
> >
> > I am running a Ceph cluster with RBD block storage on 6 nodes, using
> > erasure code 4+2 with a pool min_size of 4.
> >
> > When three OSDs are down and a PG is in the down state, some pools
> > can't write data. Suppose the three OSDs can't be started and the PG
> > is stuck in the down state: how can I delete or recreate the PG to
> > replace the down one, or otherwise allow the pool to read/write data
> > again?
>
> Depending on how the data is laid out in this pool, you might lose
> more or less all data from it.
>
> RBD images get split into pieces of 2 or 4M sizes, so that those
> pieces end up on different PGs, which in turn makes them end up on
> different OSDs. This allows for load balancing over the whole
> cluster, but it also means that if you lose some PGs on a 40G RBD
> image (made up of 10k pieces), chances are very high that the lost PG
> did contain one or more of those 10k pieces.
>
> So lost PGs would probably mean that every RBD image of decent size
> will have holes in it, and how this affects all the instances that
> mount the images will be hard to tell. If at all possible, try to use
> the offline OSD tools to get this PG out of one of the bad OSDs.
>
> https://hawkvelt.id.au/post/2022-4-5-ceph-pg-export-import/ might
> help, to see how to run the export + import commands.
>
> If you can get it out, it can be injected (imported) into any other
> running OSD, and then replicas will be recreated and moved to where
> they should be.
>
> If you have disks to spare, make sure to make full copies of the
> broken OSDs and work on the copies instead, to maximize the chances
> of restoring your data.
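Janne's two claims above can be checked with a little arithmetic. This is a plain-Python sketch (the helper is hypothetical, not Ceph code; the `pg_num = 128` value is an assumed example): in a 4+2 erasure-coded pool, any 4 of the 6 shards can reconstruct the data, so losing 3 shard-holding OSDs makes the PG unrecoverable, and a large RBD image is almost certain to have at least one object in the lost PG.

```python
# Sketch of the shard math for the 4+2 EC profile discussed in this thread.
# pg_state() is a hypothetical helper; Ceph itself tracks this per PG.

def pg_state(k: int, m: int, min_size: int, osds_down: int) -> str:
    """Classify a PG whose acting set lost `osds_down` of its k+m shards."""
    surviving = k + m - osds_down
    if surviving < k:
        return "down/unrecoverable"   # fewer shards than data chunks: data is lost
    if surviving < min_size:
        return "inactive"             # recoverable, but below min_size: no I/O served
    return "active"                   # serves I/O; Ceph backfills the lost shards

# The cluster in this thread: EC 4+2, min_size=4, three OSDs of one PG down.
print(pg_state(k=4, m=2, min_size=4, osds_down=2))  # active
print(pg_state(k=4, m=2, min_size=4, osds_down=3))  # down/unrecoverable

# Janne's striping point: a 40 GiB RBD image split into 4 MiB objects.
objects = 40 * 1024 // 4          # 10240 objects, spread across the pool's PGs
pg_num = 128                      # assumed pg_num, for illustration only
p_untouched = (1 - 1 / pg_num) ** objects
print(f"chance the image avoids the lost PG entirely: {p_untouched:.1e}")
```

With 10k objects scattered over 128 PGs, the probability that an image misses the lost PG is astronomically small, which is why Janne expects holes in essentially every image of decent size.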
> If you are very sure that these three OSDs are never coming back, and
> have marked the OSDs as lost, then I guess
>
>   ceph pg force_create_pg <pgid>
>
> would be the next step, to have the cluster create empty PGs to
> replace the lost ones. But I would consider this only after trying
> all the possible options for repairing at least one of the OSDs that
> held the PGs that are missing.
>
> --
> May the most significant bit of your life be positive.

--
Best regards,
----------------------------------------------------------------------------
*Tran Thanh Phong*
Email: tranphong079@xxxxxxxxx
Skype: tranphong079

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
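On the question at the top of the thread: a PG belongs to exactly one pool, because the part of a PG id before the dot is the pool id (e.g. PG `2.1a` is in pool 2). So a single down PG affects only one pool, though several pools can each have their own down PGs if they shared the failed OSDs. A sketch of mapping down PGs to pool names; the PG ids and pool names below are hypothetical, and in practice you would parse the output of `ceph health detail` and `ceph osd lspools`:

```python
# Sketch: group down PG ids by the pool they belong to.
# Sample data is made up; real values come from `ceph health detail`
# (down PGs) and `ceph osd lspools` (pool id -> name).

down_pgs = ["2.1a", "2.3f", "5.07"]            # as reported by `ceph health detail`
pools = {2: "rbd-metadata", 5: "rbd-ec-data"}  # as listed by `ceph osd lspools`

affected = {}
for pgid in down_pgs:
    pool_id = int(pgid.split(".")[0])  # PG ids are "<pool_id>.<pg_suffix>"
    affected.setdefault(pools[pool_id], []).append(pgid)

for pool, pgs in affected.items():
    print(f"{pool}: {pgs}")
```

`ceph pg <pgid> query` on a specific PG shows the same pool membership along with the acting set, which tells you which OSDs must recover for the PG to come back.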