Re: replace osd with Octopus

Hi,

> > When replacing an osd, there will be no PG remapping, and backfill
> > will restore the data on the new disk, right?
> 
> That depends on how you decide to go through the replacement process.
> Usually without your intervention (e.g. setting the appropriate OSD
> flags) the remapping will happen after an OSD goes down and out.

This has been unclear to me. Will the OSD be marked out and its PGs
remapped during replacement? Or does it depend on the process?

When an OSD is marked out, remapping happens and the data migration takes
some time. Is the cluster in a degraded state during that period?

My understanding is that remapping only happens when the OSD is marked
out. The replacement process would keep the OSD in the whole time,
assuming it is replaced with the same disk model.
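
For example, what I have in mind is roughly the following (just a sketch;
OSD id 12 and /dev/sdX are placeholders, and I'm assuming a
non-containerized deployment using ceph-volume):

    ceph osd set noout                            # keep the down OSD from being marked out
    systemctl stop ceph-osd@12                    # stop the OSD being replaced
    ceph osd destroy 12 --yes-i-really-mean-it    # keep the OSD id and CRUSH weight
    # ... physically swap the disk ...
    ceph-volume lvm create --osd-id 12 --data /dev/sdX   # recreate with the same id
    ceph osd unset noout                          # backfill restores the data

That way the OSD never goes out, so only backfill onto the new disk should
happen, with no remapping.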

Replacing with a different size could be more complicated, because the
CRUSH weight has to be adjusted for the size change and PGs may be
rebalanced.
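
For instance (just a guess at what would be needed; osd.12 and the 8 TB
size are made-up examples; CRUSH weight conventionally matches the size
in TiB):

    ceph osd crush reweight osd.12 7.28    # new 8 TB disk ~= 7.28 TiB
    ceph osd df tree                       # check weights and data distribution afterwards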

> > In case of restoring a host with multiple OSDs, e.g. when a WAL/DB SSD
> > needs to be replaced, I see two options.
> > 1) Keep the cluster in a degraded state and rebuild all OSDs.
> > 2) Mark those OSDs out so PGs are rebalanced, rebuild the OSDs, and
> >    bring them back in to rebalance PGs again.
> 
> These are basically your options, yes.
> 
> > The key here is how much time backfilling and rebalancing will take.
> > The intention is to not keep the cluster in a degraded state for too long.
> > I assume they are similar, because either way the same amount of data
> > is copied?
> > If that's true, then option #2 is pointless.
> > Could anyone share such experiences, like how long it takes to recover
> > how much data on what kind of networking/computing environment?
> 
> No, option 2 is not pointless, it helps you prevent a degraded state.
> Having a small cluster or CRUSH rules that only allow a few failed OSDs,
> it could be dangerous to take out an entire node, risking another failure
> and potential data loss. It highly depends on your specific setup and
> whether you're willing to take the risk during the rebuild of a node.
> The recovery/backfill speed also depends on the size of the OSDs, the
> object sizes, the amount of data, etc. You would probably need to search
> the mailing list for examples from someone sharing their experience; I
> haven't captured such statistics.

My conclusion was based on two assumptions; correct me if they are wrong.
1) The cluster is degraded during remapping.
2) There is no remapping while an OSD is being recovered.

For option #1, there is no remapping, just a degraded state while the
OSDs are being rebuilt.
For option #2, remapping happens twice: once to remap PGs from the old
OSDs to other OSDs, and again when the new OSDs are back in place.
It seems the degraded state lasts about twice as long with option #2 as
with #1. Is that right?
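
For what it's worth, I was planning to watch the degraded and misplaced
object counts in each phase with something like:

    ceph -s        # overall health, degraded/misplaced object percentages
    ceph pg stat   # one-line summary of PG states

so I can compare how long each option actually keeps PGs degraded.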


Thanks!
Tony

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


