Hi Caspar,

On Thu, Jan 10, 2019 at 1:31 PM Caspar Smit <casparsmit@xxxxxxxxxxx> wrote:
>
> Hi all,
>
> I wanted to test Dan's upmap-remapped script for adding new OSDs to a
> cluster. (Then letting the balancer gradually move pgs to the new OSD
> afterwards)

Cool. Insert "no guarantees or warranties" comment here.

And btw, I have noticed that the method doesn't always work in this
use case: sometimes the upmap balancer will not remove pg-upmap-items
entries if there are only a few severely underloaded OSDs.
calc_pg_upmaps might need some code changes to fully work in this
scenario.

> I've created a fresh (virtual) 12.2.10 4-node cluster with very small
> disks (16GB each), 2 OSDs per node.
> Put ~20GB of data on the cluster.
>
> Now when I set the norebalance flag and add a new OSD, 99% of pgs end
> up recovering or in recovery_wait. Only a few will be backfill_wait.
>
> The recovery starts as expected (norebalance only stops backfilling
> pgs) and finishes eventually.
>
> The upmap-remapped script only works with pgs which need to be
> backfilled. The script looks for pgs which are remapped but not
> degraded, then uses pg-upmap-items to move pgs from the state
> active+remapped to active+clean.
> It does work for the handful of pgs in backfill_wait status, but my
> question is:
>
> When is ceph doing recovery instead of backfilling? Only when the
> cluster is rather empty, or what are the criteria? Are the OSDs too
> small?

Roughly speaking: recovery is used when the pg log still holds all of
the changes made to a pg while it was degraded. Once that pg log
overflows, the osd has to backfill instead. In other words, recovery
replays a log of ops; but that log has a size limit, so backfilling is
the fallback, which scans all objects for changes.

Cheers,
Dan

> Kind regards,
> Caspar
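
P.S. for anyone following along, the core of the upmap trick Caspar
describes above boils down to something like the sketch below. This is
only a rough illustration of the idea, not the actual upmap-remapped
script, and the "no guarantees or warranties" comment applies doubly:
it just prints the pg-upmap-items commands so you can review them
before running anything.

#!/usr/bin/env python
# Rough sketch of the idea only -- NOT the actual upmap-remapped
# script, and definitely "no guarantees or warranties".  It just
# *prints* the pg-upmap-items commands; review before running any.
import json
import subprocess


def pg_stats():
    """Return the per-PG stats from 'ceph pg dump pgs'."""
    out = subprocess.check_output(
        ['ceph', 'pg', 'dump', 'pgs', '--format', 'json'])
    data = json.loads(out)
    # The JSON layout varies a bit between releases; cover the
    # shapes I know of (flat list vs. nested under 'pg_stats').
    if isinstance(data, dict):
        data = (data.get('pg_stats')
                or data.get('pg_map', {}).get('pg_stats', []))
    return data


for pg in pg_stats():
    state = pg['state']
    # Only touch PGs that would otherwise backfill: remapped but
    # not degraded (degraded PGs still need real recovery).
    if 'remapped' not in state or 'degraded' in state:
        continue
    up, acting = pg['up'], pg['acting']
    # Pair each OSD that CRUSH now wants (in up, but holding no
    # data yet) with one that actually has the data (in acting,
    # but no longer in up); upmapping those pairs makes up ==
    # acting, so the PG goes active+clean with no backfill.
    pairs = zip([o for o in up if o not in acting],
                [o for o in acting if o not in up])
    args = ' '.join(str(o) for pair in pairs for o in pair)
    if args:
        print('ceph osd pg-upmap-items %s %s' % (pg['pgid'], args))

As with the upmap balancer itself, this only works once the cluster
has "ceph osd set-require-min-compat-client luminous" set.

And on the recovery-vs-backfill question: the length of the pg log is
bounded by osd_min_pg_log_entries / osd_max_pg_log_entries, so those
(together with how many writes happen while a pg is degraded) decide
whether a peer can still catch up via log-based recovery or has to
backfill. If you want to see what your OSDs are using, something like
this (osd.0 is just an example id; run it on that OSD's host, since it
goes through the admin socket):

# Quick look at the pg log limits on one OSD.
import subprocess

for opt in ('osd_min_pg_log_entries', 'osd_max_pg_log_entries'):
    out = subprocess.check_output(
        ['ceph', 'daemon', 'osd.0', 'config', 'get', opt])
    print(out.decode().strip())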