Question about recovery vs backfill priorities

Hi,


We're currently being hit by an issue with cluster recovery. The cluster was significantly expanded (~50% new OSDs), which started recovery. During recovery there was a hardware failure, and we ended up with some PGs in the peered state with size < min_size (inactive). Those peered PGs are waiting for backfill, but the cluster still prefers to recover the recovery_wait PGs first. In our case it could take a few hours before all recovery is finished (we're pushing recovery to its limits to keep the downtime as short as possible). The peered PGs are blocked during this time, and the whole cluster struggles to operate at a reasonable level.

We're running hammer 0.94.6 there, and from the code it looks like recovery will always have higher priority than backfill (jewel seems similar). The documentation only says that log-based recovery must finish before backfills. Is this requirement needed for data consistency, or for some other reason?

Ideally we'd like this order: undersized inactive (size < min_size) recovery_wait => undersized inactive (size < min_size) wait_backfill => degraded recovery_wait => degraded wait_backfill => remapped wait_backfill. Changing the priority calculation doesn't seem that hard, but would it result in inconsistent data?
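
To make the proposed ordering concrete, here is a rough Python sketch. It is only an illustration of the ordering, not how the OSD code computes priorities: the PG class, its fields, and the schedule() helper are all hypothetical names made up for this example.

```python
from dataclasses import dataclass

# Hypothetical priority classes, highest priority first, mirroring the
# order proposed above. This is not Ceph's actual priority calculation.
ORDER = [
    ("inactive", "recovery_wait"),
    ("inactive", "wait_backfill"),
    ("degraded", "recovery_wait"),
    ("degraded", "wait_backfill"),
    ("remapped", "wait_backfill"),
]
RANK = {key: i for i, key in enumerate(ORDER)}


@dataclass
class PG:
    """Illustrative stand-in for a placement group (all fields assumed)."""
    pgid: str
    acting_size: int   # number of replicas currently available
    min_size: int      # pool min_size
    degraded: bool     # True if some replicas are missing
    wait_state: str    # "recovery_wait" or "wait_backfill"


def health_class(pg: PG) -> str:
    # size < min_size means the PG cannot serve I/O (peered but inactive).
    if pg.acting_size < pg.min_size:
        return "inactive"
    if pg.degraded:
        return "degraded"
    return "remapped"


def rank(pg: PG) -> int:
    # Combinations outside the proposed list sort last.
    return RANK.get((health_class(pg), pg.wait_state), len(ORDER))


def schedule(pgs):
    """Return PGs in the proposed service order (stable for equal ranks)."""
    return sorted(pgs, key=rank)


if __name__ == "__main__":
    pgs = [
        PG("1.a", 3, 2, False, "wait_backfill"),
        PG("1.b", 2, 2, True, "recovery_wait"),
        PG("1.c", 1, 2, True, "wait_backfill"),
    ]
    for pg in schedule(pgs):
        print(pg.pgid, health_class(pg), pg.wait_state)
```

The point is that the availability class (inactive, degraded, remapped) would be the primary sort key and recovery vs. backfill only the secondary one, which is the reverse of what we see in hammer.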

Bartek



