Re: No recovery when "norebalance" flag set

Quoting Dan van der Ster (dan@xxxxxxxxxxxxxx):
> Haven't seen that exact issue.
> 
> One thing to note though is that if osd_max_backfills is set to 1,
> then it can happen that PGs get into backfill state, taking that
> single reservation on a given OSD, and therefore the recovery_wait PGs
> can't get a slot.
> I suppose that backfill prioritization is supposed to prevent this,
> but in my experience luminous v12.2.8 doesn't always get it right.

That's also our experience. Even when the degraded PGs in backfill /
recovery state are given a higher priority (forced), normal backfilling
still takes place ahead of them.
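
(For reference, forcing the priority per PG can be done like this; the
pg IDs are placeholders:

  ceph pg force-recovery <pgid> [<pgid> ...]
  ceph pg force-backfill <pgid> [<pgid> ...]
)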

> So next time I'd try injecting osd_max_backfills = 2 or 3 to kickstart
> the recovering PGs.

Was still on "1" indeed. We tend to crank that (and max recovery) while
keeping an eye on max read and write apply latency. In our setup we can
do 16 backfills concurrently, and/or 2 recoveries / 4 backfills.
Recovery speeds reach ~4-5 GB/s; pushing beyond that tends to crash OSDs.
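
For the archives, injecting those settings at runtime would look roughly
like this (a sketch; the values are just the ones discussed above, not a
recommendation):

  ceph tell osd.* injectargs '--osd_max_backfills 2 --osd_recovery_max_active 2'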

We'll try your suggestion next time.

Thanks,

Stefan

-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com