Hi Sam & Sage,

In the context of http://tracker.ceph.com/issues/9566 I'm inclined to think the best solution would be for the AsyncReserver to choose a PG when a slot frees up, instead of just picking the next one in the list. It would always choose a PG that must move to/from the OSD for which there are more PGs waiting in the AsyncReserver than for any other OSD. The sort involved does not seem too expensive.

Calculating priorities before adding the PG to the AsyncReserver seems wrong because the state of the system will change significantly while the PG is waiting to be processed. For instance, the first PGs to be added would get a low priority, while later ones would get increasing priorities as they accumulate. If reservations are canceled because the OSD map changed again (maybe another OSD is decommissioned before recovery of the first one completes), you may end up with high priorities for PGs that are no longer associated with busy OSDs. That could backfire and create even more frequent long tails.

What do you think?

Cheers

--
Loïc Dachary, Artisan Logiciel Libre
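
P.S. To make the selection rule concrete, here is a minimal Python sketch. It is not the actual C++ AsyncReserver: `pick_next_pg` and the `waiting` mapping (PG id to the set of OSD ids it must move to/from) are hypothetical names chosen for illustration. The per-OSD counts are computed when a slot becomes free, not when the PG is queued, so they reflect the current state of the queue.

```python
from collections import Counter


def pick_next_pg(waiting):
    """Pick the waiting PG that touches the busiest OSD.

    `waiting` is a hypothetical dict: PG id -> set of OSD ids the PG
    must move to/from, in the order the PGs were queued.
    Returns a PG id, or None if nothing is waiting.
    """
    if not waiting:
        return None
    # Count, for each OSD, how many waiting PGs involve it.
    load = Counter(osd for osds in waiting.values() for osd in osds)
    # Score a PG by its busiest OSD. max() returns the first maximal
    # item, so ties fall back to queue (insertion) order.
    return max(
        waiting,
        key=lambda pg: max((load[osd] for osd in waiting[pg]), default=0),
    )


# Called each time a reservation slot becomes free.
waiting = {"1.a": {0, 1}, "1.b": {2, 3}, "1.c": {3, 4}, "1.d": {3, 5}}
next_pg = pick_next_pg(waiting)  # a PG involving OSD 3, the busiest one
del waiting[next_pg]
```

If reservations are canceled after an OSD map change, removing the affected entries from `waiting` is enough: the next call recomputes the counts, so no stale priority survives.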