Hi all,

We're deleting a large pool with lots of 4MB objects, and it's turning out to be rather disruptive: >50 loadavg on the OSD machines, OSDs flapping, a peering mess for the other pools.

In general, I'd like to be able to throttle this way back so that the deleted PG objects slowly disappear without affecting IO to other pools. And AFAICT there's no way to achieve that in current jewel or master.

Here is the timing of the RemoveWQ on one of these OSDs:

2016-06-29 18:28:06.388530 7fee8a2e5700 12 OSD::disk_tp worker wq OSD::RemoveWQ start processing 0x1 (1 active)
2016-06-29 18:28:07.954758 7fee8a2e5700 15 OSD::disk_tp worker wq OSD::RemoveWQ done processing 0x1 (0 active)
2016-06-29 18:28:07.954766 7fee8a2e5700 12 OSD::disk_tp worker wq OSD::RemoveWQ start processing 0x1 (1 active)
2016-06-29 18:28:10.599542 7fee8a2e5700 15 OSD::disk_tp worker wq OSD::RemoveWQ done processing 0x1 (0 active)

I'd like to add a configurable osd_pg_remove_sleep, probably at the top of OSD::RemoveWQ::_process. Would that have the desired effect?

Moreover, it would be nice to be able to decrease the number of objects removed per instance of the worker. It's currently limited to osd_target_transaction_size (shared by build_past_intervals_parallel, remove_dir, and handle_osd_map). Would it be safe to decrease that (to, say, 5), or would it be better to introduce something specific to remove_dir?

Thanks,
Dan
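P.S. To make the throttling idea concrete, here's a rough standalone sketch of the pattern I have in mind. This is *not* Ceph code: ThrottleConf, remove_some, process_pass and the option values are made up for illustration, and the real RemoveWQ::_process signature and config plumbing are different. The point is just (a) a configurable sleep at the top of each pass and (b) a remove-specific cap on objects per transaction, instead of the shared osd_target_transaction_size.

// Illustrative sketch only, not Ceph code. "pg_remove_sleep" and
// "remove_batch_size" stand in for the proposed osd_pg_remove_sleep and a
// remove_dir-specific transaction size.
#include <algorithm>
#include <chrono>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

struct ThrottleConf {
  double pg_remove_sleep = 0.1;   // seconds to sleep per _process pass
  size_t remove_batch_size = 30;  // objects deleted per transaction
};

// Stand-in for remove_dir(): delete up to remove_batch_size objects in one
// "transaction"; return true if the PG still has objects left.
bool remove_some(std::vector<std::string>& objects, const ThrottleConf& conf) {
  size_t n = std::min(conf.remove_batch_size, objects.size());
  objects.erase(objects.end() - n, objects.end());
  std::cout << "removed " << n << ", " << objects.size() << " left\n";
  return !objects.empty();
}

// Stand-in for RemoveWQ::_process(): sleep at the top of each pass, then
// remove one bounded batch; the loop stands in for the work queue requeueing
// the PG until it's empty.
void process_pass(std::vector<std::string>& objects, const ThrottleConf& conf) {
  do {
    if (conf.pg_remove_sleep > 0)
      std::this_thread::sleep_for(
          std::chrono::duration<double>(conf.pg_remove_sleep));
  } while (remove_some(objects, conf));
}

int main() {
  ThrottleConf conf{0.05, 5};                        // slow, small batches
  std::vector<std::string> objects(100, "4MB-obj");  // pretend PG contents
  process_pass(objects, conf);
}

With something like pg_remove_sleep=0.05 and remove_batch_size=5 the deletes would trickle out over a long period instead of hammering the disk threadpool, which is roughly the behaviour I'm after.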