Long peering - throttle at FileStore::queue_transactions

Hi Cephers,
Happy New Year! I have a question regarding the long PG peering issue.

Over the last several days I have been looking into the *long peering*
problem we see when starting an OSD / OSD host. What I observed was that
the two peering worker threads were throttled (stuck) when trying to
queue new transactions (writing the pg log), so the peering process was
dramatically slowed down.
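
For context, the throttle behaves roughly like the sketch below. This is
a simplified, hypothetical model under my assumptions, not the actual
FileStore code; the limits are in the spirit of the filestore queue
max ops/bytes settings. The point is that a submitter blocks on a
condition variable once outstanding ops/bytes hit the limit, which is
exactly where the peering workers get stuck:

    // Simplified, hypothetical model of a blocking ops/bytes throttle on
    // the transaction queue -- NOT the actual FileStore implementation.
    #include <condition_variable>
    #include <cstdint>
    #include <mutex>

    class QueueThrottle {
      std::mutex m;
      std::condition_variable cv;
      uint64_t cur_ops = 0, cur_bytes = 0;
      const uint64_t max_ops, max_bytes;

    public:
      QueueThrottle(uint64_t ops, uint64_t bytes)
        : max_ops(ops), max_bytes(bytes) {}

      // Called by the submitting thread (e.g. a peering worker queueing a
      // pg log write); blocks until the transaction fits under both limits.
      void get(uint64_t bytes) {
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [&] {
          return cur_ops < max_ops && cur_bytes + bytes <= max_bytes;
        });
        ++cur_ops;
        cur_bytes += bytes;
      }

      // Called once the backend has applied/committed the transaction.
      void put(uint64_t bytes) {
        std::lock_guard<std::mutex> l(m);
        --cur_ops;
        cur_bytes -= bytes;
        cv.notify_all();
      }
    };

With fairly small limits, the pg log and map writes from many PGs fill
the queue faster than the disk drains it, so both peering workers end up
sleeping in get() instead of peering.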

The first question that came to mind was: what are the transactions in
the queue? The major ones, as I saw, included:

- The osd_map and incremental osd_map transactions. This happens if the
OSD has been down for a while (in a large cluster), or when the cluster
was upgraded, so that the osd_map epoch the down OSD holds is far behind
the latest osd_map epoch. During boot, the OSD needs to persist all of
those osd_maps, generating lots of filestore transactions (linear in the
epoch gap).
> As the PGs are not involved in most of those epochs, could we take and persist only those osd_maps which matter to the PGs on this OSD?

- Lots of deletion transactions. As a PG boots, it needs to merge the
PG log from its peers, and for each deletion entry in the log it queues
the deletion transaction immediately.
> Could we delay queueing those transactions until all PGs on the host have peered? A rough sketch of this idea is below.
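
To make that concrete, something along the lines of the sketch below is
what I have in mind. The names (DeletionDeferral, Txn, submit) are made
up for illustration and this is not existing OSD code: deletion
transactions get buffered while peering is in progress and are only
queued to the object store once every PG on the host has peered.

    // Hypothetical sketch only: defer deletion transactions until all PGs
    // on the host have peered. Txn and the submit callback are placeholders.
    #include <functional>
    #include <mutex>
    #include <utility>
    #include <vector>

    struct Txn { /* stand-in for an ObjectStore-style transaction */ };

    class DeletionDeferral {
      std::mutex m;
      bool all_pgs_peered = false;
      std::vector<Txn> pending;
      std::function<void(Txn&&)> submit;  // the real queue_transactions path

    public:
      explicit DeletionDeferral(std::function<void(Txn&&)> s)
        : submit(std::move(s)) {}

      // Called from PG log merge when a deletion entry is seen.
      void queue_deletion(Txn&& t) {
        std::lock_guard<std::mutex> l(m);
        if (all_pgs_peered)
          submit(std::move(t));            // normal path once peering is done
        else
          pending.push_back(std::move(t)); // defer so peering writes stay fast
      }

      // Called once every PG on the host has finished peering.
      void on_all_pgs_peered() {
        std::vector<Txn> to_flush;
        {
          std::lock_guard<std::mutex> l(m);
          all_pgs_peered = true;
          to_flush.swap(pending);
        }
        for (auto& t : to_flush)
          submit(std::move(t));            // drain the deferred deletions
      }
    };

The obvious cost is that the deferred deletions sit in memory and the
on-disk objects stick around until the flush, and ordering against the
pg log writes would need care, but it keeps the throttle budget for the
writes peering actually needs.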

Thanks,
Guang


