Sage,
The current workaround is bad because if an object has many clones
we have to ignore osd_scrub_chunk_max and find a chunk that includes
the head object for those clones. The analysis code then processes that
chunk in reverse order. We think this behavior is the long-standing
cause of performance issues during deep scrub in RBD environments with
lots of clones. An object with N clones forces a chunk of at least
N+1 objects no matter what the value of osd_scrub_chunk_max is.
I have modified the analysis code to handle smaller arbitrary chunks
(state is saved between chunks), but that can only work if the objects
arrive in reverse order, since it still requires the head objects first.
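
To illustrate the chunk growth described above, here is a minimal sketch,
not the actual OSD code. Entry, is_head and chunk_end() are hypothetical
names, and the listing is assumed to arrive with clones before their head:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified listing entry: clones of an object are assumed
// to be listed before that object's head.
struct Entry {
    std::string name;
    bool is_head;
};

// Returns one past the last index of the chunk that begins at `start`.
// The chunk first takes up to `chunk_max` entries, then keeps growing
// until its last entry is a head (or the listing ends), so that a head
// is never separated from its clones.
std::size_t chunk_end(const std::vector<Entry>& listing,
                      std::size_t start,
                      std::size_t chunk_max) {
    std::size_t end = std::min(start + chunk_max, listing.size());
    while (end > start && end < listing.size() && !listing[end - 1].is_head) {
        ++end;  // chunk_max is exceeded here when clones are still pending
    }
    return end;
}
```

With five clones of one object followed by its head, a chunk_max of 2
still yields a chunk of six entries, which is the N+1 lower bound.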
David
On 9/8/16 3:05 PM, Sage Weil wrote:
On Thu, 8 Sep 2016, David Zafman wrote:
Sage,
Does any code depend on collection_list returning snapshots BEFORE
head/snapdir? I'm trying to reduce scrub's overhead per chunk of
osd_scrub_chunk_max objects, but for scrub to do the snapshot consistency
analysis it needs the
head objects first. Can we add a collection_list() that returns the objects
in completely reverse order? Or can it be changed to return head/snapdir
objects before the snapshots? The current code has to ignore
osd_scrub_chunk_max in order to find a natural boundary so that the scrub code
can go in reverse order for that segment.
collection_list has to return objects in ghobject_t sort order, so it's
really bool operator<(const ghobject_t& l, const ghobject_t& r)'s fault
that snaps come first. I don't think we can make it go backwards
efficiently given how rocksdb etc. work.
It might be possible to change the ghobject_t sort order, but I
suspect it'll require a clusterwide osdmap flag again, similar to the
sortbitwise thing we did earlier. Blech.
How bad is the current workaround?
sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html