Re: Scrub and collection_list() order

Sage,


The current workaround is bad: if an object has many clones, we have to ignore osd_scrub_chunk_max and extend the chunk until it includes the head object for those clones. The analysis code then processes that chunk in reverse order. We think this behavior is the long-standing cause of performance issues during deep scrub in RBD environments with lots of clones. An object with N clones forces a chunk of at least N+1 objects, no matter what the value of osd_scrub_chunk_max is.
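
To make the N+1 effect concrete, here is a minimal Python sketch of the boundary search described above. It is not the OSD code: the `(name, snap)` tuples, the `HEAD` marker, and the helpers `make_listing` and `next_chunk` are hypothetical names used only for illustration, and the listing is assumed to return each object's clones before its head.

```python
HEAD = "head"  # hypothetical marker for a head object; not a Ceph constant


def make_listing(clone_counts):
    """Build a listing in which each object's clones precede its head.

    clone_counts maps an object name to its number of clones.
    """
    listing = []
    for name in sorted(clone_counts):
        for snap in range(1, clone_counts[name] + 1):
            listing.append((name, snap))
        listing.append((name, HEAD))
    return listing


def next_chunk(listing, start, chunk_max):
    """Take chunk_max entries, then keep extending until the chunk ends on a head.

    The extension step is what overrides chunk_max: a chunk that starts inside
    a run of clones cannot stop before it reaches the head of that run.
    """
    end = min(start + chunk_max, len(listing))
    while end < len(listing) and listing[end - 1][1] != HEAD:
        end += 1
    return listing[start:end]


listing = make_listing({"a": 20, "b": 1})
first = next_chunk(listing, 0, 5)    # asked for 5 entries
second = next_chunk(listing, len(first), 5)
```

With 20 clones on object "a" and a requested chunk size of 5, the first chunk in this sketch grows to all 21 entries for "a" (20 clones plus the head).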

I have modified the analysis code to handle smaller, arbitrary chunks (state is saved between chunks), but that can only work if the objects arrive in reverse order, since the analysis still requires the head objects first.
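
As a rough sketch of what "state is saved between chunks" could look like, the Python below keeps the current head and its outstanding clones in an object that survives across calls. It is a simplified assumption, not the modified scrub code: the `SnapAnalyzer` class, the `(name, snap, clones)` entry format, and the error strings are all hypothetical, and each head is assumed to carry the list of clone ids it expects.

```python
HEAD = "head"  # hypothetical marker for a head object; not a Ceph constant


class SnapAnalyzer:
    """Checks clones against their head, one arbitrary-sized chunk at a time.

    Entries must arrive head first, followed by that head's clones.
    """

    def __init__(self):
        self.head = None       # name of the head currently being checked
        self.expected = set()  # clone ids the head lists that we have not seen yet
        self.errors = []

    def feed(self, chunk):
        """Process one chunk; state carries over to the next call."""
        for name, snap, clones in chunk:
            if snap == HEAD:
                self._finish_head()
                self.head = name
                self.expected = set(clones)
            elif name == self.head and snap in self.expected:
                self.expected.discard(snap)
            else:
                self.errors.append(f"unexpected clone {name}:{snap}")

    def _finish_head(self):
        """Report clones the previous head listed but that never arrived."""
        for snap in sorted(self.expected):
            self.errors.append(f"missing clone {self.head}:{snap}")
        self.expected = set()

    def finish(self):
        """Call once after the last chunk."""
        self._finish_head()
        self.head = None
        return self.errors


# Reverse-order stream: each head precedes its clones. Clone b:1 is absent.
stream = [("b", HEAD, [2, 1]), ("b", 2, []), ("a", HEAD, [1]), ("a", 1, [])]

whole = SnapAnalyzer()
whole.feed(stream)

split = SnapAnalyzer()
for i in range(len(stream)):
    split.feed(stream[i:i + 1])  # chunks of one entry each
```

In this sketch, feeding the stream in one piece or one entry at a time gives the same result, because nothing depends on where a chunk ends. If the clones arrived before their head, every one of them would be reported as unexpected, which is why the head-first order matters.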

David


On 9/8/16 3:05 PM, Sage Weil wrote:
On Thu, 8 Sep 2016, David Zafman wrote:
Sage,

Does any code depend on collection_list returning snapshots BEFORE
head/snapdir?  I'm trying to improve scrub's overhead per osd_scrub_chunk_max
of objects, but for scrub to do the snapshot consistency analysis it needs the
head objects first.  Can we add a collection_list() that returns the objects
in completely reverse order?  Or can it be changed to return head/snapdir
objects before the snapshots?  The current code has to ignore
osd_scrub_chunk_max in order to find a natural boundary so that the scrub code
can go in reverse order for that segment.

collection_list has to return objects in ghobject_t sort order, so it's
really bool operator<(const ghobject_t& l, const ghobject_t& r)'s fault
that snaps come first.  I don't think we can make it go backwards
efficiently given how rocksdb etc works.
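
A small Python illustration of why snaps come first under such a comparison. It assumes a heavily simplified key of (name, snap) only, whereas the real ghobject_t comparison involves more fields, and it assumes the head and snapdir are identified by very large snap ids (modeled here as `NOSNAP` and `SNAPDIR`, which as far as I know mirror CEPH_NOSNAP and CEPH_SNAPDIR). `list_in_sort_order` is a hypothetical helper.

```python
# Assumed sentinel values: head and snapdir use the two largest 64-bit ids.
NOSNAP = 2**64 - 2   # snap id standing for the head object
SNAPDIR = 2**64 - 1  # snap id standing for the snapdir object


def list_in_sort_order(objects):
    """Sort (name, snap) pairs by name, then by ascending snap id."""
    return sorted(objects, key=lambda obj: (obj[0], obj[1]))


objs = [
    ("rbd_data.1", NOSNAP),
    ("rbd_data.1", 4),
    ("rbd_data.0", NOSNAP),
    ("rbd_data.1", 9),
]
ordered = list_in_sort_order(objs)
```

Because real clone ids are small numbers and the head's id is near the maximum, an ascending sort on snap always places the clones of an object before its head in this model.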

It might be possible to change the ghobject_t sort order, but I
suspect it'll require a clusterwide osdmap flag again, similar to the
sortbitwise thing we did earlier. Blech.

How bad is the current workaround?

sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
