Re: LRC has slower recovery than Jerasure

Sage Weil <sage@xxxxxxxxxxxx> · Mon, 11 Sep 2017 20:49:36 +0000 (UTC)

On Mon, 11 Sep 2017, Oleg Kolosov wrote:
> Hi,
> I'm working on an implementation of a new erasure code. During
> development, I began testing the performance of my code and the Ceph LRC
> plugin against jerasure.
> When setting up a cluster with failure domain being only leaf cells
> (osd) I saw the expected behaviour - recovery of LRC is faster than
> jerasure.
> Next I divided my cluster into 5 hosts, each containing 8 osds, and
> placed each local group under a different host (using ruleset-steps).
> Jerasure still had osd as failure domain. At this point I noticed that
> LRC has slower recovery. I've tried a configuration of k=4, l=3 and
> total of 9 shards.
> 
> I don't think it's collisions in crush, since it shouldn't have such a
> heavy effect.
> My suspicion is some sort of a throttle. I've tried to set the following:
>         osd recovery max active = 500
>         osd recovery op priority = 32
> 
> But I didn't see any significant improvement.
> 
> Can you think of a reason LRC would recover slower than jerasure when
> it's constrained to domains as I've described?

I don't have a specific answer for you, but one thing to keep in mind 
is that even with LRC, recovery is driven by the primary: it will 
reconstruct the shard(s) on the primary (reading over the network) and 
then push them out to the OSDs that need them.  That means that the
benefit of localizing recovery to a local group in LRC is only realized
if the primary happens to be part of that group.

Perhaps that helps explain the behavior you're seeing?
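
To make the point above concrete, here is a rough toy model (not Ceph
code, and not a description of the actual recovery path). It assumes a
hypothetical layout in which every shard of a local group, plus the
replacement OSD, sits on a single host, and it only counts shard-sized
transfers that cross a host boundary when one shard is lost. The names
cross_host_transfers, group_host and helper_shards are made up for
illustration.

```python
def cross_host_transfers(primary_host, group_host, helper_shards):
    """Count shard-sized transfers that cross hosts to rebuild one lost shard.

    Toy assumptions (not taken from Ceph): all helper shards of the local
    group and the replacement OSD live on group_host, and the primary does
    the reconstruction itself.
    """
    if primary_host == group_host:
        # Primary is inside the group's host: reads and the push stay local.
        return 0
    reads = helper_shards  # primary pulls each helper shard over the network
    push = 1               # primary pushes the rebuilt shard back to the group
    return reads + push


hosts = ["host0", "host1", "host2"]
group_host = "host1"   # hypothetical host holding the damaged local group
helper_shards = 2      # hypothetical: group of 3 shards, one of them lost

costs = {h: cross_host_transfers(h, group_host, helper_shards) for h in hosts}
print(costs)
```

Under these assumptions only a primary on host1 keeps the recovery
traffic local; a primary on any other host pays for every helper read
plus the push across hosts.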

sage
