On Fri, Aug 24, 2018 at 5:37 PM Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Thu, Aug 23, 2018 at 10:36 PM Ben Peart <peartben@xxxxxxxxx> wrote:
> > > Nice to see this done without a new index extension that records
> > > offsets, so that we can load existing index files in parallel.
> > >
> >
> > Yes, I prefer this simpler model as well. I wasn't sure it would
> > produce a significant improvement given the primary thread still has to
> > run through the variable length cache entries but was pleasantly surprised.
>
> Out of curiosity, how much time saving could we gain by recording
> offsets as an extension (I assume we need, like 4 offsets if the
> system has 4 cores)? Much much more than this simpler model (which may
> justify the complexity) or just "meh" compared to this?

To answer my own question, I ran a patched git that precalculates the
individual thread parameters, removed the scheduler code, and hard-coded
those parameters (I ran just 4 threads, one per core). I got 0m2.949s
(webkit.git, 275k files, 100 read-cache runs). Compared to 0m4.996s from
Ben's patch (same test settings, of course), I think it's definitely
worth adding some extra complexity.
--
Duy
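
For anyone curious what "recording offsets as an extension" buys in code
terms, here is a minimal, self-contained sketch. It is not the actual git
implementation: the record format, the fixed two-records-per-slice split,
and all names are made up for illustration. The point is only that, when
records are variable length, a reader normally has to walk them one by one,
whereas per-thread start offsets recorded at write time let each thread
jump straight to its slice and parse in parallel.

	/*
	 * Toy model: a buffer of length-prefixed, variable-length records.
	 * A sequential reader must walk records one by one to find the Nth
	 * one; with per-slice start offsets recorded up front, each thread
	 * can jump straight to its slice.
	 */
	#include <pthread.h>
	#include <stdio.h>

	#define NR_THREADS 4

	static const char buf[] =
		"\x05" "hello"
		"\x03" "foo"
		"\x06" "barbaz"
		"\x04" "quux"
		"\x01" "a"
		"\x02" "bb"
		"\x03" "ccc"
		"\x04" "dddd";

	struct slice {
		size_t start;   /* byte offset of this thread's first record */
		int nr;         /* number of records to parse */
		size_t payload; /* out: payload bytes seen ("parsed") */
	};

	static void *parse_slice(void *arg)
	{
		struct slice *s = arg;
		size_t off = s->start;
		int i;

		for (i = 0; i < s->nr; i++) {
			unsigned char len = buf[off]; /* 1-byte length prefix */
			s->payload += len;
			off += 1 + len; /* skip prefix plus payload */
		}
		return NULL;
	}

	int main(void)
	{
		/*
		 * These offsets play the role of the hypothetical extension:
		 * computed when the buffer was written, so no thread has to
		 * scan from the beginning to find its starting record.
		 */
		struct slice slices[NR_THREADS] = {
			{ 0, 2, 0 }, { 10, 2, 0 }, { 22, 2, 0 }, { 27, 2, 0 },
		};
		pthread_t th[NR_THREADS];
		int i;

		for (i = 0; i < NR_THREADS; i++)
			pthread_create(&th[i], NULL, parse_slice, &slices[i]);
		for (i = 0; i < NR_THREADS; i++)
			pthread_join(th[i], NULL);

		for (i = 0; i < NR_THREADS; i++)
			printf("slice %d: %zu payload bytes\n", i, slices[i].payload);
		return 0;
	}

Build with "cc -pthread" and it prints the payload byte count each slice
parsed. In a real index extension the offsets would be computed and stored
when the index is written, which is where the extra complexity (and, per
the numbers above, the extra speed) comes from.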