On Fri, Aug 24, 2018 at 5:37 PM Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Thu, Aug 23, 2018 at 10:36 PM Ben Peart <peartben@xxxxxxxxx> wrote:
> > > Nice to see this done without a new index extension that records
> > > offsets, so that we can load existing index files in parallel.
> > >
> >
> > Yes, I prefer this simpler model as well. I wasn't sure it would
> > produce a significant improvement given the primary thread still has to
> > run through the variable length cache entries but was pleasantly surprised.
>
> Out of curiosity, how much time saving could we gain by recording
> offsets as an extension (I assume we need, like 4 offsets if the
> system has 4 cores)? Much much more than this simpler model (which may
> justify the complexity) or just "meh" compared to this?

To answer my own question, I ran a patched git that precalculates the
individual thread parameters, removed the scheduler code, and hard-coded
those parameters (I ran just 4 threads, one per core). I got 0m2.949s
(webkit.git, 275k files, 100 read-cache runs). Compared to 0m4.996s from
Ben's patch (same test settings, of course), I think it's definitely
worth adding some extra complexity.
--
Duy
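
For anyone curious what "recording offsets as an extension" buys in code
terms, here is a minimal, self-contained sketch. It is not the actual git
implementation: the record format, the fixed two-records-per-slice split,
and all names are made up for illustration. The point is only that, when
records are variable length, a reader normally has to walk them one by one,
whereas per-thread start offsets recorded at write time let each thread
jump straight to its slice and parse in parallel.

	/*
	 * Toy model: a buffer of length-prefixed, variable-length records.
	 * A sequential reader must walk records one by one to find the Nth
	 * one; with per-slice start offsets recorded up front, each thread
	 * can jump straight to its slice.
	 */
	#include <pthread.h>
	#include <stdio.h>

	#define NR_THREADS 4

	static const char buf[] =
		"\x05" "hello"
		"\x03" "foo"
		"\x06" "barbaz"
		"\x04" "quux"
		"\x01" "a"
		"\x02" "bb"
		"\x03" "ccc"
		"\x04" "dddd";

	struct slice {
		size_t start;   /* byte offset of this thread's first record */
		int nr;         /* number of records to parse */
		size_t payload; /* out: payload bytes seen ("parsed") */
	};

	static void *parse_slice(void *arg)
	{
		struct slice *s = arg;
		size_t off = s->start;
		int i;

		for (i = 0; i < s->nr; i++) {
			unsigned char len = buf[off]; /* 1-byte length prefix */
			s->payload += len;
			off += 1 + len; /* skip prefix plus payload */
		}
		return NULL;
	}

	int main(void)
	{
		/*
		 * These offsets play the role of the hypothetical extension:
		 * computed when the buffer was written, so no thread has to
		 * scan from the beginning to find its starting record.
		 */
		struct slice slices[NR_THREADS] = {
			{ 0, 2, 0 }, { 10, 2, 0 }, { 22, 2, 0 }, { 27, 2, 0 },
		};
		pthread_t th[NR_THREADS];
		int i;

		for (i = 0; i < NR_THREADS; i++)
			pthread_create(&th[i], NULL, parse_slice, &slices[i]);
		for (i = 0; i < NR_THREADS; i++)
			pthread_join(th[i], NULL);

		for (i = 0; i < NR_THREADS; i++)
			printf("slice %d: %zu payload bytes\n", i, slices[i].payload);
		return 0;
	}

Build with "cc -pthread" and it prints the payload byte count each slice
parsed. In a real index extension the offsets would be computed and stored
when the index is written, which is where the extra complexity (and, per
the numbers above, the extra speed) comes from.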