On Tue, 19 Mar 2024 at 17:59, Donald Hunter <donald.hunter@xxxxxxxxx> wrote:
>
> > > > I have an experimental fix that uses a dict for lookups. With the fix, I
> > > > consistently get times in the sub 5 minute range:
> > >
> > > Fantastic!
>
> I pushed my performance changes to GitHub if you want to try them out:
>
> https://github.com/donaldh/sphinx/tree/c-domain-speedup

Now a PR: https://github.com/sphinx-doc/sphinx/pull/12162

Following up on the incremental build performance, I have been
experimenting with different batch sizes when building the Linux kernel
docs. My results suggest that the best performance comes from a minimum
batch size of 200 docs per read batch; smaller batches have too much
overhead when merging their results back into the main process. I also
experimented with a minimum threshold of 500 before splitting into
batches at all, i.e. if there are fewer than 500 changed docs then just
process them serially.

With the existing make_chunks behaviour, a small number of changed docs
gives worst-case behaviour of one doc per chunk. Merging single docs back
into the main process destroys any benefit from the parallel processing.

E.g. running make htmldocs SPHINXOPTS=-j12:

Running Sphinx v7.2.6
[...]
building [html]: targets for 3445 source files that are out of date
updating environment: [new config] 3445 added, 0 changed, 0 removed
[...]

real    7m46.198s
user    14m18.597s
sys     0m54.925s

for a full build of 3445 files, vs an incremental build of just 114 files:

Running Sphinx v7.2.6
[...]
building [html]: targets for 114 source files that are out of date
updating environment: 0 added, 114 changed, 0 removed

real    5m50.746s
user    6m33.199s
sys     0m13.034s

When I run the incremental build serially with make htmldocs SPHINXOPTS=-j1,
it is much faster:

building [html]: targets for 114 source files that are out of date
updating environment: 0 added, 114 changed, 0 removed

real    1m5.034s
user    1m3.183s
sys     0m1.616s
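
For reference, this is roughly the batching heuristic I have been
experimenting with. It is only an illustrative sketch, not the actual
patch; the function name and structure here are made up, and only the
200 / 500 thresholds are the values discussed above:

MIN_CHUNK_SIZE = 200     # minimum docs per read batch
SERIAL_THRESHOLD = 500   # below this, don't parallelise reads at all

def make_read_chunks(docnames, nproc):
    """Split docnames into batches for parallel reading.

    Returns a single batch when parallel reading is not worthwhile,
    so the caller can fall back to a plain serial read.
    """
    docnames = list(docnames)
    ndocs = len(docnames)
    if ndocs < SERIAL_THRESHOLD:
        # Too few changed docs: merging tiny batches back into the
        # main process costs more than the parallel read saves.
        return [docnames]
    # Spread the docs over at most nproc chunks, but never let a
    # chunk drop below MIN_CHUNK_SIZE docs.
    chunksize = max(MIN_CHUNK_SIZE, -(-ndocs // nproc))
    return [docnames[i:i + chunksize] for i in range(0, ndocs, chunksize)]

With 114 changed docs this returns a single batch, so a -j12 run would
behave like the fast -j1 run above instead of spawning 114 one-doc
chunks; with the full 3445 docs and -j12 it gives 12 chunks of roughly
288 docs each.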