Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx> writes: > As discussed (see Links), there is some inertia to move to the recent > Sphinx versions for the doc build environment. > > [...] > > Link: https://lore.kernel.org/linux-doc/874jf4m384.fsf@xxxxxxxxxxxx/ > Link: https://lore.kernel.org/linux-doc/20240226093854.47830-1-lukas.bulwahn@xxxxxxxxx/ > Reviewed-by: Akira Yokosawa <akiyks@xxxxxxxxx> > Tested-by: Vegard Nossum <vegard.nossum@xxxxxxxxxx> > Signed-off-by: Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx> > --- > v1 -> v2: > drop jinja2 as suggested by Vegard. > add tags from v1 review > > Documentation/doc-guide/sphinx.rst | 11 ++++++----- > Documentation/sphinx/requirements.txt | 7 ++----- > scripts/sphinx-pre-install | 19 +++---------------- > 3 files changed, 11 insertions(+), 26 deletions(-) Apologies if I am a little late to the party here - I am just catching up with the changes on docs-next. I went to install Sphinx 2.4.4 using requirements.txt for some doc work and hit the upstream Sphinx dependency breakage. So I pulled docs-next with the intention of sending a patch to requirements.txt with pinned dependences. When I noticed that things have already moved on in docs-next, I decided to spend some time investigating the performance regression that has been present in Sphinx from 3.0.0 until now. With Sphinx 2.4.4 I always get timings in this ballpark: % time make htmldocs ... real 4m5.417s user 17m0.379s sys 1m11.889s With Sphinx 7.2.6 it's typically over 9 minutes: % time make htmldocs ... real 9m0.533s user 15m38.397s sys 1m0.907s I collected profiling data using cProfile: export srctree=`pwd` export BUILDDIR=`pwd`/Documentation/output python3 -m cProfile -o profile.dat ./sphinx_latest/bin/sphinx-build \ -b html \ -c ./Documentation \ -d ./Documentation/output/.doctrees \ -D version=6.8.0 -D release= \ -D kerneldoc_srctree=. -D kerneldoc_bin=./scripts/kernel-doc \ ./Documentation \ ./Documentation/output Here's some of the profiling output: $ python3 -m pstats profile.dat Welcome to the profile statistics browser. profile.dat% sort tottime profile.dat% stats 10 Fri Mar 15 17:09:39 2024 profile.dat 3960680702 function calls (3696376639 primitive calls) in 1394.384 seconds Ordered by: internal time List reduced from 6733 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 770364892 165.102 0.000 165.102 0.000 sphinx/domains/c.py:153(__eq__) 104124 163.968 0.002 544.788 0.005 sphinx/domains/c.py:1731(_find_named_symbols) 543888397 123.767 0.000 176.685 0.000 sphinx/domains/c.py:1679(children_recurse_anon) 4292 74.081 0.017 74.081 0.017 {method 'poll' of 'select.poll' objects} 631233096 69.389 0.000 246.017 0.000 sphinx/domains/c.py:1746(candidates) 121406721/3359598 65.689 0.000 76.762 0.000 docutils/nodes.py:202(_fast_findall) 3477076 64.387 0.000 65.758 0.000 sphinx/util/nodes.py:633(_copy_except__document) 544032973 52.950 0.000 52.950 0.000 sphinx/domains/c.py:156(is_anon) 79012597/3430 36.395 0.000 36.395 0.011 sphinx/domains/c.py:1656(clear_doc) 286882978 31.271 0.000 31.279 0.000 {built-in method builtins.isinstance} profile.dat% callers c.py:153 Ordered by: internal time List reduced from 6733 to 4 due to restriction <'c.py:153'> Function was called by... ncalls tottime cumtime sphinx/domains/c.py:153(__eq__) <- 631153346 134.803 134.803 sphinx/domains/c.py:1731(_find_named_symbols) 154878 0.041 0.041 sphinx/domains/c.py:2085(find_identifier) 139056533 30.259 30.259 sphinx/domains/c.py:2116(direct_lookup) 135 0.000 0.000 sphinx/util/cfamily.py:89(__eq__) >From that you can see there is a significant call amplification from _find_named_symbols (100k calls) to __eq__ (630 million calls), plus several other expensive functions. Looking at the code [1], you can see why. It's doing a list walk to find matching symbols. When adding new symbols it does an exhaustive walk to check for duplicates, so you get worst-case performance, with ~13k symbols in a list during the doc build. I have an experimental fix that uses a dict for lookups. With the fix, I consistently get times in the sub 5 minute range: % time make htmldocs ... real 4m27.085s user 10m56.985s sys 0m56.385s I expect there are other speedups to be found. I will clean up my Sphinx changes and share them on a GitHub branch (as well as push them upstream) so that others can try them out. For some reason, if I run sphinx-build manually with -j 12 (I have a 12 core machine) I get better performance than make htmldocs: % sphinx-build -j 12 ... ... real 3m56.074s user 9m52.775s sys 0m52.905s I haven't had a chance to look at what makes the difference here, but will investigate when I have time. Cheers, Donald. [1] https://github.com/sphinx-doc/sphinx/blob/ff252861a7b295e8dd8085ea9f6ed85e085273fc/sphinx/domains/c/_symbol.py#L235-L283 > diff --git a/Documentation/sphinx/requirements.txt b/Documentation/sphinx/requirements.txt > index 5d47ed443949..5017f307c8a4 100644 > --- a/Documentation/sphinx/requirements.txt > +++ b/Documentation/sphinx/requirements.txt > @@ -1,6 +1,3 @@ > -# jinja2>=3.1 is not compatible with Sphinx<4.0 > -jinja2<3.1 > -# alabaster>=0.7.14 is not compatible with Sphinx<=3.3 > -alabaster<0.7.14 > -Sphinx==2.4.4 > +alabaster > +Sphinx > pyyaml