Hi Todd,

On Fri, 17 Nov 2023, Todd Zullinger wrote:

> Johannes Schindelin wrote:
> >> For checking links, a tool like linkcheker[1] is very handy.
> >> This is run against the local docs in the Fedora package
> >> builds to catch broken links.
> >
> > Hmm, `linkchecker` is really slow for me, even locally.
>
> Yeah, it took an hour and a half to run for me, both on an
> old laptop and a fast server with plenty of threads,
> bandwidth, and memory.
>
> Checking the git HTML documentation takes under 30 seconds,
> which is largely the only place I've used it. It has been
> very helpful in catching broken links in the docs during the
> build and the time is short enough that I never minded.

I found https://lychee.cli.rs/#/ in the meantime and figured out how to use
it in a local setup:

First, I run:

    HUGO_TIMEOUT=777 HUGO_BASEURL= HUGO_UGLYURLS=false time hugo

The first `HUGO_*` setting makes sure that the build does not fail even
when I happen to be using all of the cores of my laptop's CPU. The other
two override settings from `hugo.yml` so that `lychee` can handle the
output (unlike GitHub Pages, `lychee` will not auto-append `.html`, and
without `HUGO_UGLYURLS=false` it would therefore mis-detect tons of broken
links).

In my setup, this command typically runs for something like half a minute,
but sometimes takes as long as a minute. (I noticed that it is much slower
when I open the directory in VS Code because I'm running this in WSL and
the filesystem watcher kind of eats all resources.)

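To illustrate why `HUGO_UGLYURLS=false` matters for the offline check, here
is a minimal sketch. It assumes a checker that only tests whether the
linked path exists on disk and never appends `.html`; the helper name
`link_resolves` and the `docs/git` page are made up for the illustration,
and this is not meant to describe how `lychee` is implemented.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a site-relative link exists below the
# given output directory, without trying to append ".html".
link_resolves() {
    root=$1
    link=$2
    [ -e "$root/${link#/}" ]
}

demo=$(mktemp -d)

# Layout with ugly URLs enabled: one .html file per page.
mkdir -p "$demo/ugly/docs"
: > "$demo/ugly/docs/git.html"

# Layout with ugly URLs disabled: one directory per page.
mkdir -p "$demo/pretty/docs/git"
: > "$demo/pretty/docs/git/index.html"

for layout in ugly pretty
do
    if link_resolves "$demo/$layout" /docs/git
    then
        echo "$layout: /docs/git resolves"
    else
        echo "$layout: /docs/git looks broken"
    fi
done

rm -rf "$demo"
```

Under that assumption, the same `/docs/git` link looks broken in the first
layout and resolves in the second, even though a web server that appends
`.html` would serve both.
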
After that, I run:

    time lychee --offline --exclude-mail \
        --exclude file:///path/to/repo.git/ \
        --exclude file:///caminho/para/o/reposit%C3%B3rio.git/ \
        --exclude file:///ruta/a/repositorio.git/ \
        --exclude file:///sl%C3%B3%C3%B0/til/hirsla.git/ \
        --exclude file:///Pfad/zum/Repo.git/ \
        --exclude file:///chemin/du/d%C3%A9p%C3%B4t.git/ \
        --exclude file:///srv/git/project.git \
        --exclude "file://$PWD/public/pagefind/pagefind-ui.css" \
        --format markdown -o lychee-local.md public/

Without `--offline`, there would be a couple of broken links (the
http://git.or.cz/gitwiki/InterfacesFrontendsAndTools link leads to
"Forbidden", it needs to be changed to https://). The `file:///` URLs are
all examples that are not expected to be valid. And we do not want to check
the emails (tons of `xyz@xxxxxxxxxxx` would be "broken").

This command typically takes another half minute, sometimes a bit longer.

Given those times and the configurability (and the lure of a GitHub Action
that could be easily integrated into a GitHub workflow:
https://github.com/marketplace/actions/lychee-broken-link-checker), I gave
up on linkchecker and focused exclusively on lychee.

Now, when I started working on this on Friday, lychee reported about 12,000
broken links. There were a couple of legitimate mistakes I made (when
feeding paths to Hugo's `relURL` function, the path must not have a leading
slash or it will remain unchanged, for example). These are fixed.

But there were also many other issues, such as some manual page
translations being incomplete yet linking to not-yet-existing pages. In
those cases, I changed the code to generate redirects to the English
version. For example, https://git.github.io/git-scm.com/docs/git-clone/fr#_git
has a link to `git[1]` that _should_ lead to the French version of the
`git` manual page. However, that does not exist. So both the Rails App and
the static website redirect to the English variant of that page.

My most recent lychee run results in 0 broken links.

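The fallback for missing translations described above boils down to a
simple decision, roughly sketched below. The helper name `manpage_target`
and the assumed on-disk layout (`docs/<page>/<lang>` for a translation,
`docs/<page>` for the English page) are made up for this illustration; the
sketch only picks the target and does not generate the actual redirects.

```shell
#!/bin/sh
# Hypothetical helper: print the path a manual page link should lead to.
# Assumes translated pages live at docs/<page>/<lang> below the output
# directory and the English page at docs/<page>; that layout is an
# assumption made for this sketch.
manpage_target() {
    root=$1
    page=$2
    lang=$3
    if [ -e "$root/docs/$page/$lang" ]
    then
        echo "docs/$page/$lang"
    else
        # Translation missing: fall back to the English page.
        echo "docs/$page"
    fi
}

demo=$(mktemp -d)
# git-clone has a French translation in this demo, git does not.
mkdir -p "$demo/docs/git-clone/fr" "$demo/docs/git"

manpage_target "$demo" git-clone fr
manpage_target "$demo" git fr

rm -rf "$demo"
```

In this demo, the link to `git-clone` stays on the French page, while the
link to `git` falls back to the English one because no French `git` page
exists.
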
As a bonus, some of the links that are currently broken on
https://git-scm.com/ are fixed in https://git.github.io/git-scm.com/. For
example, following the `Pull Request Referləri` link at the top of
https://git-scm.com/book/az/v2/Appendix-C:-Git-%C6%8Fmrl%C9%99ri-Plumbing-%C6%8Fmrl%C9%99ri/
leads to a 404. But following it in
https://git.github.io/git-scm.com/book/az/v2/Appendix-C:-Git-%C6%8Fmrl%C9%99ri-Plumbing-%C6%8Fmrl%C9%99ri/
directs the browser to the correct URL:
https://git.github.io/git-scm.com/book/az/v2/GitHub-Bir-Layih%C9%99nin-Saxlan%C4%B1lmas%C4%B1/#_pr_refs

Also broken on https://git-scm.com/ are the footnotes in the Czech
translation of the ProGit book. These were broken in the Hugo version, too,
but now they are fixed. See e.g.
https://dscho.github.io/git-scm.com/book/cs/v2/Z%C3%A1klady-pr%C3%A1ce-se-syst%C3%A9mem-Git-Zobrazen%C3%AD-historie-reviz%C3%AD/#_footnotedef_7
and note that the Rails App redirects to
https://git-scm.com/book/cs/v2/Z%C3%A1klady-pr%C3%A1ce-se-syst%C3%A9mem-Git-Zobrazen%C3%AD-historie-reviz%C3%AD/ch00/_footnotedef_7
when clicking on the `[7]`, which 404s.

Could you double-check the links in the current version?

Thank you,
Johannes