Re: Migration of git-scm.com to a static web site: ready for review/testing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Todd,

On Fri, 17 Nov 2023, Todd Zullinger wrote:

> Johannes Schindelin wrote:
> > At this point, the patches are fairly robust and I am mainly hoping for
> > help with verifying that the static site works as intended, that existing
> > links will continue to work with the new site (essentially, find obscure
> > references to the existing website, then insert `git.github.io/` in the
> > URL and verify that it works as intended).
> >
> > To that end, I deployed this branch to GitHub Pages so that anyone
> > interested (hopefully many!) can have a look at
> > https://git.github.io/git-scm.com/ and compare to the existing
> > https://git-scm.com/.
>
> This is nice.  Thanks to all for working on it!

😊

> For checking links, a tool like linkcheker[1] is very handy.
> This is run against the local docs in the Fedora package
> builds to catch broken links.

Hmm, `linkchecker` is really slow for me, even locally.

> I ran it against the test site and it turned up _a lot_ of
> broken links.  [...]
>
>   URL        `ch00/ch10-git-internals'
>   Name       `Git Internals'
>   Parent URL https://git.github.io/git-scm.com/book/tr/v2/Ek-b%C3%B6l%C3%BCm-C:-Git-Commands-Plumbing-Commands/, line 106, col 1318
>   Real URL   https://git.github.io/git-scm.com/book/tr/v2/Ek-b%C3%B6l%C3%BCm-C:-Git-Commands-Plumbing-Commands/ch00/ch10-git-internals
>   Check time 3.303 seconds
>   Size       1KB
>   Result     Error: 404 Not Found

Good catch. I totally forgot to take care of the cross-references!

This is now fixed, as of
https://github.com/dscho/git-scm.com/commit/e599a57b2fadf8cb01e57af23fcb929b32e94bcb

I kicked off the GitHub workflow to re-generate the books, and the updated
GitHub Pages look fine (see e.g. the parent URL mentioned above and follow
the "Pull Request Refs" link).

> Running it against a local directory of the content would be
> a lot faster, if that's an option.  It's also worth bumping
> the default number of threads from 10 to increase the speed
> a bit.
>
> [1] https://linkchecker.github.io/linkchecker/

Unfortunately it is actually quite slow.

Granted, the added cross-references now increase the number of hyperlinks
to check, but after I let the program run for a bit over an hour to look
at https://git-scm.com/ (for comparison), it is now running on the local
build (i.e. the `public/` folder generated by Hugo, not even an HTTP
server) for over 45 minutes and still not done:

-- snip --
[...]
10 threads active, 112977 links queued, 206443 links in 100001 URLs checked, runtime 48 minutes, 46 seconds
10 threads active, 113455 links queued, 206689 links in 100001 URLs checked, runtime 48 minutes, 52 seconds
10 threads active, 113829 links queued, 206874 links in 100001 URLs checked, runtime 48 minutes, 57 seconds
10 threads active, 114230 links queued, 207136 links in 100001 URLs checked, runtime 49 minutes, 3 seconds
10 threads active, 114731 links queued, 207498 links in 100001 URLs checked, runtime 49 minutes, 9 seconds
-- snap --

Maybe something is going utterly wrong because the number of links seems
to be dramatically larger than what the https://git-scm.com/ reported;
Maybe linkchecker broke out of the `public/` directory and now indexes my
entire harddrive ;-)

Ciao,
Johannes

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux