On Thursday, June 2, 2022 2:50:55 PM EDT Junio C Hamano wrote:
> Jason Yundt <jason@jasonyundt.email> writes:
> > Subject: Re: [PATCH v2] gitweb: switch to an XHTML5 DOCTYPE
> >
> > According to the HTML Standard FAQ:
> >   “What is the DOCTYPE for modern HTML documents?
> >
> > ...
> > Compared to the first version of this patch, this version:
> > 1. makes it clear that XML parsers may use the linked DTD like brian
> >    mentioned.
> > 2. mentions HTML5 like Bagas suggested.
>
> So, is it XHTML5, or HTML5, we want to see on the title?

I chose XHTML5 since I didn’t think that it was accurate to say “HTML5
DOCTYPE”. The DOCTYPE that this patch uses is valid in the XML syntax, but not
the HTML syntax.

> > +proper_doctype() {
> > +	gitweb_run "$@" &&
> > +	grep -F "<!DOCTYPE html [" gitweb.body &&
> > +	grep "<!ENTITY nbsp" gitweb.body &&
> > +	grep "<!ENTITY sdot" gitweb.body
> > +}
>
> Hmph, this test does not care what other cruft appears in the file,
> does not care in what order the three lines that match the patterns
> appear, and the second and third patterns are even allowed to match
> the same line.  I think that is OK (we do not even mind if the two
> ENTITY definitions get squashed on the same line).

While I was writing this patch, I was thinking something similar. Grep is not a
good tool for validating (X)HTML. I thought about creating a test that uses the
Nu Html Checker [1] to validate pages that Gitweb generates, but I decided that
that should be the topic of a separate patch.
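
To illustrate the ordering point above, here is a minimal sketch of a stricter check. It is not part of this patch: doctype_in_order is a hypothetical helper name, and it assumes the response body has already been saved to a file such as gitweb.body. It only requires that the three strings appear in document order (they may still share a line), and it does not validate the (X)HTML in any real sense.

```shell
# Hypothetical helper, not part of the patch: succeed only if the DOCTYPE
# opener and the two entity declarations occur in this order in file $1.
doctype_in_order () {
	awk '
		BEGIN {
			want[1] = "<!DOCTYPE html ["
			want[2] = "<!ENTITY nbsp"
			want[3] = "<!ENTITY sdot"
			n = 1
		}
		{
			# Consume as many of the remaining strings as this line
			# contains, always searching past the previous match.
			line = $0
			while (n <= 3 && (pos = index(line, want[n])) > 0) {
				line = substr(line, pos + length(want[n]))
				n++
			}
		}
		END { exit (n > 3 ? 0 : 1) }
	' "$1"
}
```

Under those assumptions it could be called as `doctype_in_order gitweb.body` after gitweb_run, in place of the three grep calls.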

> > +test_expect_success 'Proper DOCTYPE with entity declarations' '
> > +	proper_doctype &&
> > +	proper_doctype "p=.git" &&
> > +	proper_doctype "p=.git;a=log" &&
> > +	proper_doctype "p=.git;a=tree"
> > +'
>
> As far as I can tell, git_header_html() is the only helper that
> deals with DOCTYPE, and responses to any request must call
> git_header_html() to produce the header (or the handler for a
> particular request type is buggy), but I do not think it is part of
> this topic's job to ensure that all request handlers call the
> git_header_html().  So we _could_ do with just a single test without
> trying different request types if we wanted to, as long as there are
> existing tests that make sure everybody uses git_header_html().
>
> Was there a particular reason why these four requests were chosen?
> Do they have different entry points and show the doctype from
> different codepath?

Not really. When I created a262585d81 (gitweb: remove invalid
http-equiv="content-type", 2022-03-08), I chose those requests by running git
instaweb and then clicking on the first four links I saw. For this patch, I
just copied what I had done previously. I don’t know if they use different
codepaths (I don’t understand Perl very well).

> Thanks.

[1]: <https://validator.w3.org/nu/>