On Fri, Apr 01, 2011 at 04:06:50PM -0600, Eric Blake wrote:
> I'm still stumped by xsltproc complaining about &nbsp; not being a
> valid XML entity, hence the (hackish) exemption in docs/Makefile.am
> that adds --html for a couple of .html.in files.  But for the
> remaining files, this does make input validation stricter, and caught
> several bugs.

  The only solution would be to add a DTD to the .html.in files which
use any entity beyond the 5 hardcoded in the parser (lt, gt, amp, quot
and apos).

> Hence, this is an RFC (either we live with my hack that caught
> all the issues in the prior patch, or someone with more xsltproc
> knowledge than me will step in and teach it how to resolve html
> entities while processing the documents as xml instead of html).

  We would need to add

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

to all the .html.in files for consistency, since we now expect them to
be well-formed XML, possibly using HTML entities.

> * docs/Makefile.am (maintainer-clean-local): Remove generated docs
> in VPATH build.
> (%.html.tmp): Don't use looser --html; our input should be strict
> xhtml.  HACK - use --html when entities like &nbsp; are involved.
> (html/index.html): Exit on formatting problems.
> (rebuild): Run full doc build on request.
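
  For reference, the HACK entry above comes down to a conditional flag:
grep the input for a few known HTML entities and pass --html only on a
match. A minimal standalone sketch of that test follows. The html_flag
helper and the two sample files are made up for illustration and are not
part of the patch; only the grep expression is taken from it.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): print "--html" when the file
# uses one of the entities the patch greps for, print nothing otherwise.
html_flag() {
    grep -qE '&(nbsp|uuml|mdash);' "$1" && printf %s --html
    return 0   # "no match" is not an error for the caller
}

# Two throwaway sample inputs, assumed here purely for demonstration.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf '<p>a&nbsp;b</p>\n' > "$tmpdir/with_entity.html.in"
printf '<p>a &amp; b</p>\n' > "$tmpdir/plain.html.in"

# The first file needs the looser HTML parser, the second does not.
echo "with_entity: [$(html_flag "$tmpdir/with_entity.html.in")]"
echo "plain:       [$(html_flag "$tmpdir/plain.html.in")]"
```

In the Makefile the same test is inlined as $$(...), so that make hands a
literal $(...) command substitution to the shell.
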
> ---
>  docs/Makefile.am |   24 ++++++++++++++----------
>  1 files changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/docs/Makefile.am b/docs/Makefile.am
> index db4bc59..2d1afe4 100644
> --- a/docs/Makefile.am
> +++ b/docs/Makefile.am
> @@ -123,7 +123,7 @@ internals/%.html.tmp: internals/%.html.in subsite.xsl page.xsl sitemap.html.in
>  	  echo "Generating $@"; \
>  	  $(MKDIR_P) "$(builddir)/internals"; \
>  	  name=`echo $@ | sed -e 's/.tmp//'`; \
> -	  $(XSLTPROC) --stringparam pagename $$name --nonet --html \
> +	  $(XSLTPROC) --stringparam pagename $$name --nonet \
>  	    $(top_srcdir)/docs/subsite.xsl $< > $@ \
>  	    || { rm $@ && exit 1; }; fi
>
> @@ -131,7 +131,8 @@ internals/%.html.tmp: internals/%.html.in subsite.xsl page.xsl sitemap.html.in
>  	@if [ -x $(XSLTPROC) ] ; then \
>  	  echo "Generating $@"; \
>  	  name=`echo $@ | sed -e 's/.tmp//'`; \
> -	  $(XSLTPROC) --stringparam pagename $$name --nonet --html \
> +	  $(XSLTPROC) --stringparam pagename $$name --nonet \
> +	    $$(grep -qE '&(nbsp|uuml|mdash);' $< && printf %s --html) \
>  	    $(top_srcdir)/docs/site.xsl $< > $@ \
>  	    || { rm $@ && exit 1; }; fi
>
> @@ -147,21 +148,22 @@ internals/%.html.tmp: internals/%.html.in subsite.xsl page.xsl sitemap.html.in
>
>
>  html/index.html: libvirt-api.xml newapi.xsl page.xsl sitemap.html.in
> -	-@if [ -x $(XSLTPROC) ] ; then \
> +	@if [ -x $(XSLTPROC) ] ; then \
>  	  echo "Rebuilding the HTML pages from the XML API" ; \
>  	  $(XSLTPROC) --nonet -o $(srcdir)/ \
>  	  $(srcdir)/newapi.xsl $(srcdir)/libvirt-api.xml ; fi
> -	-@if test -x $(XMLLINT) && test -x $(XMLCATALOG) ; then \
> -	  if $(XMLCATALOG) '$(XML_CATALOG_FILE)' "-//W3C//DTD XHTML 1.0 Strict//EN" \
> -	    > /dev/null ; then \
> +	@if test -x $(XMLLINT) && test -x $(XMLCATALOG) ; then \
> +	  if $(XMLCATALOG) '$(XML_CATALOG_FILE)' \
> +	    "-//W3C//DTD XHTML 1.0 Strict//EN" > /dev/null ; then \
>  	  echo "Validating the resulting XHTML pages" ; \
>  	  SGML_CATALOG_FILES='$(XML_CATALOG_FILE)' \
> -	  $(XMLLINT) --catalogs --nonet --valid --noout $(srcdir)/html/*.html ; \
> +	  $(XMLLINT) --catalogs --nonet --valid --noout $(srcdir)/html/*.html \
> +	    || { rm $(srcdir)/$@ && exit 1; }; \
>  	  else echo "missing XHTML1 DTD" ; fi ; fi
>
>  $(addprefix $(srcdir)/,$(devhelphtml)): $(srcdir)/libvirt-api.xml $(devhelpxsl)
>  	-@echo Rebuilding devhelp files
> -	-@if [ -x $(XSLTPROC) ] ; then \
> +	@if [ -x $(XSLTPROC) ] ; then \
>  	  $(XSLTPROC) --nonet -o $(srcdir)/devhelp/ \
>  	  $(top_srcdir)/docs/devhelp/devhelp.xsl $(srcdir)/libvirt-api.xml ; fi
>
> @@ -183,9 +185,11 @@ clean-local:
>  	rm -f *~ *.bak *.hierarchy *.signals *-unused.txt *.html
>
>  maintainer-clean-local: clean-local
> -	rm -rf $(srcdir)/libvirt-api.xml $(srcdir)/libvirt-refs.xml todo.html.in
> +	rm -rf $(srcdir)/libvirt-api.xml $(srcdir)/libvirt-refs.xml \
> +	  todo.html.in $(srcdir)/*.html $(srcdir)/devhelp/*.html \
> +	  $(srcdir)/html/*.html $(srcdir)/internals/*.html
>
> -rebuild: api all
> +rebuild: maintainer-clean-local api all
>
>  install-data-local:
>  	$(mkinstalldirs) $(DESTDIR)$(HTML_DIR)
> --
> 1.7.4

  I think I will check this in a few hours while preparing the release.
Depending on the size of the resulting diff, I may put this in as is,
add the DTD and go full XML, or keep as-is ...

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@xxxxxxxxxxxx  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list