On Monday, March 7, 2022 7:23:49 AM EST Ævar Arnfjörð Bjarmason wrote:
> I'm not sure I understand this change really. The result in always XML,
> so application/xhtml+xml is redundant, text/html, or both?

To be honest, using an http-equiv="content-type" in XHTML is confusing. When
you do use one, your goal shouldn’t really be to specify the document’s MIME
type. After all, the first three lines of each page say

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">

Those lines are more than enough to determine that something is using XHTML
and UTF-8. Instead, the idea is to help out a parser that is incorrectly
parsing the document as HTML (instead of as XHTML). Historical W3C documents
(that were applicable when http-equiv="content-type" was allowed in XHTML)
[1][2][3] indicate that http-equiv="content-type" should be used like this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

In other words, to use http-equiv="content-type" properly in XHTML, you had
to lie about the document’s type. The fact that this is confusing is probably
part of why WHATWG disallowed it in the HTML Standard.

> But aside from that: I have seen browsers get the lack of encoding=""
> "wrong" with data at rest, don't some still default to ISO-8859-1?
>
> So won't this result in badly decoded data if you save the web page &
> view it locally?

I tested this idea in ungoogled-chromium, Firefox and Pale Moon. Other than
Pale Moon in one specific circumstance, they all used UTF-8 as the encoding.
Pale Moon used windows-1252, but only when the file ended with .html. When
the file ended with .xhtml, Pale Moon used UTF-8.

That being said, we don’t have to use an http-equiv="content-type" to fix the
problem. Instead, we can use a <meta charset="utf-8"> which is allowed by the
HTML Standard [4].
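
For illustration only, here is a rough Python sketch of the two parsing paths discussed above. It is not part of the patch and not how any browser actually works. The page body, the class name and the two function names are made up for the example; only the first three lines of the page come from the real output. It assumes the <meta charset="utf-8"/> is added to <head>.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from typing import Optional

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical page: the three header lines quoted above plus a minimal
# head and body. The body holds "Æ" encoded as UTF-8 (bytes C3 86).
PAGE = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    b'"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">\n'
    b'<head><meta charset="utf-8"/><title>example</title></head>\n'
    b'<body><p>\xc3\x86var</p></body>\n'
    b'</html>\n'
)


class MetaCharsetFinder(HTMLParser):
    """Record the first <meta charset="..."> seen by an HTML parser."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser routes <meta .../> here via handle_startendtag.
        if tag == "meta" and self.charset is None:
            self.charset = dict(attrs).get("charset")


def charset_seen_as_html(page: bytes) -> Optional[str]:
    # Decode as latin-1 only so the ASCII markup can be scanned; this is a
    # stand-in for a parser that does not yet know the real encoding.
    finder = MetaCharsetFinder()
    finder.feed(page.decode("latin-1"))
    return finder.charset


def paragraph_seen_as_xml(page: bytes) -> str:
    # The XML parser takes the encoding from the XML declaration.
    root = ET.fromstring(page)
    return root.find(f".//{XHTML_NS}p").text
```

In this sketch, the XML path decodes the paragraph correctly from the XML declaration alone, while the HTML path ignores that declaration and only learns the encoding from the meta element. Decoding the same bytes as windows-1252 would turn "Ævar" into "Ã†var", which is the kind of breakage the meta element is meant to prevent.
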
[1]: <https://www.w3.org/TR/xhtml1/#C_9>
[2]: <https://www.w3.org/TR/html-polyglot/#character-encoding>
[3]: <https://www.w3.org/Bugs/Public/show_bug.cgi?id=21818>
[4]: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-charset>