Hi André,

Firstly, thank you very much for your email - the speed with which you responded is much appreciated.

I am using Notepad purely to simplify and focus on the problem at hand. The actual HTML files are created from a Web Publishing system that uses XML and XSL. The user populates the XML via an Applet and when they save the file it is automatically transformed using the XSL into HTML. These final pages exhibit the same problem I have described when using Notepad.

And yes, the .shtml file does include the Meta tag you describe!

Regards
Christopher Biggs

----- Original Message -----
From: "André Warnier" <aw@xxxxxxxxxx>
To: users@xxxxxxxxxxxxxxxx
Sent: Wednesday, 7 October, 2009 09:55:33 GMT +00:00 GMT Britain, Ireland, Portugal
Subject: Re: Using SSI to include a UTF-8 encoded file causes a strange character to be sent to the browser

Hi.

Chris Biggs wrote:
...
> When these files are saved as "ANSI" (using Notepad)

(or rather in this case, as UTF-8)

Tips :
1) *don't use Notepad to edit HTML pages*. Use a real editor, properly aware of character sets and encodings, and which will highlight incorrect UTF-8 characters.
Notepad has a big problem when saving UTF-8 encoded files : it writes a "BOM" at the beginning of the file, which is not only totally unnecessary for UTF-8, but also confuses other programs.
A BOM is a sequence of 2 or 3 bytes, meant in some cases to indicate the "byte order" of the file that follows. For UTF-8, there is only one valid byte order, so the BOM is not necessary and could/should be ignored.
However, when such a file with a BOM prefix is being included by some software in the middle of another file (as you do with SSI), it usually causes the kind of problem you are seeing : "bizarre" characters in the middle.

2) use a proper
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
in the <head> section of your html files.
That should tell the browser what the encoding of the page is.
3) But this is really only a substitute for the real standard-conformant way of indicating the encoding to the browser : the webserver should send, with each html page, an HTTP header like :
Content-type: text/html; charset=UTF-8
Unfortunately, MS's IE (all versions and sub-versions) has a long history of ignoring or misinterpreting this part of the HTTP RFC, and deciding for itself what content the document has. This is *wrong*, but unfortunately also, in the real world IE is much used, so one has to learn to work around this.

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
   "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx
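
To illustrate tip 1 above, here is a minimal Python sketch of why a BOM-prefixed UTF-8 file produces stray characters once it is spliced into the middle of another file, and one way a publishing step could strip the BOM beforehand. The function name strip_utf8_bom and the sample page contents are hypothetical, and the include is simulated by plain byte concatenation, which is an assumption about the effect of the splice rather than a description of how the SSI module is implemented.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical page and included fragment; the fragment was saved with a BOM,
# as Notepad does when asked to save as UTF-8.
page = "<p>before</p>".encode("utf-8")
fragment_with_bom = codecs.BOM_UTF8 + "<p>included</p>".encode("utf-8")

# Simulated include: the fragment's bytes are placed after the page's bytes.
naive = page + fragment_with_bom
cleaned = page + strip_utf8_bom(fragment_with_bom)

# In the naive result the BOM ends up mid-document. Decoded as UTF-8 it is the
# character U+FEFF; decoded as Latin-1 the same three bytes show up as "ï»¿".
print("\ufeff" in naive.decode("utf-8"))
print("ï»¿" in naive.decode("latin-1"))
print("\ufeff" in cleaned.decode("utf-8"))
```

A step like this could be run over the generated files before they are included, if the tool that writes them cannot be told to omit the BOM.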