Apache Content-Type nightmare


The problem:

  1. Debian has configured Apache such that it adds Content-Type: … charset=iso-8859-1 to the HTTP response headers of all files with unknown types. This overrides the <meta http-equiv…charset=utf-8> line in my website, which sets the character set to UTF-8, and thus breaks the handling of non-ASCII characters (résumé comes out garbled). I would consider disabling it, but it does exist for a reason. It is also the default for Apache 2.0.
  2. My XSLTs are configured NOT to include the <?xml version="1.0" encoding="…"?> stanza (the XML declaration) at the beginning of my webpages (more below). When I make my XSLTs produce ISO-8859-1 output without this stanza, my output validation stage fails because the document is not valid UTF-8. The validator suggests using the <?xml…?> stanza to specify the character set.
  3. When I output the <?xml…?> stanza, Opera and IE do not display the page correctly. IE also has a bug where it won’t switch to standards mode (needed to avoid its CSS bugs) if the <?xml…?> stanza is present.
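
To make problems 1 and 2 concrete, here is a minimal Python sketch. The function names are made up for illustration; nothing here is Apache's or the validator's actual code. It assumes the browser trusts the charset in the HTTP header over the <meta> line, and that the validator falls back to UTF-8 when there is no <?xml…?> stanza.

```python
def served_as_latin1(text: str) -> str:
    """Simulate a UTF-8 file that the server labels as ISO-8859-1.

    The bytes on disk are UTF-8, but the browser decodes them
    using the charset from the Content-Type header.
    """
    return text.encode("utf-8").decode("iso-8859-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # Problem 1: each é becomes two junk characters.
    print(served_as_latin1("résumé"))
    # Problem 2: ISO-8859-1 output is not well-formed UTF-8.
    print(is_valid_utf8("résumé".encode("iso-8859-1")))
```

The first call shows why the page looks wrong in the browser; the second shows why a UTF-8-only validation stage rejects ISO-8859-1 output that carries no encoding declaration.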

The solutions seem to be:

  1. Eliminate all non-ASCII characters from the output, and replace them with character references (such as &eacute;). This is obviously evil.
  2. Disable Apache’s Content-Type HTTP header crap (presumably the AddDefaultCharset directive). This is evil: see above.
  3. Forget about the output validation stage. Evil.
  4. Generate ISO-8859-1 with the XML stanza and add an extra stage after output validation that strips off the <?xml…?> stanza. Evil.
  5. Try to find a way to set the content-type of files so Apache sends the proper content-type in the HTTP headers. If done with .htaccess files, it will be a big PITA.
  6. Eliminate non-7-bit characters from the content altogether. Ugh.
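
Options 1 and 4 both boil down to a small post-processing step. Here is a rough Python sketch, assuming the page is already in hand as a string. Both function names are hypothetical, and the first one emits numeric references such as &#233; rather than the named &eacute; form, because that is what Python's built-in xmlcharrefreplace error handler produces.

```python
import re

# Matches an XML declaration at the very start of the document.
# The \s after "xml" keeps it from matching <?xml-stylesheet ...?>.
XML_DECLARATION = re.compile(r"\A<\?xml\s[^?]*\?>\s*")


def to_char_refs(text: str) -> str:
    """Option 1: replace every non-ASCII character with a numeric
    character reference, leaving pure 7-bit output."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def strip_xml_declaration(doc: str) -> str:
    """Option 4: drop a leading <?xml ...?> stanza (and the
    whitespace after it) once validation has passed."""
    return XML_DECLARATION.sub("", doc, count=1)


if __name__ == "__main__":
    print(to_char_refs("résumé"))
    page = '<?xml version="1.0" encoding="ISO-8859-1"?>\n<html/>'
    print(strip_xml_declaration(page))
```

Either step would have to run after the XSLT stage. Stripping the stanza only makes sense after validation, since the validator needs it to know the encoding.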

What a mess.
