Solution to the Apache Content-Type nightmare

Apache

I’ve found a solution that I deem acceptable for the Apache Content-Type nightmare I described earlier. Since .htaccess files apply not only to their own directories but to all subdirectories as well, I added the following stanza to my .htaccess file to force the character set of all files ending in .html to UTF-8:

# .htaccess

AddCharset UTF-8 .html

I have also used a trick from here that describes how to serve the MIME type application/xhtml+xml, the recommended (and eventually required) MIME type for XHTML pages, to browsers that support it. Namely:

# .htaccess continued
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
RewriteCond %{HTTP_ACCEPT} !application/xhtml\+xml\s*;\s*q=0
RewriteCond %{REQUEST_URI} \.html$
RewriteCond %{THE_REQUEST} HTTP/1\.1
RewriteRule .* - [T=application/xhtml+xml]
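
To make the logic of those conditions easier to follow, here is a minimal Python sketch of what they check. It is only an illustration, not how Apache implements it: the function name forced_type and the constant names are made up, and the patterns are my reading of the RewriteCond lines with the regex metacharacters escaped.

```python
import re
from typing import Optional

# Patterns modeled on the four RewriteCond lines (hypothetical names).
ACCEPTS_XHTML = re.compile(r"application/xhtml\+xml")
REJECTS_XHTML = re.compile(r"application/xhtml\+xml\s*;\s*q=0")
HTML_SUFFIX = re.compile(r"\.html$")
HTTP_11 = re.compile(r"HTTP/1\.1")

XHTML_TYPE = "application/xhtml+xml"


def forced_type(accept: str, request_uri: str, request_line: str) -> Optional[str]:
    """Return the MIME type the rule would force, or None if it does not fire."""
    # RewriteCond patterns are unanchored, so search() is the closest match.
    if not ACCEPTS_XHTML.search(accept):
        return None  # browser never mentions application/xhtml+xml
    if REJECTS_XHTML.search(accept):
        return None  # browser lists the type with a q value starting with 0
    if not HTML_SUFFIX.search(request_uri):
        return None  # only files ending in .html are affected
    if not HTTP_11.search(request_line):
        return None  # only HTTP/1.1 requests are affected
    return XHTML_TYPE


if __name__ == "__main__":
    line = "GET /index.html HTTP/1.1"
    print(forced_type("text/html,application/xhtml+xml", "/index.html", line))
    print(forced_type("text/html", "/index.html", line))
```

One thing the sketch makes visible: because the q=0 pattern is unanchored, an Accept header containing application/xhtml+xml;q=0.5 is also treated as a refusal.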

Note to self

Make

When using GNU Make, be sure to use the $(CURDIR) variable instead of the $(PWD) variable, as the former properly handles recursive builds when using -C. In fact, the latter isn’t even documented in the manual.

Apache Content-Type nightmare

Apache, HTML, Unix

The problem:

  1. Debian has configured Apache to add Content-Type: … charset=iso-8859-1 to the HTTP response headers of all files with unknown types. This overrides the <meta http-equiv…charset=utf-8> line in my website that sets the character set to UTF-8, and thus breaks the handling of extended ASCII characters (making résumé appear incorrectly). I would consider disabling it, but it does exist for a reason. It is also the default for Apache 2.0.
  2. My XSLTs are configured NOT to include the <?xml version="1.0" encoding="…"?> stanza at the beginning of my webpages (more below). When I make my XSLTs produce ISO-8859-1 output without this stanza, my output validation stage fails because the document is not UTF-8. It suggests using the <?xml…?> stanza to specify the character set.
  3. When I output the <?xml…?> stanza, Opera and IE do not display the page correctly. IE also has a bug where it won’t turn on strict conformance mode (to eliminate CSS bugs) if the <?xml…?> stanza exists.
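
For the first problem, a small Python sketch shows what goes wrong when a browser trusts the iso-8859-1 header but the page is really UTF-8. The variable names are made up for illustration.

```python
# "résumé", written with escapes so the example does not depend on file encoding.
text = "r\u00e9sum\u00e9"

# What the server actually sends: each é becomes two bytes in UTF-8.
utf8_bytes = text.encode("utf-8")

# What the browser shows if it believes the charset=iso-8859-1 header:
# every byte is treated as one character.
misread = utf8_bytes.decode("iso-8859-1")

print(len(text), len(utf8_bytes))  # 6 characters, 8 bytes
print(misread)                     # rÃ©sumÃ©
```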

The solutions seem to be:

  1. Eliminate all extended ASCII from output, and replace it with character references (such as &eacute;). This is obviously evil.
  2. Disable Apache’s Content-Type HTTP header crap. This is evil: see above.
  3. Forget about the output validation stage. Evil.
  4. Generate ISO-8859-1 with the XML stanza and add an extra stage after output validation that strips off the <?xml…?> stanza. Evil.
  5. Try to find a way to set the content-type of files so Apache sends the proper content-type in the HTTP headers. If done with .htaccess files, it will be a big PITA.
  6. Eliminate non-7bit ASCII altogether. Ugh.
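
For reference, option 1 would amount to something like the following Python sketch. It is not part of my XSLT pipeline, and the helper name to_named_entities is made up. It uses only the standard library: the xmlcharrefreplace error handler for numeric references and html.entities.codepoint2name for named ones.

```python
from html.entities import codepoint2name

text = "r\u00e9sum\u00e9"  # "résumé"

# Numeric character references: anything outside ASCII becomes &#NNN;
numeric = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def to_named_entities(s: str) -> str:
    """Replace non-ASCII characters with named references where one exists."""
    out = []
    for ch in s:
        code = ord(ch)
        if code < 128:
            out.append(ch)
        elif code in codepoint2name:
            out.append("&" + codepoint2name[code] + ";")
        else:
            out.append("&#%d;" % code)  # no named entity: fall back to numeric
    return "".join(out)


print(numeric)                  # r&#233;sum&#233;
print(to_named_entities(text))  # r&eacute;sum&eacute;
```

Either form is pure 7-bit ASCII, so the charset header no longer matters, but every page has to pass through this extra step.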

What a mess.

Overriding DTD location in xmllint

XML

Aha! I found out how to override the location of DTDs using xmllint, part of GNOME’s libxml: use catalogs. Of course, it would be nice not to have to specify it via an environment variable, but whatever.

XDocs (InfoPath)

Apache, HTML, XML

I keep seeing XDocs (a.k.a. InfoPath) popping up in people’s blogs, such as here and here. Maybe, just maybe, the tiniest bit of code that I wrote for XDocs way-back-when will make it through to the shipped version. *sigh* It sure would have been nice to get a ship award.

In other news, I would like to start a jihad against exposing the extension of (most) files on websites. That’s what MIME types and directories with default files are for! For example, DON’T link to http://www.deez.info/sengelha/index.html; link to http://www.deez.info/sengelha/ instead. That way, if I suddenly decide to rewrite my website using PHP, everyone’s links still work. It will even work if I change everything to client-side XML/XSLT transformations, which will result in a change in the MIME type of the document.

If you don’t want to make one directory per file, then create an extensionless filename and use the following snippet in your .htaccess file to force the MIME type of the file to a particular value (for Apache):

<Files rss091>
ForceType application/rss+xml
</Files>

<Files rss1>
ForceType application/rss+xml
</Files>

The main downside to the above is that some URLs end with slashes and some do not. I’ll think about this one and get back to you.

XML Usage

XML

When I use XML, I tend to find myself inventing my own DTD which specifically suits my purposes, and then (when desirable) I use an XSLT transform to produce the ‘standard’ format. This is usually because the standard does not support a set of features which I desire. The downside to this is obvious.

However, many DTDs (such as this one) provide extensibility through parameter entities, something I’m still not 100% clear on. I may be able to use namespaces for this as well; I need to read more about this. Hmm, I wonder, are namespaced nodes not considered in the validation of outer elements?

Also, I wish xsltproc supported the Xalan-like feature of overriding the location of DTDs by public identifier.

Aha! A reason why my mail may not be getting through to AOL!

Email, Mutt

I use mutt as my e-mail reader, with a patch I wrote that makes mutt use libESMTP to send mail. This ensures that my e-mail goes through Yahoo’s e-mail servers, the proper behavior for my Yahoo account. However, it seems that there is a bug somewhere, as my e-mail’s envelope From address is being set to MAILER-DAEMON. I’m not sure if it is doing this for all e-mails, or on all e-mail servers, but I wouldn’t be surprised if this is why my e-mail to AOL addresses isn’t getting through.
