0xDECAFBAD

It's all spinning wheels and self-doubt until the first pot of coffee.

Microsummaries and Content-Type Mysteries

One of my favorite features of the new Firefox is Microsummaries. They're like RSS-lite: one-liner summaries of web pages that can be used to keep bookmark titles up to date and get succinct info about a page. For example, you could get the latest temperature from the title of a weather report bookmark, or the most recent bid price from an auction page.

But, there's one thing that rubs me the wrong way, and it's this:

<link rel="microsummary" href="microsummary.txt">

That's how you clue a client into finding the microsummary for a given HTML page. Stick that in your head tag, and you're off. It's just like RSS autodiscovery. Well, it is, except for one important detail: what's the Content-Type I should expect at that URL?
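For the discovery half, here's a minimal sketch in Python of how a client might pull those link tags out of a page. The names MicrosummaryLinkFinder and find_microsummary_links are made up for illustration, and the sketch records any type attribute it finds even though the spec doesn't call for one.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MicrosummaryLinkFinder(HTMLParser):
    """Collects (absolute_url, type_attr) pairs from <link rel="microsummary">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before matching.
        rels = (attr.get("rel") or "").lower().split()
        if "microsummary" in rels and attr.get("href"):
            url = urljoin(self.base_url, attr["href"])
            # type is optional (and not in the spec), so it may be None.
            self.links.append((url, attr.get("type")))


def find_microsummary_links(html, base_url):
    """Return every microsummary link declared in the given HTML text."""
    finder = MicrosummaryLinkFinder(base_url)
    finder.feed(html)
    return finder.links
```

Note that nothing in the result says what kind of document sits at the other end of the URL, unless the page author happened to add a type attribute.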

You see, microsummaries can be provided as either a direct URL to a plain text summary of the page, or a URL to an XML-based generator providing the means to extract that plain text summary from the page. But, as spec'ed, you never know which you're going to get until you fetch the URL. So, when trying to handle a microsummary, I never know whether to just use the fetched content directly as plain text or whether it's time to fire up the XSL machinery.

I've seen some sites toss in a type="text/plain" or type="application/xml" attribute - which is very helpful and what I really want to see - but it's not in the spec. From a cursory perusal of Firefox source code, it looks like the browser tries to sniff the Content-Type header returned by the web server - but that sucks, because web servers often lie or are confused about Content-Type. I need to read more into that source code, so I can at least do as well as Firefox does.
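Until I've read that source, here's a rough sketch of the kind of guessing I mean, in Python. It is not what Firefox does. It trusts a type attribute when the page offers one, ignores the Content-Type header entirely, and otherwise sniffs the fetched body. It assumes a generator document is well-formed XML whose root element is named generator; the function names and the set of XML media types are my own picks for illustration.

```python
import xml.etree.ElementTree as ET

# Media types treated as "this is an XML generator" (an assumed list).
XML_TYPES = {"application/xml", "text/xml", "application/x.microsummary+xml"}


def media_type(value):
    """Drop parameters such as '; charset=utf-8' and normalize case."""
    return (value or "").split(";")[0].strip().lower()


def looks_like_generator(body):
    """True if body parses as XML and its root element is named 'generator'."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False
    # Strip a '{namespace}' prefix, if any, before comparing the local name.
    return root.tag.rsplit("}", 1)[-1] == "generator"


def guess_kind(body, type_attr=None):
    """Return 'generator' or 'text' for a fetched microsummary body."""
    hinted = media_type(type_attr)
    if hinted == "text/plain":
        return "text"
    if hinted in XML_TYPES:
        return "generator"
    # No usable hint in the markup: fall back to sniffing the content.
    return "generator" if looks_like_generator(body) else "text"
```

Sniffing the body sidesteps a lying server, at the cost of one throwaway XML parse for every plain text summary.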

Eh, it's a small gripe, but one on which I've spent too much time already.

Archived Comments

  • What's the advantage of sticking it into the markup instead of just using the HTTP headers for the microsummary?

  • Well, in the markup, I'm more confident it'll tell the intended truth. Markup is often easier to change than the server config needed to serve up the correct HTTP headers. I know this is a shady argument - but so far in my small sample set of 8 or so sites, a third of them fed me the wrong Content-Type. That is, "text/HTML" when it was plain text, and "text/HTML" when it was an XML generator. So, I've had to come up with some smart guesses.

    One example, Markdown Monkey. The site header includes a type="application/x.microsummary+xml" for href="http://www.markdownmonkey.com/microsummary.asp", yet the server feeds me a Content-Type: text/html header.

  • And, of course, there's always this old horror story:

    http://www.xml.com/pub/a/2004/07/21/dive.html

  • Indeed, Firefox ignores the type attribute on the <link> tag. It would be great if that attribute weren't necessary because web servers always served the correct type, but that's clearly not the case, and allowing pages to specify the content type of their microsummaries seems likely to be as useful for microsummaries as it is for feeds.

    I have filed bug 358977 on the issue.

  • Myk: Fair enough. Excellent technology, by the way. :)