0xDECAFBAD

It's all spinning wheels and self-doubt until the first pot of coffee.

The right place for data in your feed

I've suddenly gotten very interested in microformats, especially since it struck me that they belong in the final chapter of my book, the one about extending feeds.

I started off writing a bit about mod_event and RVW with respect to extending feed formats with additional metadata, but then the microformat thing hit me: Why extend the feed format metadata when you can extend the feed content? This seems so obvious to me that it's either got to have been covered by someone else already, or it's so wrong that I just haven't seen it yet.

For one thing, extending the format limits your extension to that format, unless you craft it in such a way that makes it compatible with other feed formats. This is not impossible or even hard in some cases, but it's more of a problem of adoption and buy-in than a technical issue. mod_event is an RSS 1.0 extension; how many RSS 2.0 or Atom 0.3 feeds have you seen using some adaptation of mod_event? (Though, to be fair, how many RSS 1.0 feeds have you seen using mod_event in the first place?)

Just what do calendar events have to do with syndication feeds, other than that feeds are a convenient carrier wave? Why should we try to shoehorn calendar data into the fabric of a feed itself?

All the feed formats I care about can carry (X)HTML content, and these microformats are just XHTML content constructed along a certain set of conventions. So, any feed supporting XHTML content can carry microformat-enriched content--without necessarily altering any feed format, convincing feed format authors, or getting buy-in from aggregator developers. (Of course, you might need those developers to actually do something with the microcontent when it arrives, but that's another story.)
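To make that concrete, here's a rough sketch of what "doing something with the microcontent" might look like on the aggregator side. It assumes the entry content is well-formed XHTML and only knows a handful of hCalendar class names (vevent, summary, dtstart, dtend, location). The sample fragment, the event details, and the helper names are all made up for illustration, and a real hCalendar parser would have to handle far more cases than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment, as it might arrive inside a feed entry's content.
ENTRY_CONTENT = """
<div xmlns="http://www.w3.org/1999/xhtml" class="vevent">
  <span class="summary">Feed hacking meetup</span>:
  <abbr class="dtstart" title="2005-08-01T19:00:00">August 1, 7pm</abbr> at
  <span class="location">Some coffee shop</span>
</div>
"""

# Only a small subset of hCalendar properties, to keep the sketch short.
HCAL_PROPS = ("summary", "dtstart", "dtend", "location")


def classes(el):
    """Return the list of class names on an element."""
    return el.get("class", "").split()


def extract_events(xhtml):
    """Collect one dict per element marked up with class="vevent"."""
    root = ET.fromstring(xhtml)
    events = []
    for el in root.iter():
        if "vevent" not in classes(el):
            continue
        event = {}
        for child in el.iter():
            for prop in HCAL_PROPS:
                if prop in classes(child) and prop not in event:
                    # <abbr> keeps the machine-readable value in its title
                    # attribute; other elements use their text content.
                    if child.tag.endswith("abbr") and child.get("title"):
                        value = child.get("title")
                    else:
                        value = "".join(child.itertext())
                    event[prop] = value.strip()
        events.append(event)
    return events


events = extract_events(ENTRY_CONTENT)
print(events)
```

Nothing in there touches the feed format at all: the same function would work on content pulled out of RSS 1.0, RSS 2.0, or Atom, or scraped straight off a web page.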

Leave the feed to manage the business of facilitating push-via-poll with aggregators and decouple the markup and structure of rich microcontent items from feed formats altogether.

Furthermore, this helps address the question of "Is a feed the right place for your data?" My tentative answer to this is, "No, but it doesn't hurt if it's in there, too."

Think of the feed as just a mechanism to facilitate content delivery--and not as the embodiment of the content itself. Mind you, feed entries do make nice representations of content, covering many common attributes. But, keep feeds themselves constructed in terms of metadata about content. Although items of content might also be carried in full in feed entries, the "real" content should exist somewhere else as well.

Things like mod_event should be used to provide hints to aggregators whose authors, for instance, might not care to bundle in and maintain a full-blown microformat parser.

Think of a MySQL table: You can define columns in a table, and then you can define zero or more indexes on those columns. The data's still there if you don't define any indexes, and queries are still possible--it's just that queries are more laborious. And, your indexes don't need to be matched up one-to-one with your columns--they can be concatenations of columns, or just about whatever else you want.

Consider mod_event like an index for aggregators on hCalendar content--hCalendar defines the columns, mod_event facilitates better routing/filtering. Although getting buy-in from feed format authors and feed aggregator developers helps get indexes created, you don't necessarily need that to get the content into the feed in the first place.
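Here's a small sketch of that index idea from an aggregator's point of view: check for a mod_event hint first, and only fall back to digging through the hCalendar content when the hint isn't there. The item fragments are invented for illustration, the function names are hypothetical, and I'm assuming the XHTML rides along in content:encoded (the RSS 1.0 content module) and is well-formed enough to parse as XML.

```python
import xml.etree.ElementTree as ET

# Namespaces for mod_event and the RSS 1.0 content module.
EV = "{http://purl.org/rss/1.0/modules/event/}"
CONTENT = "{http://purl.org/rss/1.0/modules/content/}"

# Hypothetical hCalendar markup carried in the item's content.
HCAL = ('<div class="vevent"><abbr class="dtstart" '
        'title="2005-08-01T19:00:00">August 1</abbr></div>')


def make_item(with_hint):
    """Build a sample RSS 1.0 item, with or without the mod_event hint."""
    hint = "<ev:startdate>2005-08-01T19:00:00</ev:startdate>" if with_hint else ""
    return ET.fromstring(
        '<item xmlns="http://purl.org/rss/1.0/" '
        'xmlns:ev="http://purl.org/rss/1.0/modules/event/" '
        'xmlns:content="http://purl.org/rss/1.0/modules/content/">'
        f"{hint}<content:encoded><![CDATA[{HCAL}]]></content:encoded></item>"
    )


def start_from_hcalendar(xhtml):
    """Slow path (no index): scan the content for an hCalendar dtstart."""
    root = ET.fromstring(xhtml)
    for el in root.iter():
        if "dtstart" in el.get("class", "").split():
            return el.get("title") or "".join(el.itertext()).strip()
    return None


def event_start(item):
    """Return (start, how_found) for one feed item element."""
    hint = item.findtext(EV + "startdate")
    if hint:
        # Fast path: the feed-level hint acts like an index.
        return hint.strip(), "mod_event"
    body = item.findtext(CONTENT + "encoded")
    if body:
        start = start_from_hcalendar(body)
        if start:
            return start, "hcalendar"
    return None, "none"


print(event_start(make_item(with_hint=True)))
print(event_start(make_item(with_hint=False)))
```

Either way the aggregator ends up with the same start date. The hint just saves it from having to parse the content to get there.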

Okay... I think all that made sense. I had to get that spewed out of my head before I lost it. :)

Archived Comments

  • I'm a fan of microformats, but this seems so wrong. Microformats are a "worse is better" solution for getting rich data onto the web for people without better tools than a simple CMS. They're useful to large-scale aggregators (e.g. PubSub or Technorati) who are already maintaining crawlers. (Though I think the real promise lies not with generalists, but with domain-specific aggregators developed by small affinity groups.) If you do have good enough tools to provide the data in a more structured way, then by all means *do* it: it's more elegant, easier to consume, more meaningful, less ambiguous, easier to grok, etc. mod_event is *much* more useful than hCalendar in the context of feeds. Perhaps that's non-obvious because you're thinking of it from the context of S-to-P (site to person) syndication, but where it comes in very useful is S-to-S (site to site) syndication, the original RSS design goal. Additionally, the joy of XML means you can mix in other namespaces to enrich mod_event's intentionally limited expressiveness. In particular, we've played with mixing in vCard and geo namespaces to get more specific. As for your last question, I've seen several hundred feeds with mod_event, but I'm kind of odd like that :)
  • Exactly -- from the perspective of trying to maximize usefulness, it's great to put your "semantic markup" in both the feed and the data. It depends on your target market, of course... if nobody will ever scrape your site but you have an aggregator of calendar events watching your feed like a hawk, then the feed is most important. But if you're just kicking off a new market, you might as well do it in a way that makes the most sense for the future...which is what this seems to be.
  • You've made an excellent observation. Does one have to request permission from the envelope makers before one writes a letter in a different way? Does it make sense to ask everyone to write envelopes differently just because you've figured out a new way to write letters or new things to put in your letters? Does it make sense to feel obligated to duplicate the information in your letter on the envelope as well? Of course not. In many ways, feeds are nothing more than another medium for passing messages, and just as there was no need to change nor extend TCP/IP to handle HTTP or HTML for that matter, there is no need to change RSS/Atom to handle new types of content which are well described by portable microformats.
    Tantek
    P.S. I can't believe anyone used the word "joy" and "namespaces" in the same sentence (other than to make that observation). :)