0xDECAFBAD

It's all spinning wheels and self-doubt until the first pot of coffee.

On RSS and Namespaces

Dave writes on "RSS and Namespaces":

... there are some XML parsers that don't properly deal with namespace attributes on the top-level element of a source.
Agreed. These parsers are often cheaper to deal with when you know that the format you're expecting doesn't involve namespaces. You trade some flexibility for some ease of development.
For these guys, just introducing an xmlns attribute is enough to make them reject the feed. So while they could handle a 0.92 feed, as soon as we introduced the xmlns attribute, they gave up.
Yes, because they weren't expecting to be fed something with namespaces, since they'd been designed around v0.92 and family, and had been fed v2.0 with the expectation that it was 100% backward-compatible.
...Presumably RSS 1.0 doesn't have the same problem we tripped over yesterday with RSS 2.0. So I looked at a few RSS 1.0 feeds, and guess what, they do the same thing we were doing with the 2.0 feeds. ... I conclude that the same broken parsers that didn't like the 2.0 feeds with the xmlns attributes, must also not like the 1.0 feeds.
And your conclusion would likely be correct - because those parsers weren't expecting to consume namespace-using XML, and they shouldn't be expecting RSS v1.0. If an application is designed with RSS 1.0 in mind, then the author should be using a namespace-aware parser and correctly handle the namespaces, since that's the nature of the beast. To neglect or mishandle namespaces in consuming RSS 1.0 is a mistake.



Admittedly, some applications which apparently consume RSS v1.0 feeds correctly may be broken in this way - this is not unique to RSS v2.0. If they're broken, they need fixing. But that's another story...



So, on to the conclusion:

If this is true, we can't design using namespaces until:



  1. All the parsers are fixed, or


  1. Users/content providers expect and accept this kind of breakage (I don't want to be the one delivering that bit of bad news, got burned not only by the users, but by developers too, people generally don't know about this problem, or if they do know are not being responsible with the info).


Anyway it looks to me like there's a big problem in the strategy of formats that intend to organize around namespaces.

Well, of course, end users should not expect breakage. This is obvious to me. No one really wants that.



The big problem I see in the strategy, though, is this: RSS 2.0 claims to be backward-compatible with the 0.9x family, but the addition of namespaces in XML is enough of a fundamental change to break this. I think what Shelly wrote in RSS-DEV is correct: "Namespace support is NOT a trivial change, and will break several technologies, including PHP if namespace support isn't compiled in. This isn't something that can be hacked out."



When I originally read about the emergence of something called RSS 2.0, I said "Go man, go!" But I also said, "What's the catch?" Well, this appears to be a catch. But I think it can be worked through. This is not a fundamental problem with namespaces themselves. This is a versioning problem, and a problem with anticipating all the implications the new version brings to the table. This goes for RSS 2.0, as well as RSS 1.0.



The first thing is to nail a few things down about version numbers and reverse-compatibility. It's been my experience that, when some thing experiences an increment to its major version number, reverse-compatibility is not guaranteed. So, I would assume that from a v0.94 to a v2.0, things are sufficiently different that using it would require that, indeed, "All the parsers are fixed" to support the new major version. So for the most part, v2.0 follows the v0.94 tradition faithfully, but on this issue it parts ways - and yes, potential consumers of v2.0 feeds will need to adjust from their v0.94 code. Thems the breaks, I've been told, when it comes to major version upgrades.



So, again, I don't think that this is a fundamental flaw with RSS 1.0, RSS 2.0, or namespaces. This is an issue of versioning, understanding the technology's implications, and reverse-compatibility.

shortname=ooobii

Archived Comments

  • Very interesting and thanks for the thoughtful comments. So the problem could be nailed by adding a comment to the 2.0 spec saying something like this (not an attempt at spec text): RSS 2.0 feeds are compatible with 0.91 and 0.92 in that every 0.91 or 0.92 source is also a 2.0 source (modulo the version attribute on the rss element), with the exception that, if it uses namespaces, it may not work with some aggregators that do not correctly parse sources that use namespaces. As of Fall 2002, there are enough incompatible parsers to make this advisory necessary. If you wish to be fully backward compatible, do not use namespaces in your feeds. Does this basically sum up your post?? If so, I think that serves as enough of a red flag for non-namespace aware parsers that some feeds at some time in the future will break (maybe now) and that they should test with 2.0 feeds that do use the namespace compatibility, the sooner the better. Dave
  • The problem is nothing to do with parsers. Its because the introduction of a toplevel default namespace produces a completely different XML document format which CANNOT be backwards compatible.
  • Dave: I think you summed up the post well, thanks! :)
  • Charles, very very interesting!