FeedBurner feeds give heartburn to PHP XML parsers?
Final Update: Actually, it turns out that feeding raw gzip streams that were never decompressed to PHP XML parsers causes heartburn. Go figure. I guess that teaches me to poke the lazyweb with a stick.
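A minimal sketch of a guard against that mistake, assuming a PHP 7+ build with the zlib and SimpleXML extensions (gzdecode only exists from PHP 5.4 on). The helper name feed_body_to_xml is made up for illustration and is not part of FeedMagick or MagpieRSS:

```php
<?php
// Hypothetical helper: return XML text whether or not the fetched body
// is still gzip-compressed.
function feed_body_to_xml(string $body): string
{
    // A gzip stream always starts with the magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded === false) {
            throw new RuntimeException('Body looked like gzip but could not be decoded');
        }
        return $decoded;
    }
    // Anything else is assumed to be plain XML already.
    return $body;
}

// Simulate a server that answered with a gzip-compressed feed.
$xml = '<?xml version="1.0"?><rss version="2.0"><channel><title>Example</title></channel></rss>';
$compressed = gzencode($xml);

// Handing $compressed straight to the parser fails; decoding first works.
$feed = simplexml_load_string(feed_body_to_xml($compressed));
echo $feed->channel->title, "\n";
```

The check only looks at the first two bytes, so it is a heuristic rather than a substitute for reading the Content-Encoding response header.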
I'm trying to chase this bug down, but is it just me or do FeedBurner feeds give trouble to PHP XML parsers? (No, it's just me.) These three scripts do not break (I originally wrote "break"):
- FeedBurner feed - http://feeds.feedburner.com/AVc
- JSON via Magpie parsing - http://decafbad.com/2005/12/FeedMagick/www-bin/as-json.php?in=http%3A%2F%2Ffeeds.feedburner.com%2FAVc
- Passthrough with SAX parsing - http://decafbad.com/2005/12/FeedMagick/www-bin/passthru-sax.php?in=http%3A%2F%2Ffeeds.feedburner.com%2FAVc
- Passthrough with DOM parsing - http://decafbad.com/2005/12/FeedMagick/www-bin/passthru-dom.php?in=http%3A%2F%2Ffeeds.feedburner.com%2FAVc
I'm willing to blame my own ineptitude, except that the error in MagpieRSS parsing happens in the bowels of that beast, right where the XML parsing happens... and that's not my code.
Update: Actually, I think it's a different problem, but just one that FeedBurner feeds all happen to trigger. Maybe something to do with the initial tag separated from the XML declaration by some whitespace?
Update: By the by, I don't really think MagpieRSS is a beast. Although, I do feel like I'm in the belly of one when I'm wandering through PHP code in general. And the "not my code" bit is mostly me trying to figure out what's breaking where and why—since Magpie works in other cases except where I'm abusing it. More like head-scratching italics, not finger-pointing italics. Seems like my code's breaking it, but it's not breaking in my code.
Among the things to twiddle would be the user-agent string you're sending, since the original point of FeedBurner was to decide what was best for various UAs and vary what's sent accordingly. For what little it's worth, it works for me in Gregarius (which is some 0.7x MagpieRSS, though I'm not sure quite which).
there are 2 different PHP XML parsers (expat based and libxml based) depending on which version of PHP you're using, and a number of minor inconsistencies in the various implementations between versions using the same parser. without knowing which PHP it's impossible to speculate.
though Phil's suggestion seems like a good place to start.
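For reporting which PHP and which libxml are in play, a small diagnostic along these lines might help. It is only a sketch: the function name xml_environment is made up, and it reports versions without proving which backend the xml extension was actually compiled against.

```php
<?php
// Hypothetical diagnostic: collect version details worth quoting in a bug report.
function xml_environment(): array
{
    return [
        'php'        => PHP_VERSION,
        // LIBXML_DOTTED_VERSION is only defined when the libxml extension is loaded.
        'libxml'     => defined('LIBXML_DOTTED_VERSION') ? LIBXML_DOTTED_VERSION : null,
        // True when the SAX-style xml_parser_* functions are available.
        'xml_parser' => function_exists('xml_parser_create'),
    ];
}

print_r(xml_environment());
```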
Kellan: Eek, the "that's not my code" bit wasn't meant as any slight against MagpieRSS... nor was calling it a beast. Though after a night's sleep, it sure looks that way!
Would you mind giving me more details as to what you think is giving the parser fits? I'm pretty sure the XML we send down is valid, but we've had to put workarounds in place for other user-agents. Right now, the only thing we have special for Magpie RSS is that we don't serve Atom to that user-agent if the version is 0.5.
If there's something we can fix on our side that would reduce breakage, I'm all for it.
Eric Lunt, CTO, FeedBurner
I probably should have done some more digging before posting, but I just ran into this problem and figured I'd throw up a flag hoping someone on the lazyweb knew just what the problem is.
I'm going to be poking into it more as I have time, though at the moment I don't even know enough to say for sure that FeedBurner's the culprit--just that every FeedBurner-processed feed triggers my bug. Since no one else seems to know what the issue is right off the bat, my current thought is that it's something I've fumbled. :)
Eric: Thanks for the fast response! It makes me think even more that the issue's lurking in my own code, rather than in FeedBurner, since I figure you guys would have fixed an overall issue big enough to affect all PHP XML parsers :)
I wrote a simple little PHP 5.1-based RSS parser a while back and it doesn't have any problems with that http://feeds.feedburner.com/AVc feed.
You can see the parser code here:
And if you want to see how test_rss.php uses it, see:
Man, I got a response from key people across the board on this—author of MagpieRSS, CTO of FeedBurner, and the PHP Man himself! I am now thoroughly convinced that the problem firmly lies within whatever odd things I'm doing in my code. :) Now I just have to get the spare time to figure out what's up.
Thanks for the responses, guys! I couldn't pay for this kind of support.
Funny that I just encountered this problem, too. I discovered that my PHP script was using cURL to fetch a feed that was REDIRECTING to FeedBurner, and cURL was crapping out unless I set a cURL option to follow redirects (whereas fopen would follow the redirect by default).
After I did that, I had no problem.
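A rough sketch of that kind of fetch, assuming PHP 7+ with the curl extension. The function names and the redirect cap of 5 are illustrative choices, not what the commenter actually used:

```php
<?php
// Hypothetical helper: cURL options for fetching a feed that may redirect.
function feed_curl_options(): array
{
    return [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects (off by default)
        CURLOPT_MAXREDIRS      => 5,    // arbitrary cap to avoid redirect loops
        CURLOPT_ENCODING       => '',   // accept and decode any encoding cURL supports
    ];
}

// Hypothetical helper: fetch a feed body or throw with cURL's error message.
function fetch_feed(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, feed_curl_options());
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Fetch failed: ' . curl_error($ch));
    }
    return $body;
}

// Usage (needs network access):
// $xml = fetch_feed('http://feeds.feedburner.com/AVc');
```

Setting CURLOPT_ENCODING to an empty string also makes cURL decompress gzip responses itself, which covers the raw gzip problem from the top of the post.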