0xDECAFBAD

It's all spinning wheels and self-doubt until the first pot of coffee.

Using web services and XSLT to scrape RSS from HTML



After tinkering a bit last week with web services and XSLT-based scraping to generate RSS from HTML, I ripped out some of the work from a Java-based scraper I'd started last year and threw together a kit of XSLT files that does most everything I was trying to do.

I'm calling this kit XslScraper, and there's further blurbage and download links available in the Wiki. Check it out. I've got shell scripts to run the stuff as a cron job, and CGI scripts to run it all from web services.
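
For anyone who hasn't seen the basic idea of scraping RSS from HTML, here's a rough sketch. Note that it is in Python using only the standard library, not XSLT, and it is not code from XslScraper. The sample page, the `LinkCollector` class, and the `html_to_rss` function are all made up for illustration, and it assumes every link on the page should become a feed item, which a real scraper would narrow down.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect (href, text) pairs for each <a href=...>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently being read, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def html_to_rss(html, title, link):
    """Turn every link found in `html` into an <item> of a bare RSS 2.0 feed."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Scraped from " + link

    for href, text in parser.links:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = text
        ET.SubElement(item, "link").text = href

    return ET.tostring(rss, encoding="unicode")


# Made-up sample page, standing in for whatever site is being scraped.
SAMPLE_HTML = """
<html><body>
  <h1>News</h1>
  <ul>
    <li><a href="http://example.com/a">First story</a></li>
    <li><a href="http://example.com/b">Second story</a></li>
  </ul>
</body></html>
"""

if __name__ == "__main__":
    print(html_to_rss(SAMPLE_HTML, "Example news", "http://example.com/"))
```

An XSLT version would express the same mapping as templates matching the page's link elements, which is why the HTML usually has to be tidied into well-formed XML first.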

For quick gratification, check out these feeds:

