It's all spinning wheels and self-doubt until the first pot of coffee.

Linkbacks, robots, laziness, the semantic web, and you.

I just noticed Ghosts of Xanadu, published on Disenchanted, where they analyze the linkback meme and its historic roots. They cover pretty much all the big ideas I've been poking at in my head, and give props to Xanadu. Heck, they even mention Gödel, Escher, Bach (which my girlfriend & I have started reading again) and ant scent trails.

So along with the JavaScript-powered linkback thing, something else I've been thinking about is a little semantic sugar to add to the mix. I keep forgetting to mention it, but what makes Disenchanted's linkback system very good is that Disenchanted "personally visits all pages that point to us and may write a short note that will accompany the returning link." They manually visit and annotate their links back, whereas my site just trundles along publishing blind links.

I'd like to change that with my site. The first thing I'll probably do is set up some triggers to track new referring links to my pages, and maybe give me an interface to queue them up, browse them, visit them, and annotate them.
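As a rough sketch of that first step, here's one way the tracking and queueing might look in Python. Everything in it is my own assumption for illustration: the LinkbackQueue name, its methods, and the idea of keying on a (referrer, target) pair. The "trigger" is just whatever calls record() when a hit comes in.

```python
from collections import OrderedDict


class LinkbackQueue:
    """Hypothetical queue of referring links waiting for a visit and a note."""

    def __init__(self):
        # Maps (referrer, target) -> annotation text, or None if not yet reviewed.
        self._links = OrderedDict()

    def record(self, referrer, target):
        """Record a hit; return True only the first time this pair is seen."""
        key = (referrer, target)
        if key in self._links:
            return False
        self._links[key] = None
        return True

    def pending(self):
        """Links seen but not yet annotated, oldest first."""
        return [key for key, note in self._links.items() if note is None]

    def annotate(self, referrer, target, note):
        """Attach a short note to a link that was recorded earlier."""
        key = (referrer, target)
        if key not in self._links:
            raise KeyError(key)
        self._links[key] = note
```

Repeat hits from the same referrer don't pile up, so the pending list stays a list of genuinely new links to go look at.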

But the second thing is something that would require a little group participation from out there in blogspace. It might not work. Then again, it might catch on like crazy. I want to investigate links back automatically, and generate annotations. I'm lazy and don't want to visit everyone linking to me, which sounds rude, but I think that the best improvements to blogspace come with automation. (In reality, I do tend to obsessively explore the links that show up in my referral log, but bear with me.)

I can respect the manual effort Disenchanted goes through, but I don't wanna. So, I want a robot to travel back up referring links. What will it find there? Well, at present, probably just an HTML page. Likely a weblog entry, maybe a wiki page. What can I reasonably expect to derive from that page? Maybe a title, possibly an author if I inform the robot a bit about what to look for (i.e., some simple scraping rules for blogs I know regularly link to me).
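The title part, at least, is easy to sketch. Here's a minimal version using Python's standard html.parser module. The TitleScraper and scrape_title names are made up for this example, and it assumes the robot has already fetched the page and has the HTML as a string. Per-blog author rules would need something more specific than this.

```python
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Collects the text of the first <title> element in a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace; return None when no title text was found.
        text = " ".join("".join(self._parts).split())
        return text or None


def scrape_title(html):
    """Return the page title from an HTML string, or None if there is none."""
    scraper = TitleScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.title
```

Calling scrape_title() on a page with no title element just gives back None, so the robot can fall back to showing the bare URL.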

What else can I scrape?

Well, if bloggers (or blog software authors) out there help me a bit, I might be able to scrape a whole lot. I just stuck a Wander-Lust button on my weblog, and I read about their blog syndication service. You can throw in specially constructed HTML comments that their robot can scrape to automatically slurp and syndicate some of your content. Not a new idea, but it reminds me.

So bloggers could have their software leave some semantic milk & cookies out for my robot when it wanders back up their referring links. Maybe it could be in a crude HTML comment format.

Or maybe it could be in a bit of embedded RDF. Hmm. Anyone?

What would be useful to go in there? I might like to know a unique URL for the page I'm looking at, versus having many links back to the same blog entry (on the front page, in archives, as an individual page with comments, etc.) I might also like to know who you are, where you're coming from, and maybe (just maybe) a little blurb about why you just linked to me. I'd like to publish all these things along with my link back to you, in order to describe the nature of the link and record the structure we're forming.
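To make that concrete, here's a sketch of what the crude HTML comment flavor could look like, and how a robot might read it. The format is entirely invented for this example: a comment that starts with the word "linkback" and holds "key: value" lines for permalink, author, and blurb. Nobody publishes this today, and the function names are hypothetical too.

```python
import re

# Hypothetical format: <!-- linkback ... --> containing "key: value" lines.
COMMENT_RE = re.compile(r"<!--\s*linkback\b(.*?)-->", re.DOTALL)


def parse_linkback_comment(html):
    """Return a dict of fields from the first linkback comment, or None."""
    match = COMMENT_RE.search(html)
    if match is None:
        return None
    fields = {}
    for line in match.group(1).splitlines():
        # Split on the first colon only, so URLs in the value survive intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip().lower()] = value.strip()
    return fields


def canonical_key(referrer, fields):
    """Prefer the declared permalink; fall back to the raw referrer URL."""
    return (fields or {}).get("permalink") or referrer


sample = """<html><body>
<!-- linkback
permalink: http://example.org/entry/42
author: Jane Blogger
blurb: Agreeing with the robot idea.
-->
</body></html>"""

info = parse_linkback_comment(sample)
```

With canonical_key(), hits from a front page, an archive page, and the individual entry page would all collapse to the same permalink, as long as each copy carries the same comment.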

This seems like another idea blogs could push along, semantic web tech as applied to two-way links.

Of course, the important thing here is laziness. I'm lazy and want to investigate your link to me with a robot. But you're lazy too. There's no way that you'll want to do more work than I do to provide me with the data for my robot to harvest. So... how do we make this as easy as making a link to me is now? Or better yet, can we make it easier to make a richly described link? That would really set some fires.


Archived Comments

  • Scraping information out of comments just seems wrong. This could actually be one of the first practical (and maybe even successful) use cases for RDF.
  • My thoughts precisely!
  • You don't even need RDF, you could put the same data into meta tags with Dublin Core names.
  • I have recently fallen in love with Annotea for this exact reason. I've created an annotation gateway to my weblog so that if you have an Annotea client, you can subscribe to my weblog with it and see my entries as you surf the URLs they apply to. See: http://ncyoung.com/permaLink/106 or my URL above, and search the weblog for Annotea.