An end to my referrer abuse
Amen. I’ve always found it irritating that news aggregators insert their URL into the referrer field. ... It would be nice if there was some sort of browser header the aggregator could send to identify itself instead of using the referrer field. Oh, that’s right, there is. It’s called User-Agent. The user agent field is designed for browsers, robots, and other user agents to identify themselves to the Web server. You can even add additional information, like a contact URL or email address. I’d like to see aggregators start using it.
Hmm, being mostly a standards neophyte, I thought this was a great idea, you know, NeatLikeDigitalWatches. I thought this was more a semi-clever overloading of the referer, rather than outright abuse. And this, I thought, was reasonably okay since there wasn't, I thought, anywhere else to stick a backlink to myself while consuming RSS feeds.
Well, yeah, now that I read some of the complaints against this use of referers, I agree. And, yes, now that I read the fine RFC, I see that the User-Agent string is more appropriate for this purpose.
So! From now on, hits from my copy of AmphetaDesk will leave behind a User-Agent string similar to this:
"AmphetaDesk/0.93 (darwin; http://www.disobey.com/amphetadesk/; http://www.decafbad.com/thanks-for-feeding-me.phtml)"
I tack my own personal thanks URL onto the end of the list within the parentheses. In addition, I no longer send a referrer string when I download RSS feeds. How did I do it? Very simply.
First, I modified my AmphetaDesk/data/mySettings.xml file by hand to supply a blank referer and a new user URL (having some angle-bracket problems, bear with me):
[user]
...
[http_referer][/http_referer]
[user_url]http://www.decafbad.com/thanks-for-feeding-me.phtml[/user_url]
...
[/user]
Second, I modified AmphetaDesk/lib/AmphetaDesk/Settings.pm to account for the new setting:
...
$SETTINGS{user_http_referer} = "http://www.disobey.com/amphetadesk/";
$SETTINGS{user_user_url} = "http://www.disobey.com/amphetadesk/";
$SETTINGS{user_link_target} = "_blank";
...
Third, I modified the create_ua() subroutine in AmphetaDesk/lib/AmphetaDesk/WWW.pm to actually use the new setting:
sub create_ua {
    ...
    my $ua = LWP::UserAgent->new;
    $ua->env_proxy();
    $ua->timeout(get_setting("user_request_timeout"));

    # Pull the app details plus the new user URL out of the settings.
    my ($app_v, $app_u, $app_o, $user_u) = (
        get_setting("app_version"),
        get_setting("app_url"),
        get_setting("app_os"),
        get_setting("user_user_url"),
    );
    $ua->agent("AmphetaDesk/$app_v ($app_o; $app_u; $user_u)");
    ...
}
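If you'd like to play with the string format outside of AmphetaDesk, here's a minimal standalone sketch. The build_agent() helper and the %settings hash are hypothetical stand-ins for illustration only; the real code goes through get_setting() as in the snippet above. The sketch also skips any empty field, which is an assumption about what you'd probably want rather than something the modification above does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AmphetaDesk): build a User-Agent string
# in the "Name/version (os; app URL; user URL)" shape used in this post.
sub build_agent {
    my (%s) = @_;

    # Keep only the comment fields that are defined and non-empty, so a
    # blank user URL does not leave a dangling "; " inside the parentheses.
    my @fields = grep { defined($_) && length($_) }
                 @s{qw(app_os app_url user_user_url)};

    return "AmphetaDesk/$s{app_version} (" . join("; ", @fields) . ")";
}

# Illustrative values, taken from the example string earlier in this post.
my %settings = (
    app_version   => "0.93",
    app_os        => "darwin",
    app_url       => "http://www.disobey.com/amphetadesk/",
    user_user_url => "http://www.decafbad.com/thanks-for-feeding-me.phtml",
);

print build_agent(%settings), "\n";
```

With the values above this prints the same User-Agent string quoted earlier; blank out user_user_url and the third field simply disappears.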
And voila - no more referer abuse. If you want to discover my thank-you message, examine the User-Agent string. Seems like this would be a good idea for all news aggregators to pick up. And if I get ambitious and find some spare time, I'll send a patch off to Morbus & friends later today.
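On the receiving end, digging the URLs back out of a User-Agent string could look something like the rough sketch below. agent_urls() is a hypothetical helper, not anything from AmphetaDesk or LWP, and it assumes the aggregator follows the same "(field; field; field)" layout as the string above. Other clients may format their comment differently, or leave it out entirely.

```perl
use strict;
use warnings;

# Hypothetical helper: pull any http(s) URLs out of the parenthesized
# comment of a User-Agent string. Assumes fields are separated by ";".
sub agent_urls {
    my ($agent) = @_;

    # Grab the text inside the first pair of parentheses; bail out with an
    # empty list if the string has no comment at all.
    my ($comment) = $agent =~ /\(([^)]*)\)/ or return;

    # Split on ";", trim whitespace, and keep only things that look like URLs.
    return grep { m{^https?://}i }
           map  { my $f = $_; $f =~ s/^\s+|\s+$//g; $f }
           split /;/, $comment;
}

# Example string copied from earlier in this post.
my $agent = "AmphetaDesk/0.93 (darwin; http://www.disobey.com/amphetadesk/; "
          . "http://www.decafbad.com/thanks-for-feeding-me.phtml)";

print "$_\n" for agent_urls($agent);
```

In a log-crunching script, you'd feed this the User-Agent field from each hit and collect whatever URLs come back, much like you would have harvested referers before.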
Update: Gagh! This has been the hardest post to try to format correctly within the fancy schmancy auto-formatting widgets I have piped together. All apologies for content resembling garbage. I think I'll use this excuse in the future whenever I write something completely daft. (Which means I'll be using it a lot, most likely.)