dbagg3, an Atom-powered client/server news aggregator toolkit
=============================================================
$Header: /cvsroot/dbagg3/README,v 1.6 2004/08/12 22:55:18 deusx Exp $

Copyright 2004 by l.m.orchard
This software is licensed under the same terms as Python itself.

* * *

INTRODUCTION
---------------------------------------------------------------------------

Welcome to my third experimental feed aggregator in Python. Be warned
that things are in an interesting, yet barely functional, state.

The goal of this little kit is to make it easy to get an aggregator up
and running, so that time can be spent tinkering with new ideas in GUI
design, information presentation, aggregation patterns, and other such
things.

INSTALLATION
---------------------------------------------------------------------------

You're going to need the following prerequisites:

* Python 2.3 (or higher)
* MySQL or SQLite
    * MySQLdb - http://sourceforge.net/projects/mysql-python
    * PySQLite - http://pysqlite.sourceforge.net/
* SQLObject - http://sqlobject.org/
    * Currently, a patch to support SELECT DISTINCT is required:
      http://sourceforge.net/mailarchive/message.php?msg_id=9122066
* libxml2, libxslt, and Python bindings -
  http://xmlsoft.org/XSLT/python.html

You can use either MySQL or SQLite. If you are using MySQL, create a new
database and user. For example:

    $ mysqladmin -uroot -p create feedreactor
    $ mysql -uroot -p -e'GRANT ALL PRIVILEGES ON feedreactor.*
        TO feedreactor@localhost IDENTIFIED BY "somepass"'

Then, update conf/dbagg3.conf with the database details:

    [data]
    driver = mysql

    [data_mysql]
    host = localhost
    db = feedreactor
    user = feedreactor
    passwd = somepass

Alternatively, if you are using SQLite, modify conf/dbagg3.conf like so:

    [data]
    driver = sqlite

    [data_sqlite]
    file = data/feedreactor.db

After that, you can initialize the database tables used.
For MySQL, there is a ready-made SQL dump available, which can be used
like so:

    $ mysql -ufeedreactor -p feedreactor < docs/sql/mysql.sql

However, there is no dump for SQLite, and the MySQL dump may occasionally
be out of sync with bleeding-edge code. In either of these cases, you can
initialize the database directly from the data model classes:

    $ ./bin/dbagg3 init

This should result in something like the following:

    Creating database tables...
        Entry
        Feed
        Login
        Preference
        ScanHistory
        Subscription
        SubscriptionCategory
        SubscriptionEntryNote
        SubscriptionNote
        User
    Adding 'default' and 'admin' users.

USAGE
---------------------------------------------------------------------------

Next, you'll want to import a list of feeds. You can do this in one of
two ways:

1. Use an OPML file exported from another aggregator:
   `$ ./bin/dbagg3 subsopmlimport mySubscriptions.opml`

2. Create a text file listing feed URLs, one per line:
   `$ ./bin/dbagg3 subsimport feeds.txt`

This may take a little while, as each feed is scanned and its entries are
processed for the first time.

If you have individual feeds that you'd like to add, you can do so like
this:

    $ ./bin/dbagg3 subsadd http://www.decafbad.com/blog/atom.xml

Later, you might want to get your list of subscribed feeds back out:

    $ ./bin/dbagg3 subsopmlexport mySubscriptions.opml
    $ ./bin/dbagg3 subsexport feeds.txt

After loading up some feeds, you'll probably want to schedule regular
feed update scans. You can perform an update scan with the following
command:

    $ ./bin/dbagg3 scanupdate

And if you want to schedule it in your crontab, something like this would
work:

    33 * * * * (cd $HOME/Development/dbagg3; nice -n 19 ./bin/hourly.sh)

Note that although the above schedule runs an update scan every hour,
this does not mean that all feeds will be fetched every hour. The period
between scans of any individual feed varies according to whether new
items were found on the previous scan.
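
As a rough illustration of that idea, here is a minimal sketch of a
per-feed scan period that changes with the result of the previous scan.
It is not taken from the dbagg3 source: the function names, the halving
and doubling rule, the direction of the adjustment, and the one-hour and
one-day limits are all assumptions made for the example.

```python
# Hypothetical sketch only; dbagg3's real scheduling rules may differ.
MIN_PERIOD = 60 * 60         # assumed floor: one hour, in seconds
MAX_PERIOD = 24 * 60 * 60    # assumed ceiling: one day, in seconds


def next_scan_period(current_period, found_new_items):
    """Return how many seconds to wait before scanning a feed again."""
    if found_new_items:
        # The feed looks active: check back sooner, down to the floor.
        return max(MIN_PERIOD, current_period // 2)
    # Nothing new last time: wait longer, up to the ceiling.
    return min(MAX_PERIOD, current_period * 2)


def feed_is_due(last_scan_time, current_period, now):
    """True if at least current_period seconds have passed since the last scan."""
    return now - last_scan_time >= current_period
```

Under a rule like this, an hourly cron job would only fetch the feeds for
which feed_is_due() is true, so a quiet feed drifts toward one fetch a
day while a busy one stays close to hourly.
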
In short, the scanupdate command just fires up a scheduler, and doesn't
necessarily hammer all of your feeds.

However, if for some reason you *do* want to scan all of your feeds, this
command does what you want:

    $ ./bin/dbagg3 scanall

At the moment, the web interface is in a bit of flux. However, you can
generate an HTML dump of new items over the last 12 hours with the
following:

    $ ./bin/dbagg3 gennew htdocs/news.html

That's it for now. Watch this space for future updates!

Happy Hacking!
Share and Enjoy!