Tangled in the Threads

Jon Udell, May 22, 2002

Personal RSS aggregators

Not long ago, RSS (Rich Site Summary) aggregators were large centralized services: My Netscape, My UserLand, Meerkat. Today, RSS news plays a diminished role at My Netscape. The My UserLand service is decommissioned. Meerkat remains a popular feature of the O'Reilly Network, enabling techies to read and search the standard sources on topics like Linux, XML, and security. But just seven new channels have been added to Meerkat so far this year. Clearly it's not keeping up with the explosion of activity in blogspace.

Has RSS run out of steam? Quite the opposite. There's more action than ever, but it's shifted into a decentralized mode. That's just what the RSS network needed to do in order to truly operate at Internet scale.

I've been running Radio UserLand for about four months now, and the built-in RSS aggregator has dramatically changed how I process information. At first, my list of channels was biased towards "official" sources -- that is, the RSS channels published by newspapers, magazines, and websites in the areas of interest to me. Increasingly, though, I rely on other people who are active in these areas, and who are publishing weblogs to which I can subscribe by way of RSS.

There is a deep principle at work here. The relevance engine that powers the emerging RSS network is, very much like Google's relevance engine, decentralized and ultimately social in nature. The links that Google counts are, as Cory Doctorow has said in a beautiful essay, "made by human beings, doing what they do best, link by link, drip by drip." Similarly the raw output of the online news collective is filtered for me by people doing what they do best: spotting patterns, alerting the tribe.

Years ago I attended a seminar on advanced text processing at which Reuters demonstrated a state-of-the-art news filtering system. Built on a Symbolics Lisp machine, it used natural-language-understanding software to read through Reuters' feeds, which then amounted to some 7000 stories per day, and sort them into bins -- a classification service for which customers paid handsomely. Heuristics in the software enabled it to figure out that a story that mentioned "gold" should land in the precious metals bin -- unless, that is, it mentioned Mr. Gold in which case it probably shouldn't.

That filtering mechanism seemed unlikely to me then, and seems even less likely now. If not today, then soon, we'll see weblogs written by people who professionally follow precious metals and every other imaginable field. They'll do so because they have to stay on top of these fields, and because they want to establish themselves as authoritative in them.

No single person will be completely authoritative in any one area, but that won't matter -- in fact, it's better that way. In the interplay among several weblogs, the whole can be greater than the sum of its parts. Dave Winer calls this the triangulation effect, and on Scripting News he recently pointed to a great example of it. One weblog mentioned that the Mono project was making progress in its support for ADO.NET. Another weblog acknowledged and amplified. Then still another weblog did the same. Dave wrote:

A bunch of Dot-Net developers have weblogs. One of them spotted something happening with the Mono project. Would I have a clue what it meant? No way. I watched the news percolate from one developer to another and they're all ack'ing it. So now I know something is happening. In the past, without such verification, I wouldn't have a clue.

Exactly. And while the people most attuned to this new mode of communication are themselves bloggers, both reading and writing these information flows, there is no requirement that you write a weblog in order to tap into the collective mind. If you simply need to monitor the action, there are a number of personal RSS aggregators. I'd been aware of several of them, and Mike Krus of News Is Free pointed me to a list that included others I hadn't heard of.

Carmen's Headline Viewer

This Windows-based viewer was the first standalone RSS reader I used. As has become typical, it uses a newsreader-like triple-pane layout and encourages you to organize feeds into categories. The software hasn't been updated in a while, though, so its provider lists (that is, preconfigured sets of channels) are out of date, and it won't read newer formats like RSS .92 and 1.0.

AmphetaDesk

AmphetaDesk is an example of something that I thought long ago would be an important model for Internet software. It's written in Perl (compiled to executable form for the convenience of Windows and Mac users), and plays its user interface into a web browser.

AmphetaDesk takes a simple and straightforward approach to serving up your channels. What's appealing to me is that, because it's written in script (as is Radio UserLand's aggregator, by the way), it's open to experimentation. Beyond organizing channels into categories (which AmphetaDesk currently doesn't do), there is a range of interesting possibilities to explore at the intersection between people and news flows. In my original review of Radio UserLand, I said that RSS is the protocol of a knowledge network, and that people are the routers. We don't yet know what kinds of software assistance people will need to best perform that routing function. So personal aggregators that are scriptable, and that can quickly evolve, are going to be key players in the ecosystem.

News Is Free

Mike Krus reminded me that the RSS aggregation at News Is Free, though centralized, is nonetheless personal. If you create an account there, you can use the website as a cloud-based personal aggregator. Any of the feeds it collects can be added to custom pages that you define. This arrangement delivers all the usual benefits of cloud-based software: zero footprint, anywhere/anytime access.

You can't subscribe directly to a channel that isn't yet in the service, though you can recommend one for inclusion. But in practice, this may not be a serious limitation. The Google principle tends to assure that sources of information interesting to a number of people are going to show up in News Is Free. (The New Channels meta-feed from News Is Free is a good one to watch, by the way.)

Like some other tools in this category, News Is Free supports the Blogger API, which means that you can update a weblog running on Blogger, Manila, Conversant, or another site that supports that interface.

FeedReader

This tidy little Windows app is free, reads all the RSS formats, and pops up an alert window when new items arrive. You manage individual channels and lists of channels (e.g., those from Moreover) using a tree control. Clean and simple.

Aggie

This .NET application is another basic GUI RSS reader. It will be of particular interest to C# programmers, for whom it illustrates the use of both GUI and XML aspects of the .NET framework.

NewzCrawler

This $25 product for Windows does an impressive job of synthesizing three flavors of news flows: RSS items, NNTP newsgroups, and news websites. The foldering system is very complete and powerful. The built-in set of channel lists, in OCS (Open Content Syndication) format, is nicely implemented and easy to extend.

If you're a masochist, you can have the program use Microsoft Agent to read your news aloud. Or, somewhat less invasively, you can have it do a CNN-style headline crawl across the bottom of your screen. These options might seem silly. But think about the excitement that surrounded PointCast when it first appeared. The RSS network is really a decentralized version of that idea. There's no reason personal publishing can't have bells and whistles too, when it wants to.

NewzCrawler can import and export OCS-formatted channel lists. And it can write as well as read news. In the NNTP realm, you can post to newsgroups. In RSS space, you can write items to an RSS channel stored on disk. (A Web destination is offered but not yet implemented.) Separately, there's XML-RPC support for the Blogger API.
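For a sense of what Blogger API support involves under the hood, here is a minimal Python sketch that builds (but does not send) an XML-RPC request for blogger.newPost. The parameter order (application key, blog ID, username, password, content, publish flag) is assumed from the Blogger API 1.0 as commonly documented. The helper name and the sample values are hypothetical, and nothing here reflects how NewzCrawler or any other tool in this roundup is actually implemented.

```python
import xmlrpc.client


def build_new_post_request(appkey, blog_id, username, password, content, publish=True):
    """Serialize a blogger.newPost call as an XML-RPC request body.

    Assumed parameter order (Blogger API 1.0):
    appkey, blogid, username, password, content, publish.
    """
    params = (appkey, blog_id, username, password, content, publish)
    return xmlrpc.client.dumps(params, methodname="blogger.newPost")


if __name__ == "__main__":
    # All values below are placeholders, not real credentials.
    body = build_new_post_request(
        "APPKEY", "42", "jon", "secret", "Hello from my aggregator"
    )
    print(body)
```

To actually post, the same arguments would go through xmlrpc.client.ServerProxy pointed at whatever XML-RPC endpoint your weblog host documents; the sketch stops short of that so it runs without a network connection.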

Peerkat

The trend toward decentralized blogging and personal RSS aggregation has not caught Meerkat's creator, Rael Dornfest, asleep at the wheel. He's cooked up a collection of Perl- and Python-based tools for use and experimentation. Peerkat, written in Python, is a collaborative tool that aims to turbocharge the cycle of reading, selecting, and republishing weblog items. It's really two programs: server.py and feeder.py. The server has administrative and user modes. In the administrative mode you view, select, and possibly annotate items drawn from a set of RSS feeds. In the user mode, any visitor to your Peerkat node sees the filtered view that is the result of your efforts.

Peerkat's feeder.py reads RSS using a version of the Orchard module that supports only RSS 1.0 feeds. It wouldn't, therefore, read my Radio UserLand feed, which is RSS .92. But DJ Adams, a Jabber guru who's done some brilliant experimentation involving Radio UserLand and Peerkat, showed me that an upgrade to a newer version of Orchard (0.3.2) extends support to RSS .9x feeds as well. Worked like a charm.

Like AmphetaDesk and Radio UserLand, Peerkat's application model is that of a script engine which implements a local web server and plays its user interface through that server. The exceptionally crisp and clean Peerkat UI is a reminder that a local web server, coupled with a browser, can produce really useful desktop software.
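The application model described above (a script that runs a local web server and lets the browser be the user interface) can be sketched in a few lines of Python using only the standard library. This is an illustration of the general idea, not Peerkat's or AmphetaDesk's code; the HEADLINES list, the port number, and all function and class names are made up for the example.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder data; a real aggregator would fill this from its RSS feeds.
HEADLINES = [("Example headline", "http://example.com/1")]


def render_page(headlines):
    """Render (title, link) pairs as a simple HTML page."""
    rows = "".join(
        '<li><a href="%s">%s</a></li>' % (html.escape(link, quote=True), html.escape(title))
        for title, link in headlines
    )
    return "<html><body><h1>My channels</h1><ul>%s</ul></body></html>" % rows


class AggregatorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every GET returns the current headline page.
        body = render_page(HEADLINES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port):
    # Bind to the loopback address only, so the UI stays on the local machine.
    return HTTPServer(("127.0.0.1", port), AggregatorHandler)


if __name__ == "__main__":
    server = make_server(8888)
    print("Open http://127.0.0.1:8888/ in a browser")
    server.serve_forever()
```

Because the interface is just HTML generated by a script, changing how channels are displayed means editing render_page and reloading the browser, which is the kind of quick experimentation these tools invite.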

Closing the loop

The growth of RSS-enabled weblogs makes it increasingly likely that people will find and begin using these personal aggregators. Things have moved a little more slowly than some of us expected. It was only last week, for example, that Blogger Pro announced explicit support for RSS. But things are moving along nicely now. It's annoying to have to worry about the variants of RSS -- the .9x series versus 1.0 -- yet for the most part, tools can hide this mess from users. It's just XML, after all, mechanically transformable from one flavor to another.
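To make the "it's just XML" point concrete, here is a minimal Python sketch that pulls items out of either an RSS .9x feed or an RSS 1.0 feed and returns them in one common shape. It assumes only the structural difference between the two families (RSS 1.0 puts namespaced item elements beside the channel, the .9x formats nest them inside it). The function names and the two tiny sample feeds are invented for illustration, and a real tool would need to handle many more elements and malformed input.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feeds, trimmed to the bare minimum.
RSS_092_SAMPLE = """
<rss version="0.92">
  <channel>
    <title>Example</title>
    <item>
      <title>Hello</title>
      <link>http://example.com/1</link>
      <description>First</description>
    </item>
  </channel>
</rss>
"""

RSS_10_SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/">
    <title>Example</title>
    <link>http://example.com/</link>
    <description>An example channel</description>
    <items><rdf:Seq><rdf:li rdf:resource="http://example.com/1"/></rdf:Seq></items>
  </channel>
  <item rdf:about="http://example.com/1">
    <title>Hello</title>
    <link>http://example.com/1</link>
    <description>First</description>
  </item>
</rdf:RDF>
"""


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def extract_items(xml_text):
    """Return a list of {'title', 'link', 'description'} dicts from a feed."""
    root = ET.fromstring(xml_text)
    items = []
    for elem in root.iter():
        # Match <item> wherever it sits and whatever namespace it is in.
        if local_name(elem.tag) != "item":
            continue
        item = {"title": "", "link": "", "description": ""}
        for child in elem:
            name = local_name(child.tag)
            if name in item:
                item[name] = (child.text or "").strip()
        items.append(item)
    return items


if __name__ == "__main__":
    print(extract_items(RSS_092_SAMPLE))
    print(extract_items(RSS_10_SAMPLE))
```

Both sample feeds yield the same list of items, which is all an aggregator's display code needs to see.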

We techies tend to think that issues of extensibility and metadata representation are the roadblocks that prevent RSS-enabled communication from reaching its tipping point. I've been as guilty of this as anyone. But recently, in a roomful of wireless bloggers at the O'Reilly Emerging Technologies Conference, it struck me that there's a more basic roadblock. Sam Ruby, in a posting to his weblog, noted that "even experienced bloggers seem to be just discovering" the concept of RSS. The publish/subscribe model is, to be sure, one that even software and networking pros can struggle to wrap their heads around. It's a stretch to apply these same principles of loosely-coupled messaging to the realm of human communication. Lots of people get the idea of publishing blogs to the web. Not so many, as yet, see that publishing an RSS channel, and subscribing to channels, closes the communication loop and creates a new interactive medium. "Aggregation will become part of our everyday information lives in the future," writes Jenny Levine at The Shifted Librarian. With the new breed of personal news aggregators, that future is now.


Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He is the author of Practical Internet Groupware, from O'Reilly and Associates. Jon now works as an independent Web/Internet consultant. His recent BYTE.com columns are archived at http://www.byte.com/tangled/.

This work is licensed under a Creative Commons License.