Each blog in the Goog is filed under one general category that reflects the dominant subject of its posts. For instance, “Daemonite” covers MX technologies including Flash, Remoting, Comms Server and so on, but we post predominantly about CFMX, hence it is a CFMX blog.
Given the recent traffic in off-topic posts from various blogs concerning the “Iraqi Conflict”, we have received lots of complaints about the aggregated content. The overwhelming majority of people want to review MX technology related posts on the Goog and are happy to get their political content at a time of their choosing. This made me redouble my efforts to categorise inbound feeds at the level of the blog item, and give users that option.
Blog item categorisation would have all kinds of interesting advantages, not least the ability to get rid of off-topic posts from the aggregation. It’s all about improving the signal-to-noise ratio and saving people precious time. As the content volume on any aggregation increases, there is a constant trade-off between the breadth of the coverage and the time taken to find anything you’re interested in. The Goog currently serves a blended feed of 80+ blogs and climbing, so digital librarianship is something I’m concerned about.
Examining the problem reveals several underlying information architecture issues that need to be addressed first:
- the development of a lexicon of Macromedia technology topics
- encouraging blog authors to include classification in their blog feeds
- mapping blog author classifications to the established Goog lexicon
MX Lexicon
Having discussed the matter with members of the Macromedia WTG, it is clear that Macromedia themselves have no established lexicon of terms. The “mothership” classifies things with a great deal of human editorial influence—something that clearly won’t work with the number of posts I aggregate daily.
Blog Authors
I’ve managed through a series of occult sacrifices and constant needling to encourage many Goog-aggregated blog authors to move to more modern RSS feeds with category elements. But this is by no means universal. Oh, and that doesn’t necessarily mean blog authors will actually apply categories to their posts :)
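For the curious, here is a minimal sketch of what reading those category elements could look like, assuming a plain RSS 2.0 feed. The sample feed, its titles and the function name `item_categories` are invented for illustration; this is not the Goog's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 fragment; titles and category names are made up.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Caching queries in CFMX</title>
      <category>ColdFusion</category>
    </item>
    <item>
      <title>Weekend thoughts</title>
    </item>
  </channel>
</rss>"""


def item_categories(feed_xml):
    """Return a list of (title, [categories]) pairs, one per feed item."""
    root = ET.fromstring(feed_xml)
    results = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # An item may carry zero, one or many <category> elements.
        cats = [c.text.strip() for c in item.findall("category") if c.text]
        results.append((title, cats))
    return results


if __name__ == "__main__":
    for title, cats in item_categories(SAMPLE_FEED):
        print(title, cats)
```

Note the second item: the feed format supports categories, but the author simply did not apply any, so the list comes back empty.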
Category Maps
Once you have a feed with categories, there is little hope that the blog authors have used the categories you outline in a published lexicon of terms. Nor is it reasonable to expect blog authors to follow such a regime. So the task of building and maintaining category maps for 80+ blogs seems a bit too daunting at this time!
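To give a feel for the mapping problem, here is a rough sketch assuming a hand-maintained lookup table per blog. The lexicon terms, the blog id `example-blog` and the function `map_categories` are all hypothetical.

```python
# Hypothetical shared lexicon; no real Goog lexicon exists yet.
LEXICON = {"CFMX", "Flash", "Off-topic"}

# One hand-maintained map per blog: author's category -> lexicon term.
# Keys are lower-cased so "ColdFusion" and "coldfusion" match.
CATEGORY_MAPS = {
    "example-blog": {
        "coldfusion": "CFMX",
        "actionscript": "Flash",
        "politics": "Off-topic",
    },
}


def map_categories(blog_id, author_categories):
    """Translate one item's author categories into lexicon terms.

    Returns (sorted lexicon terms, categories that had no mapping).
    """
    blog_map = CATEGORY_MAPS.get(blog_id, {})
    mapped = set()
    unmapped = []
    for cat in author_categories:
        term = blog_map.get(cat.strip().lower())
        if term in LEXICON:
            mapped.add(term)
        else:
            # Unknown blog, or a category nobody has mapped yet.
            unmapped.append(cat)
    return sorted(mapped), unmapped


if __name__ == "__main__":
    print(map_categories("example-blog", ["ColdFusion", "Music"]))
```

Every new category an author invents ends up in the unmapped list until someone adds it to the table by hand. Multiply that by 80+ blogs and the maintenance burden becomes obvious.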
I have a few more experimental ideas I’m kicking around. But for the moment, blog item level classification is being put on the back-burner.
Posted by modius at 07:25 PM


Another aggregator, java.blogs (http://www.javablogs.com/), had mentioned the possibility of creating a Bayesian filter to filter out off-topic items.
Posted by: Pete Freitag on April 20, 2003 06:58 AM
A shared conceptualisation of categories is a tough nut to crack. There are extensive links to others who've put thinking time into this at http://IAwiki.net/SharedCategories
Posted by: Eric Scheid on April 29, 2003 03:15 PM