August 2, 2005

RSS: The Future of the Metadata Repository

Really Simple Syndication (RSS) is a lightweight XML format designed for sharing headlines and other Web content. Think of it as a distributable "What's New" for your site. Originated by UserLand in 1997 and subsequently used by Netscape to fill channels for Netcenter, RSS has evolved into a popular means of sharing content between sites (including the BBC, CNET, CNN, Disney, Forbes, Motley Fool, Wired, Red Herring, Salon, Slashdot, ZDNet, and more). RSS solves myriad problems webmasters commonly face, such as increasing traffic and gathering and distributing news. RSS can also be the basis for additional content distribution services (Web Reference, 2005).

The typical use of an RSS feed is within the WebLog (blog) environment. Once an author updates their blog with an entry, the system updates the RSS file and sends a 'ping' message to the 'Aggregation Ping Server' indicating that the site has been updated. Several organizations, such as Feedster and Technorati, monitor the feeds and publish the content in a centralized location. Alternatively, end users can purchase or download a news aggregator application (reader), which lets them subscribe to any blog that supports an RSS or RDF/XML feed. The application can check the blog for updates once an hour or once a day, depending on how the reader is configured. This eliminates the need to visit search engines or news collection sites in order to read the content of the blog.
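
The reader-side check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feed content, the example.com URLs, and the function name new_items are hypothetical, and the feed is an inline string rather than something fetched on a schedule. It shows how a reader might parse an RSS 2.0 file and report only the entries it has not seen before.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, inlined so the sketch runs without a network.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed used for illustration</description>
    <item>
      <title>First entry</title>
      <link>http://example.com/1</link>
      <guid>http://example.com/1</guid>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.com/2</link>
      <guid>http://example.com/2</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return (title, link) pairs for items whose guid is not in seen_guids.

    seen_guids is updated in place, so a second call with the same feed
    text returns an empty list, much as a reader would on an unchanged blog.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when a feed omits the optional guid element.
        guid = item.findtext("guid") or item.findtext("link")
        if guid and guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


if __name__ == "__main__":
    seen = set()
    for title, link in new_items(SAMPLE_FEED, seen):
        print(title, link)
```

A real reader would fetch the feed text over HTTP at its configured interval and persist the set of seen identifiers between runs; both are left out here to keep the sketch self-contained.
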
The implications for the metadata environment are enormous. A closer look at the RSS standard reveals that it is fairly simple and consistent regardless of context. This suggests that a simple meta-model such as the Dublin Core could easily be exchanged using RSS technology, and that news readers could replace the majority of the functionality of the centralized metadata repository. This section argues that the information required to publish new content is very similar to the information required to publish technology asset metadata, or will be in the future. Advancements in RSS technology will allow development, modeling, and other system development lifecycle products to publish information about their assets, which will eliminate the need to extract information by hand or to force integration into a single methodology. RSS already has search functionality and personal taxonomies in which end users catalog their own content, which may prove much more valuable than traditional IT-based taxonomies and ontologies. Assuming that vendor organizations can convert their product lines to XML-based standards, a whole new world of semantic web-enabled applications will open up. W3C (2001) defines the semantic web as:

The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. The mix of content on the web has been shifting from exclusively human-oriented content to more and more data content. The Semantic Web brings to the web the idea of having data defined and linked in a way that it can be used for more effective discovery, automation, integration, and reuse across various applications. For the web to reach its full potential, it must evolve into a Semantic Web, providing a universally accessible platform that allows data to be shared and processed by automated tools as well as by people.
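
To make the earlier suggestion about exchanging a Dublin Core meta-model over RSS more concrete, here is a minimal Python sketch of what publishing asset metadata as a feed might look like. Everything specific in it is an assumption for illustration: the asset, the team name, the example.com URLs, and the helper names asset_item and asset_feed are hypothetical. Only the Dublin Core element namespace URI and the basic RSS 2.0 channel/item layout are taken from the published standards.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace, registered so output uses the dc: prefix.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def asset_item(title, creator, date, subject, identifier):
    """Build one RSS <item> describing a (hypothetical) technology asset."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = identifier
    # Carry the asset metadata as Dublin Core elements inside the item.
    for name, value in (
        ("creator", creator),
        ("date", date),
        ("subject", subject),
        ("identifier", identifier),
    ):
        ET.SubElement(item, f"{{{DC_NS}}}{name}").text = value
    return item


def asset_feed(channel_title, channel_link, items):
    """Wrap the items in an RSS 2.0 channel and return the XML as a string."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Asset metadata updates"
    channel.extend(items)
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    item = asset_item(
        title="Customer logical data model v2",
        creator="Data Modeling Team",
        date="2005-08-02",
        subject="data model",
        identifier="http://example.com/assets/customer-model",
    )
    print(asset_feed("Asset Repository", "http://example.com/assets", [item]))
```

Any ordinary news reader could subscribe to such a feed, while a metadata-aware tool could read the dc: elements directly. Whether a given modeling or development product can emit this kind of feed is exactly the vendor question raised above.
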

By definition, the semantic web will integrate technologies such as XML, RDF, RSS, namespaces, and ontologies. These technologies will come together to radically change the way in which we collect information. McComb (2004) defines the killer application of the semantic web as a radical improvement in search and agents. An agent is a program to which an individual delegates some authority to act on the individual’s behalf and then releases to act autonomously. Clearly, RSS technology moves us in that direction. More importantly, inside the corporation the semantic environment can be controlled and dictated, which might be impossible on the World Wide Web (WWW). The semantic web and its accompanying technologies will produce an environment in which a universal repository is possible and should be on the market.

Posted by Todd at August 2, 2005 5:55 PM

Comments

Be sure to take a look at Atom, inspired by RSS, soon to be an IETF standard and much better specified than RSS 2.0.

http://www.atomenabled.org/developers/syndication/

Also the Atom protocol for posting entries is in the standards process and is much stronger than the protocol associated with RSS.

Posted by: Rick Thomas at August 2, 2005 7:33 PM

A more technical Atom article http://www-128.ibm.com/developerworks/xml/library/x-atom10.html

Posted by: Rick Thomas at August 3, 2005 4:05 PM

I recently saw a presentation on EPA's Environmental Information Management System http://www.epa.gov/eims/, which is using RSS to enable federated search and to share its metadata with other repositories. The presentation is here: http://www.aaaw2005.com/FileDisplay.cfm?FileID=779

Posted by: James Melzer at August 5, 2005 4:23 AM

This is perfect, thanks for the slides.

Posted by: RTodd at August 5, 2005 11:55 AM

Copyright © 2002 - 2005 - R. Todd Stephens, Ph.D. All rights reserved.