PC WEEK: Developing a Card Catalog For the Expansive Web


[Archive copy (text only) mirrored from the URL: http://www.zdnet.com/pcweek/opinion/0825/25isigh.html]

August 25, 1997
Intersights
Developing a card catalog for the expansive Web
By Eamonn Sullivan

Making the Web more like a library and less like most bookstores has been a goal of Web researchers for years, but we'll get closer to that goal in the next few months with the development of a tool called Resource Description Framework, or RDF.

The difference between a good library and a bookstore is the card catalog. When looking for a book in a bookstore, you have to make do with the usually simple organization imposed upon the information by the owners. Books are organized in broad categories, such as fiction and nonfiction, history and philosophy, or science and science fiction. If you know exactly what you're looking for, you can ask the salesperson. Otherwise, you have to browse, using the order imposed by the seller as a guide.

The only improvement on this system offered by the Web is the full-text search, which is of limited value. With most search engines, a search for dinosaurs gives pages on Barney and the Smithsonian's dinosaur pages equal weight.

In a library, the card catalog (whether electronic or not) gives you a tool for more sophisticated searches, using author, title and subject to narrow down your search. The card catalog is a form of metadata: information about information. The ability to add metadata to Web content has been available for a long time, using things such as meta tags. But the approaches, with the possible exception of the PICS rating system, have been somewhat haphazard.
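The difference between the two kinds of search can be sketched in a few lines of Python. This is only an illustration: the page records, field names and function names below are made up for the example and do not come from any real search engine or metadata standard.

```python
# Hypothetical page records; titles, authors, subjects and text are invented.
PAGES = [
    {
        "title": "Barney Fan Page",
        "author": "A. Fan",
        "subject": "children's television",
        "text": "Barney is a purple dinosaur who sings songs.",
    },
    {
        "title": "Dinosaur Hall",
        "author": "Museum Staff",
        "subject": "paleontology",
        "text": "Fossil evidence shows how each dinosaur lived.",
    },
]


def full_text_search(pages, word):
    """Return titles of pages whose body text contains the word."""
    word = word.lower()
    return [p["title"] for p in pages if word in p["text"].lower()]


def catalog_search(pages, **fields):
    """Return titles of pages whose metadata matches every given field exactly."""
    return [
        p["title"]
        for p in pages
        if all(p.get(name) == value for name, value in fields.items())
    ]


# Full-text search cannot tell the two pages apart.
full_text_hits = full_text_search(PAGES, "dinosaur")
# A card-catalog style search narrows by subject.
catalog_hits = catalog_search(PAGES, subject="paleontology")
print(full_text_hits)
print(catalog_hits)
```

The full-text search returns both pages because both mention the word, while the subject search returns only the museum page. The catch is that someone has to supply the subject field in the first place.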

The emergence of XML in a more or less solid form earlier this year has provided a more comprehensive framework for metadata, prompting several organizations to propose solutions based on XML. The main proposals have been XML-Data from Microsoft (which is available at www.microsoft.com/standards/xml/xmldata.htm) and MCF (Meta Content Format) from Netscape (available at www.w3.org/TR/NOTE-MCF-XML/).

Both proposals provide for a sophisticated method to describe the structure of information, such as properties about authorship and relationships between objects.

This week, a working group under the auspices of the World Wide Web Consortium (W3C) will meet in Redmond, Wash., to begin hammering out a specification that will take the best parts of XML-Data, MCF and PICS. The resulting RDF specification, if used widely, will enable more efficient searches and exchanges of information between organizations.

For example, detailed descriptions of information structures and content will enable Web browsers to provide more useful site maps for navigating through large sites and allow for more customization of the way information is displayed.

The key, of course, is widespread use. RDF will likely be open-ended; each publisher could use RDF to create its own terms to describe information. Until there is something like the Dewey Decimal System for the Web, looking for information across sites will likely remain a somewhat manual process.
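To see why open-ended terms keep cross-site searches manual, consider this rough Python sketch. The publisher names, term names and mapping table are all hypothetical, and nothing here reflects actual RDF syntax, which has yet to be specified.

```python
# Hypothetical hand-built table translating each publisher's own terms
# into one shared vocabulary (the "Dewey Decimal System" role).
SHARED_TERMS = {
    "publisher_a": {"writer": "author", "topic": "subject"},
    "publisher_b": {"byline": "author", "category": "subject"},
}


def normalize(publisher, record):
    """Rename a record's fields to the shared terms; unknown fields pass through."""
    mapping = SHARED_TERMS[publisher]
    return {mapping.get(name, name): value for name, value in record.items()}


# The same description, expressed in two publishers' private terms.
record_a = {"writer": "J. Doe", "topic": "metadata"}
record_b = {"byline": "J. Doe", "category": "metadata"}

normalized_a = normalize("publisher_a", record_a)
normalized_b = normalize("publisher_b", record_b)
print(normalized_a)
print(normalized_a == normalized_b)
```

Without the mapping table, a search on "author" would miss both records. Someone has to build and maintain that table for every publisher, which is the manual step a common vocabulary would remove.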

The bookstore and library metaphor also fits the Web in another way: The content of a bookstore is more fluid than that of a library, making a card catalog a lot more difficult to maintain. Maintaining metadata on a large, rapidly changing site would be an arduous task.

RDF is just a first step, but an important one. After that step, it will become necessary to agree upon more specific metadata formats and terms, something like Microsoft's Channel Definition Format but for metadata in general.

With an agreed-upon metadata format, your Web browser may be able to do more for you automatically, making the Web a much more useful place.
Do you use metadata now? Tell me about it at esullivan@zd.com.

Copyright 1997 Ziff-Davis Inc. All Rights Reserved. Reproduction in whole or in part in any form or medium without express written permission of Ziff-Davis Inc. is prohibited. PC Week and PC Week Online and the respective logos are trademarks of Ziff-Davis Inc.
