/usr/lib/info -- hacker/librarian haven
The Semantic Web and Libraries

By art, Section Opinion
Posted on Tue May 21st, 2002 at 12:12:12 PM EST
This is a column I am working on for InsideOLITA and would welcome any and all feedback.

 

Few topics cause more concern and confusion in the web community than the Semantic Web. The Semantic Web has been described as a vision of a web that goes beyond billions of linked web documents lying in wait to be indexed by global search engines: a web where the semantics, or meaning, behind the content can be put to practical use. To some, this hearkens back to the failed promises of Artificial Intelligence and the non-delivery of systems that were supposed to work out the family's budget and intelligently order groceries for the week. The World Wide Web Consortium's (W3C) extensive work on the Semantic Web has also been characterized as taking place in a semantic "cloud" that has obscured and detracted from much-needed web standardization efforts.

If you look beyond the hype, the Semantic Web can, in some ways, be seen as a natural progression that comes from building more capabilities into every new web technology. A simple sequence describing the evolution of the Semantic Web might begin with the chaotic stage of early HTML documents, where a minimal set of tags described all manner of content. Along the way, it was realized that it would be helpful to have concepts like "author" described in more meaningful tags than "h1" or "bold". XML emerged as the solution to ensure that the syntax and content of documents were consistent and to allow applications better ways of working with groups of documents that are authored for a common purpose, such as finding aids and full text materials marked up in TEI. XML uses constructs called DTDs and Schemas to tightly control the structure of documents and was met with great enthusiasm by web developers who could now share information using tags with labels like "subject" that better reflect the content itself.

XML is arguably a key building block in the Semantic Web, but the first real manifestation of the W3C's semantic work was the publication of the Resource Description Framework (RDF) specification for encoding and sharing metadata. Metadata is sometimes called "data about data", and creating it has been one of the main activities of libraries for several centuries. The premise of RDF is that metadata can be modeled as a set of statements, each of which indicates a piece of information about something else. In RDF parlance, these are called "triples". For example, the statement "Tim Severin is the creator of the Brendan Voyage" consists of three parts (Tim Severin, Creator, Brendan Voyage) and can be written with RDF in XML as:

<rdf:RDF xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#" xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="http://address_for_Brendan_Voyage">
    <dc:Creator>Tim Severin</dc:Creator>
  </rdf:Description>
</rdf:RDF>

This type of statement is called an assertion. RDF specifies that every part of the assertion can be assigned a URI (Uniform Resource Identifier), which is much like a URL but does not have to map to a real web address. A URI can represent concepts ("Creator"), living entities ("Tim Severin"), and anything else in the known and imagined universe, from animals to laundry lists. The "dc" in the example stands for Dublin Core and is associated with a special URI called a namespace ("http://purl.org/dc/elements/1.0/") that, in turn, is associated with a set of metadata elements. On its own, this is somewhat useful, but one of the most compelling aspects of RDF is combining elements from different metadata sets. If I had a set of elements specifying a rating system, for example, I could add a namespace (xmlns) reference that would allow me to insert my rating as shown:

<rdf:RDF xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#" xmlns:dc="http://purl.org/dc/elements/1.0/" xmlns:ar="http://www.for.me/ar/elements/">
  <rdf:Description rdf:about="http://address_for_Brendan_Voyage">
    <ar:Rating>Excellent</ar:Rating>
    <dc:Creator>Tim Severin</dc:Creator>
  </rdf:Description>
</rdf:RDF>
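
To make the "mixing and matching" concrete, here is a minimal Python sketch of how a program might read the combined description back as triples. It uses only the standard library's xml.etree.ElementTree rather than a real RDF parser, and it handles only the simple pattern used here (literal values inside an rdf:Description). The example document is repeated with the rdf:Description element closed so that an XML parser accepts it; the function name extract_triples is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example document above.
RDF_NS = "http://www.w3.org/TR/WD-rdf-syntax#"

DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/"
         xmlns:ar="http://www.for.me/ar/elements/">
  <rdf:Description rdf:about="http://address_for_Brendan_Voyage">
    <ar:Rating>Excellent</ar:Rating>
    <dc:Creator>Tim Severin</dc:Creator>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for literal-valued properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree stores tags as "{namespace}local"; joining the two
            # parts gives the full URI that identifies the property.
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, (prop.text or "").strip()))
    return triples


triples = extract_triples(DOC)
for triple in triples:
    print(triple)
```

Each predicate comes back as a full URI, so the rating element and the Dublin Core element stay distinguishable even though they describe the same resource.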

RDF detractors are quick to point out that this type of "mixing and matching" for metadata has been slow to ignite the kind of interest that has followed HTML and XML. While there is no doubt that RDF has not captured as much of the spotlight, it is worth noting that:

  • RDF is concerned with metadata, which isn't always appreciated if you don't have occasion to ponder information retrieval or if you think that keyword indexing can solve most information needs.
  • The syntax is somewhat convoluted, even compared to HTML and XML, and may be better represented by labeled graphs or other techniques common in Computer Science but often confusing to the novice. Tim Berners-Lee, the inventor of the World Wide Web, has proposed a much simpler syntax for RDF called Notation 3 which looks something like:
:tim :creator "The Brendan Voyage" .
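
The labeled-graph view mentioned above can be sketched in a few lines of Python. This is only an illustration, not a standard RDF data structure: the identifiers (brendan_voyage, creator, rating) are made up for the example, and each statement becomes a labeled edge from a subject node to a value.

```python
from collections import defaultdict

# Statements as (subject, predicate, object); the identifiers are illustrative.
triples = [
    ("brendan_voyage", "creator", "Tim Severin"),
    ("brendan_voyage", "rating", "Excellent"),
]


def build_graph(statements):
    """Map each subject node to its outgoing (edge label, target node) pairs."""
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(triples)
for node, edges in graph.items():
    for label, target in edges:
        print(f"{node} --{label}--> {target}")
```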

In addition to the need to appreciate metadata and the syntax issues, another difficulty with the Semantic Web is that RDF is only the first step along the way. Going beyond assertions to supporting any high level of inference, where a computer can automatically pull together concepts, really requires some understanding of RDF Schemas and Ontology Languages like DAML+OIL. RDF Schema allows concepts to be specified and related, for example, specifying that a "writer" is a type of "creator". Ontologies are also formal representations of entities and concepts, and languages like DAML+OIL differ from RDF Schema in that they provide even more options for defining relationships. For example, using Notation 3, we could have this relationship:

dc:Creator daml:equivalentTo red:PreparerName .

This would allow a program to "infer" that a real estate agreement identified with the "PreparerName" element from the Real Estate Data (red) Consortium schema is equivalent to "Creator" from Dublin Core using the "equivalentTo" property from DAML+OIL. This means that in addition to titles of monographs that the author I am researching has written, I could also receive documents that represent the author's activities as a lawyer from a semantically-aware library system.
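
As a rough sketch of what such an inference could look like in practice, the Python below expands a query on one property to every property declared equivalent to it. It is not DAML+OIL processing, only a simplified stand-in: the records, the urn:example identifiers, and the function names are hypothetical, and properties are written as prefixed names rather than full URIs.

```python
# Hypothetical records from two collections, each using its own vocabulary.
statements = [
    ("urn:example:monograph-1", "dc:Creator", "A. Author"),
    ("urn:example:agreement-7", "red:PreparerName", "A. Author"),
    ("urn:example:monograph-2", "dc:Creator", "Someone Else"),
]

# The single equivalence asserted in the text.
equivalences = [("dc:Creator", "red:PreparerName")]


def equivalent_properties(prop, pairs):
    """Return prop plus every property linked to it by an equivalence."""
    found = {prop}
    changed = True
    while changed:
        changed = False
        for left, right in pairs:
            # Exactly one side known so far: pull in the other side.
            if (left in found) != (right in found):
                found.update((left, right))
                changed = True
    return found


def find_resources(prop, value, data, pairs):
    """Find subjects whose prop, or an equivalent property, has the value."""
    props = equivalent_properties(prop, pairs)
    return [s for s, p, o in data if p in props and o == value]


results = find_resources("dc:Creator", "A. Author", statements, equivalences)
print(results)
```

Without the equivalence, the same query would return only the monograph; with it, the real estate agreement is returned as well.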

RDF Schemas and ontology work are crucial to the success of the Semantic Web, and have tended to emerge in subject areas that lend themselves well to defining relationships between concepts, for example, dictionaries and vocabularies, thesauri, and many branches of science. For libraries, the value of the Semantic Web may have less to do with changes in bibliographic databases than with integrating resources that don't often show up in traditional cataloguing. Scientific datasets, for example, often don't have access points that translate well to bibliographic descriptions and bring in a multitude of concepts that may be critical for the resource community the datasets are produced for. DNA sequences, solar wind movements, and other types of scientific data require specialized query languages. RDF holds the promise of wiring in the metadata and schema/ontologies that address the complexity of the semantics of the data rather than trying to cram this level of description into Dublin Core or MARC.

Another intriguing use of Semantic Web activity is to tie together library functions with external systems. For example, a library system could expand on the work of the RDF Calendar initiative to support queries like "find me all the works on XML that are due in the library before I go on vacation". The Semantic Web could provide the plumbing to allow a system to talk to an individual's RDF-enabled calendar system to determine the timeframe identified by the use of the term "vacation". RDF and Semantic Web-based query languages offer a glimpse of how the semantics/vocabularies of different research communities may be combined in supporting information retrieval. It isn't likely that the results will come close to the early promises of Artificial Intelligence, but libraries are in a somewhat unique position: they both appreciate the importance of sharing metadata and understand the benefits of interoperable vocabularies and semantics better than most organizations. The Semantic Web may turn out to be far less audacious in practice than in concept, but it could be an important tool for trying to provide services for the growing stream of diverse web-based content and services that flows by our libraries.
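
The vacation query above could be sketched as a join between two sets of statements, one from the library and one from a personal calendar. Everything in this Python example is hypothetical: the loan and event identifiers, the property names, the titles, and the dates are invented for illustration and do not come from the RDF Calendar work itself.

```python
from datetime import date

# Hypothetical statements from a library system: (subject, property, value).
library = [
    ("loan:1", "subject", "XML"),
    ("loan:1", "title", "Example title A"),
    ("loan:1", "due", date(2002, 6, 10)),
    ("loan:2", "subject", "XML"),
    ("loan:2", "title", "Example title B"),
    ("loan:2", "due", date(2002, 6, 20)),
    ("loan:3", "subject", "RDF"),
    ("loan:3", "title", "Example title C"),
    ("loan:3", "due", date(2002, 6, 1)),
]

# Hypothetical statements from an individual's calendar.
calendar = [
    ("event:9", "summary", "vacation"),
    ("event:9", "start", date(2002, 6, 15)),
]


def value_of(data, subject, prop):
    """Return the first value of prop for subject, or None if absent."""
    for s, p, o in data:
        if s == subject and p == prop:
            return o
    return None


def subjects_with(data, prop, value):
    """Return every subject that has the given property value."""
    return [s for s, p, o in data if p == prop and o == value]


def due_before_event(library_data, calendar_data, topic, event_name):
    """Titles on a topic that are due before the named calendar event starts."""
    events = subjects_with(calendar_data, "summary", event_name)
    if not events:
        return []
    start = value_of(calendar_data, events[0], "start")
    return [
        value_of(library_data, loan, "title")
        for loan in subjects_with(library_data, "subject", topic)
        if value_of(library_data, loan, "due") < start
    ]


titles = due_before_event(library, calendar, "XML", "vacation")
print(titles)
```

The library data never mentions vacations and the calendar never mentions books; the query works only because both sides expose their statements in the same triple form.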

The Semantic Web and Libraries | 9 comments (9 topical, 0 editorial, 0 pending)
Nice summary... (5.00 / 1) (#4)
by dchud on Wed May 22nd, 2002 at 10:54:23 AM EST
(User Info) http://curtis.med.yale.edu/dchud

I agree! I'm particularly pleased with your limiting of expectation, narrowing the likely scope of benefit to things we currently consider ancillary. And the slow takeup due to mix-n-match and syntax issues.

Some questions I'd love to see answers for (though maybe they're beyond the scope of your piece :):


  • The audaciousness of the project seems to have been fueled by the "I know what you mean" attitude that pervades the hype, esp. in the Scientific American cover piece. What makes them think they can do a better job of mimicking "intelligence" now than the mostly failed attempts of yesteryear? And why would we really want or need that in our society?


  • The ontologic mix-n-match prospect seems to wantonly ignore the fact that different ontologies change at different rates along different dimensions. You could create a nice blending of three or four worldviews today using DAML etc. but who's to say that blending would hold any water a week, a year, a decade later? Just look at changes to LCSH over time, the successes, the failures, the issues still unresolved, and consider how the scope of the problem necessarily increases exponentially when you cross those issues with comparable ones from other ontologies. What a mess!




Great write-up (4.00 / 1) (#1)
by jfrumkin on Wed May 22nd, 2002 at 07:01:02 AM EST
(User Info) http://digital.library.arizona.edu/~jfrumkin

Art -

This is an excellent write-up. The only suggestion I would have is to detail out your example in the last paragraph - provide a bit more concrete detail (using RDF examples) on the RDF calendar example. -- JF



Great article (4.00 / 1) (#6)
by ksclarke (ksclarke @ stanford no spam dot edu) on Wed May 22nd, 2002 at 04:08:21 PM EST
(User Info) http://www.stanford.edu/~ksclarke

Even the skeptic in me feels a bit convinced. A small sentence level suggestion: change "In addition to the need to appreciate metadata and the syntax issues," so that there are not two "to"s in a row there and I think it might flow a little better.

Great article though... and Dan, your comments are equally interesting... the changing over time of these mappings/relationships is something that deserves our attention. I wonder about our ability to automatically compensate for this... we probably will always need catalogers (or people of some sort) to do the bridge work.

Anyway, good stuff!

Kevin

--
Out, out brief candle!
... it is a tale told by an idiot, full of sound and fury,
Signifying nothing.


kdol (none / 0) (#9)
by Anonymous Hero on Sun Mar 20th, 2005 at 09:44:22 PM EST

I am very interested in your subject, but I'm a beginner, so I hope we can correspond often. Best regards!
