The Semantic Web: 1-2-3
This resource is also known as Stupid Berry Pickers Make Idiot
Jam and that fact should add suitable weight to the following
declaration: I'm new to the Semantic Web. I cobbled this fair piece
together in an attempt to collect my thoughts, answered questions,
path-of-learning, and requisite bookmarks so that other XML hackers may
follow in my footsteps. All inaccuracies are purely my fault, so be
sure to correct me.
This document is not intended to teach you RDF via my own words, but
rather to hand-hold you through the "good" parts of the same journey I
took. If it looks like a big link-list with menial comments from the
peanut gallery, then you're not far off the mark of my intent. This is
by no means definitive, nor was that the goal.
Table Of Contents
- An Introduction
- Recommended Reading
- Tips, Snippets, and Answers
- Where To Go From Here
- Sites I Almost Didn't Link To
- Credits
- Version History
An Introduction
I'm an XML hacker - with a few lines of Perl, I could get any piece
of information I wanted from any XML file I had. It got to a point where
I started writing my own XML documents willy-nilly, often for things
that didn't really deserve XML in the first place. XML was nifty, new,
and whoodoggery, "easy" to parse and read - I was suitably blinded by the
light of evolutionary tech.
My first encounter with RDF, the life force behind the Semantic Web,
was a year or so ago. I was immediately disgusted. This was the
greatness of "WWW, Version 2.0"? It was ugly, verbose, and "hard" to
read and write. I waved my hands around like a crotchety old man and
moved on to greener pastures - this was one upgrade I was willing to
skip, regardless of the immense powers it'd give me.
Realize that the above is the first of a "before and after" opinion
on something called "RDF/XML" (see below). Whilst I'm still not a fan of
RDF/XML, I now realize why it needs to be how it is, and it no longer
bothers me. The above was caused by Uncertainty and Denial:
- Uncertainty because the "XML" part of RDF/XML
told me that I could expect the standard XML-based rules to apply
(specifically, where I could create any namespace with my own elements
and attributes). Then, after learning more about RDF and realizing that
the XML had to conform to a certain "triples" logic - even in my own
namespaces - it made me worried that I missed out on the grander
picture. In the limited amount of time I had to explore (way back
when) it seemed too much effort to learn a second layer of rules on
top of the standard XML.
- Denial because of the verbosity of RDF/XML.
Without trying to understand the format, I felt like saying "blech" -
it seemed like an awful lot of XML for just a single triple statement.
At the time, rudimentary searching told me that RDF was similar to XML
in that multiple different blocks of RDF/XML could mean the same
triple - which seemed to contradict one of the advertised benefits of
RDF, that "the same thing that can be stated in XML tons of different
ways could only be stated in RDF/XML one way".
Recently, I've begun investigating RDF once again, fueled by a
lust for FOAF, or "friend of a friend" (see below). As the initial
repulsion once again surfaced, I pushed ever onward, intent on figuring
out the "back end" to my beloved FOAF. With more reading came more
comprehension, and these crucial concepts appeared, necessary for
my XML mind set to become undone:
- My initial repulsion with RDF was due to RDF/XML, an
implementation of RDF transferred through XML, usable by XML parsers,
but intended for those that understand the basics of RDF. RDF/XML is
the default implementation of RDF, and is meant to integrate with the
existing Net as smoothly as possible.
- Just as RDF/XML is ugly as sin, there are much prettier
implementations available (N3 and N-Triples) that make working with
RDF easier. Various utilities are also available to convert the joyous
N3 into RDF/XML, likewise with N-Triples. Soft introductions to these
other methods are available below.
- XML is meant for computers, with a frequent side-effect of being
readable to humans. The ugliness of RDF/XML is meant for computers.
Humans are not meant to read the RDF/XML, but they can still do so
with little difficulty past the initial "code shui" gagging.
- Much like XML has its own set of rules (concerning namespaces,
element names, entities, etc.), RDF/XML adds to those rules. As such,
you can't just start adding XML to an RDF/XML document - there's a
good chance it'll no longer be valid RDF. All XML in an RDF/XML
document must conform to the concept of "triples" (explained below in
some of the beginning links).
- RDF/XML is still evolving. Some people involved with RDF are more
interested in implementations demonstrating the use of RDF than
solutions to some of the stickier problems that may or may not exist.
Realize that you're getting in "early" to an entirely different
ball game.
It was only after repeatedly beating the above concepts into my
angry little brain that doors started opening to what RDF could really
do. I still think RDF/XML is ugly as sin - but the powerful benefits of
its scars outweigh the pain. Think "beautiful duckling, ugly swan".
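For a rough feel of why the line-based notations come across as "prettier", here is a minimal Python sketch that prints two statements in N-Triples style, one statement per line. The subject URI, foaf:nick, and the FOAF Person URI are borrowed from the examples later in this document; the helper names `term` and `to_ntriples` are made up for illustration, and the escaping covers only the most common cases, so treat it as a sketch rather than a conforming serializer.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def term(value, is_literal=False):
    """Format one term: URIs in angle brackets, literals in double quotes."""
    if is_literal:
        # Escape only backslash, quote, and line breaks (not the full rule set).
        escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
        return '"' + escaped + '"'
    return "<" + value + ">"


def to_ntriples(statements):
    """Render (subject, predicate, object, object_is_literal) tuples."""
    return "\n".join(
        "%s %s %s ." % (term(s), term(p), term(o, lit))
        for s, p, o, lit in statements
    )


statements = [
    ("http://www.disobey.com/#Morbus", FOAF + "nick", "Morbus", True),
    ("http://www.disobey.com/#Morbus", RDF_TYPE, FOAF + "Person", False),
]
print(to_ntriples(statements))
```

Each printed line is one complete subject, predicate, object statement ending in a period, which is most of what there is to the format.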
Note: Anything that has been emphasized is quoted from
the source in question.
Recommended Reading
When I'm interested in something, I try to read everything I can find
on the subject. Thus, I started with a scant few bookmarks suggested by
others, started adding more (oh lord, many more), and as I
write this paragraph, have merely a few more sites of interest to go.
Below, I've collected and annotated some sites I find absolutely
requisite to "getting" all this. They're in order of reading /
comprehension level.
- How Google beat
Amazon and Ebay to the Semantic Web - A work of fiction. A
Semantic Web scenario. A short feature from a business magazine
published in 2009. It's hard to believe Google - which is now the
world's largest single online marketplace - came on the scene only a
little more than 8 years ago, back in the days when Amazon and Ebay
reigned supreme. So how did Google become the world's single largest
marketplace? Well, the short answer is "the Semantic Web"...
- Making a
Semantic Web - If you've paid any attention to the web
standards discussions, you may have heard the phrase "Semantic Web",
or perhaps even been pressured to use standards with names like
"Dublin Core Metadata" or "RDF". If you've attempted to read any of
the available documentation on these topics, you were probably
intimidated by terms such as "reify" and all sorts of artificial
intelligence concepts. This document attempts to explain what all of
this chatter really means, and help you decide which parts you should
care about and why. I have tried to use common-sense, real-world
examples and stay away from complicated terminology.
- The Semantic Web
In Breadth - A soft introduction to the Semantic Web, covering the
basics of URIs (the things used to give any resource an identifiable
name), triples (the subjects, verbs, and objects that make up RDF
statements), and how they appear in both XML and RDF/XML. Also talks
briefly of RDF Schemas and DAML+OIL (both used to describe the RDF
language you create or use).
- The Semantic Web: An
Introduction - An excellent beginner, this document is
designed as being a simple but comprehensive introductory publication
for anybody trying to get into the Semantic Web: from beginners
through to long time hackers. Its aims of "comprehensive" and
"introductory" were met wonderfully, going further in-depth than the
in-breadth article above, including a decent "Further Reading".
- The Semantic
Web (for Web Developers) - A good document from the "In Breadth"
creator (above) on how to bring the Semantic Web into our web
applications. It starts with some of the basics (repetition is good,
grasshopper), jumps into some comparison with XML and SOAP Web
Services, and brings forth some applications using RDF right now. Also
introduces concepts such as grouping, aggregation, logic and
inference.
- Semantic Web
Points - The Semantic Web is an extension of the current web
in which information is given well-defined meaning, better enabling
computers and people to work in cooperation. The following is a
collection of key speaking points and phrases on the Semantic Web
given by members of the W3C Team or individuals associated with the
Semantic Web Education and Outreach interest group. Some
duplicates and dead links, but handy nonetheless.
- The Semantic Web,
Taking Form - A generic piece on the Semantic Web, focusing on
concept discussion as opposed to examples of triples or RDF/XML. Talks
a little bit about schemas and reinforces some of the goals and
concerns of RDF. Hence, the only real requisite for posting RDF
data is to make sure that it parses correctly, and requires minimal
human intervention. This is a major point if we want to be able to
create a machine readable Web of data!
- RDF Primer - Part
of the larger W3 RDF site, this
work-in-progress is a technical introduction to the Resource
Description Framework and is (comparatively) light reading. This
Primer is designed to provide the reader the basic fundamentals
required to effectively use RDF in their particular applications.
It starts with an introduction of concepts behind RDF and the Semantic
Web, jumps into some triples and how they'd look in RDF/XML, and chats
a bit about RDF Schemas and defining your own vocabulary. More
information about RDF/XML is available in the RDF/XML
Syntax Specification.
- Semantic Web and RDF
Hints and Tips - It is important that on the Semantic Web,
people produce data that is clean and interoperable. Some RDF
techniques can currently only be learned through the RDF community,
through hours of research, or through implementation experience, so
this is an attempt to gather some useful but quick hints and tips into
one place. Covers URI construction, RDF usage, and Schema/model
design.
Tips, Snippets, and Answers
These are in addition to those presented in Semantic
Web and RDF Hints and Tips. Most of these were yanked from my
questions on mailing lists, chat logs, and private correspondence. If
you've got any more suggestions to add below, email
the author of this document:
- The three parts of an RDF statement are officially known as
subject, predicate, and object. You may occasionally find these
referred to as subject, verb, and object (SVO), or rarely, an object
may be called a value. You want to think of them as subject,
predicate, object.
- The object of an RDF statement can be a literal string (a piece of
text like "hat" or "17"), or it can be a resource identified by a URI
(such as http://xmlns.com/foaf/0.1/Person
which identifies the "Person" class of the FOAF vocabulary).
- Use any method to create RDF files, and then convert them to XML
RDF later on. If need be, model your languages using a simple
notation, and then convert later. For example, by using Notation3 or
NTriples, you can increase your productivity. Some people prefer to
use graphical interfaces than text, which is also acceptable.
(Source: Semantic Web and RDF Hints and Tips.)
- Basically, an RDF/XML document is, to some extent, a collection of
nodes. The element for creating a node is <rdf:Description>. You know
that all triples are made up of subject, predicate, and object, so
the basic structure, when the object is a literal, is:
<rdf:Description rdf:about="subject">
  <predicate>object</predicate>
</rdf:Description>
and, when the object is a URI:
<rdf:Description rdf:about="subject">
  <predicate rdf:resource="object" />
</rdf:Description>
- RDF/XML with multiple statements, converted to triples:
<!-- the subject is http://www.disobey.com/#Morbus -->
<rdf:Description rdf:about="http://www.disobey.com/#Morbus">
  <foaf:nick>Morbus</foaf:nick>
  <!-- triple: <...#Morbus> foaf:nick "Morbus" -->
  <foaf:email rdf:resource="mailto:morbus@disobey.com" />
  <!-- triple: <...#Morbus> foaf:email <mailto:morbus@disobey.com> -->
  <foaf:gender rdf:resource="http://xmlns.com/foaf/0.1/Male" />
  <!-- triple: <...#Morbus> foaf:gender <http://xmlns.com/foaf/0.1/Male> -->
</rdf:Description>
- More RDF/XML conversions: Basically, the rdf:type predicate is
special. When I have something like the below:
<rdf:Description rdf:about="http://www.disobey.com/#Morbus">
  <rdf:type rdf:resource="http://xmlns.com/.../Person" />
</rdf:Description>
I get the triple:
<http://www.disobey.com/#Morbus> rdf:type <http://xmlns.com/.../Person> .
But since rdf:type is used all the time, RDF/XML lets you
abbreviate the above to:
<foaf:Person rdf:about="http://www.disobey.com/#Morbus" />
It gives you the same triple (note that it's an empty, self-closed
element). So, if you use a different element name instead of
rdf:Description, that name gives the rdf:type of the node. If you
give the element separate opening and closing tags, you can put more
predicates and objects inside, e.g.:
<foaf:Person rdf:about="http://www.disobey.com/#Morbus">
  <foaf:nick>Morbus</foaf:nick>
</foaf:Person>
The matching triples are:
<http://www.disobey.com/#Morbus> rdf:type <http://xmlns.com/.../Person> .
<http://www.disobey.com/#Morbus> foaf:nick "Morbus" .
And that's the same as:
<rdf:Description rdf:about="http://www.disobey.com/#Morbus">
  <rdf:type rdf:resource="http://xmlns.com/.../Person" />
  <foaf:nick>Morbus</foaf:nick>
</rdf:Description>
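The tips above boil down to a mechanical mapping from RDF/XML to triples, which can be sketched in a few lines of Python using only the standard library. This is a rough illustration, not a real RDF parser: it handles only the flat patterns shown above (node elements with rdf:about, and properties holding either a literal or an rdf:resource). The helper names `wrap`, `uri`, and `triples` are made up for this sketch, and the snippets are wrapped in an rdf:RDF root with namespace declarations (an assumption, since the examples above leave them out). The full FOAF Person URI is the one given in the tip about objects earlier in this list.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"


def wrap(body):
    """Add the rdf:RDF root and namespace declarations the snippets omit."""
    return ('<rdf:RDF xmlns:rdf="' + RDF + '" xmlns:foaf="' + FOAF + '">'
            + body + "</rdf:RDF>")


def uri(tag):
    """Turn ElementTree's "{namespace}local" name into "<namespacelocal>"."""
    namespace, local = tag[1:].split("}", 1)
    return "<" + namespace + local + ">"


def triples(body):
    """Return the set of (subject, predicate, object) strings in body.

    Only the flat patterns shown above are handled; real RDF/XML
    allows many more forms (nesting, blank nodes, rdf:ID, ...).
    """
    found = set()
    for node in ET.fromstring(wrap(body)):
        subject = "<" + node.get("{" + RDF + "}about") + ">"
        if node.tag != "{" + RDF + "}Description":
            # Typed node: the element name itself supplies the rdf:type.
            found.add((subject, "<" + RDF + "type>", uri(node.tag)))
        for prop in node:
            resource = prop.get("{" + RDF + "}resource")
            if resource is not None:
                obj = "<" + resource + ">"            # object is a URI
            else:
                obj = '"' + (prop.text or "") + '"'   # object is a literal
            found.add((subject, uri(prop.tag), obj))
    return found


LONG_FORM = """
<rdf:Description rdf:about="http://www.disobey.com/#Morbus">
  <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person" />
  <foaf:nick>Morbus</foaf:nick>
</rdf:Description>
"""

SHORT_FORM = """
<foaf:Person rdf:about="http://www.disobey.com/#Morbus">
  <foaf:nick>Morbus</foaf:nick>
</foaf:Person>
"""

# Compare the two spellings, then list the statements one per line.
print(triples(LONG_FORM) == triples(SHORT_FORM))
for statement in sorted(triples(SHORT_FORM)):
    print(" ".join(statement), ".")
```

The first print should come out True: the rdf:Description form and the abbreviated foaf:Person form yield the same two statements, which is the point of the abbreviation. For anything beyond toy input, one of the dedicated RDF parsers mentioned below is the better choice.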
Where To Go From Here
So you've understood the material above, and you want to do more,
know more, build a better Web, blabberty blah blah. Below are some
resources that should get you started, be they current RDF
implementations of data and projects, launching off points for more junk
to read, or places to ask questions, discuss coding, etc.
- Creative Commons -
Metadata is "data about data." A library card catalog is an
example of everyday metadata you are probably familiar with. In the
same way that a card catalog provides records of the "authors,
subjects and titles" of books, Creative Commons' metadata will
represent the details of licensed works that reside on the "shelf" of
the web.
- Dave
Beckett's RDF Resource Guide - The mother lode of RDF links,
articles, tutorials, and more, all categorized, and updated
frequently. There's probably a ton of good stuff in here I should have
linked to in this document - but I didn't have that much time or
sanity.
- FOAF or "Friend of a Friend"
- FOAF is the primary reason this document exists. Much like anything
else in the Semantic Web, FOAF is still being developed, expanded and
documented, but the following resources should prove helpful:
Finding
friends with XML and RDF (an IBM developerWorks article), FOAF Web View (a web-based
viewer of known FOAF files), FOAF-a-matic
(an automated FOAF file creator), Syntax
Tips (on creating valid FOAF files), and usefulinc's
FOAF site (adding people, digitally signing your FOAF file, etc.).
- MusicBrainz -
MusicBrainz is a community music metadatabase that attempts to
create a comprehensive music encyclopedia. Automatic Audio CD and
digital audio track identification using community supplied and
maintained data is the first goal of MusicBrainz. More
information about their RDF
format is available here, and here's
an example of their RDF/XML.
- Primer: Getting
into RDF & Semantic Web using N3 - RDF/XML is the default
implementation of the core RDF concepts, but there are simpler, more
readable formats available. Notation 3 (or N3) is one such, and this
is a quick tutorial in the subject with a decent amount of
examples. See also: A Rough
Guide to N3.
- RDF APIs in
Perl - I'm a big Perl fan, but I still haven't found a
well-documented, truly cross-platform RDF parser that I'd want to call
home. Whilst I can read RDF/XML using an XML parser (like XML::Simple
or XML::Parser), I'd like more power (and more documentation and more
code examples and ...). Of the parsers available, two predominantly
stand out for me: RDFStore,
which is based around the expat XML parser and Redland,
which has a Perl interface to its C library. RDF::N3
looks tasty as well.
- RDF in HTML -
Since there is no one standardized approach for associating RDF
compatible metadata with HTML, and since this is one of the most
frequently asked questions on the RDF mailing lists, this document is
provided as an outline of some RDF-in-HTML approaches that the author
is aware of. Covers eight different approaches of merging RDF
data in common HTML webpages, and then offers a personal opinion
about which two should be commonly supported.
- RDF Site Summary (RSS) 1.0
- An RDF version of the popular syndication format RSS (which has
multiple versions ranging from v0.90 to v0.94 and is collectively
referred to as v0.9x), RSS 1.0 is probably the largest in-use example
of RDF available, even though most RSS parsers only treat it as
vanilla XML. I've written and maintain my own RSS aggregator named
AmphetaDesk
- many others exist for many types of operating systems.
- RDFWeb - RDFWeb explores
some interconnected applications of the semantic web. It features a
distributed RDF-indexed photo archive, the friend-of-a-friend
information linking system, and various other fun things. Also
check out the introduction,
the rdfweb-dev
mailing list, #rdfig on
irc.openprojects.net, and the matching chump bot.
Of special interest is the co-depiction
demo, which is a visual six-degrees hack.
- Understanding the
Striped RDF/XML Syntax - This document provides a brief
introduction to the underlying structure of the RDF/XML 1.0 graph
serialization syntax. The intended audience is mainly content and tool
developers familiar with XML basics, and with the RDF model, who want
a minimalistic understanding of RDF's XML syntax, so they can read and
write RDF/XML with more confidence.
Sites I Almost Didn't Link To
The below are resources that may not be relevant to your learning of
the Semantic Web and RDF. I've included them here mainly for my own
reference, as well as to serve as a curiosity factor for those who want
to dig a bit deeper. If you're bored with the above, the sites below
will be even less interesting.
- Annotated
DAML+OIL Ontology Markup - If you're going to design your own
RDF namespaces to describe your data, you'll also want to look into
describing your language so that it's machine-processable. This quick
walkthru describes many of the features available, including classes,
properties and their restrictions, with examples throughout.
- Expressing
Qualified Dublin Core in RDF/XML - Dublin Core metadata is
used to supplement existing methods for searching and indexing
Web-based metadata, regardless of whether the corresponding resource
is an electronic document or a "real" physical object. Dublin Core
metadata provides card catalog-like definitions for defining the
properties of objects for Web-based resource discovery systems.
(Source: Dublin Core FAQ.)
- Frequently Asked Questions
About RDF - Not many questions (and with two #3's, no less), but
there's a few bullet points of why you should be interested in RDF, as
well as a brief history with influences. Other than that, a couple of
links here and there, but as a definitive FAQ of design or
implementation issues, it's useless.
- Semantic
Web Road Map - A historical document from 1998, this is Tim
Berners-Lee's road map for the future, an architectural plan
untested by anything except thought experiments. While this
shouldn't be used as authoritative for RDF nowadays, it's a decent read
on the beginnings, reasonings and design goals of the Semantic Web,
and the technology that should make it happen. Historical note
of interest, from Ian Glendinning: Whilst Tim Berners-Lee had the
vision to see that emergent internet technologies were making the
semantic web possible, the idea of calling the sum of knowledge in the
world "The Semantic Web" is in fact earlier, and comes from French
Post-Modernist Philosopher, Michel Foucault in "Les Mots et Les
Choses" (1966) aka "The Order of Things" (1970).
Credits
Thanks to Dave Beckett, Daniel Biddle, Ian Glendinning, Sean B.
Palmer, and Aaron Swartz, for looking this document over and giving
scathing reviews of disgust and distemper, along with helpful thoughts
and clarifications. Visual Credits: 1. Line drawing of Pljushkin, from a
story by Russian writer N. Gogol, 2. A Teenage Mutant Ninja Turtle,
Raphael in particular, 3. A photograph of Mister Ed, from the classic
sitcom "Mister Ed", 4. "Still Falling" by Antony Gormley 1983, at
Portland, 5. Unknown.
Version History
- 2002-09-09: Added note about origins of the term "Semantic Web", added
Making a Semantic Web,
and A Rough Guide to N3, and
clarified the reasonings why I didn't like RDF/XML in the first place (in
an attempt to meet new readers on level ground, as well as to stress I no
longer feel that way), and now correctly validates under XHTML/CSS.
- 2002-08-20: First draft posted (no archived version - lost it).
Morbus Iff (aka Kevin Hemenway), 2002-08