Dereferenceable Uniform Resource Identifier
A dereferenceable Uniform Resource Identifier, or dereferenceable URI, is a URI that doubles as a resource retrieval mechanism: it can be resolved using an internet protocol (e.g. HTTP) to obtain a copy or representation of the resource it identifies.
In the context of traditional HTML web pages, this is the normal and obvious way of working: a URI refers to the page, and when it is requested the web server returns a copy of the page. In other contexts, such as XML Schema, the namespace identifier is still a URI, but it serves only as an identifier (i.e. a namespace name) and is not dereferenceable. There is no intention that it can or should be dereferenced. There is even a separate attribute, schemaLocation, which may contain a dereferenceable URI that does point to a copy of the schema document.
In the case of Linked Data, the representation takes the form of a document (typically HTML or XML) that describes the resource that the URI identifies. In either case, the mechanism makes it possible for a user (or software agent) to "follow your nose" to find out more information related to the identified resource.
Background
In computing, identifiers are used to distinguish things and to facilitate data exchange. For example, two US citizens with the same name have different Social Security numbers (SSNs). In a fully distributed system such as the World Wide Web, a URI is used to identify a thing in the world globally. Because the Web's architecture was designed around HTTP, a URI often identifies a web page rather than the thing the page describes. To remove this confusion, URIs that identify things often include a hash (see Formats below). The following example shows the difference between the URL of a person (which usually means that person's homepage) and the URI of a person:
- Dan Connolly's URL is "http://www.w3.org/People/Connolly/". It identifies his homepage, which was created in 1994. If computer A asks computer B "How old is http://www.w3.org/People/Connolly/?", computer B might answer "16" (in the year 2010).
- Dan Connolly's URI is "http://www.w3.org/People/Connolly/#me". It identifies him, a person. If computer A asks computer B "How old is http://www.w3.org/People/Connolly/#me?", computer B might answer "35".
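The relationship between the two identifiers above can be sketched in a few lines of Python. This is only an illustration using the standard library's urllib.parse module; the variable names are hypothetical and not part of any Linked Data tooling.

```python
from urllib.parse import urldefrag

# The URI that identifies the person (taken from the example above).
PERSON_URI = "http://www.w3.org/People/Connolly/#me"

# urldefrag splits a URI into the part before '#' and the fragment after it.
document_url, fragment = urldefrag(PERSON_URI)

print(document_url)  # the URL of the homepage document
print(fragment)      # the fragment that distinguishes the person from the page
```

Removing the fragment from the person's URI yields the homepage URL, so the two identifiers are distinct strings that share one retrievable document.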
Because of the nature of a URI, it can be dereferenced to obtain information about the thing it identifies; hence the term dereferenceable URI. An SSN or a person's name is not dereferenceable because, even though you could search for these strings on the Web, there is no guarantee that the information exists or that it is unambiguous. In other words, there is no canonical way of dereferencing those identifiers. URIs, on the other hand, can be dereferenced by standardized protocols such as HTTP.
Dereferenceable URIs are based on the well-established theory and practice of "data access by reference". This data access and manipulation mechanism is used extensively in general computer programming (e.g. C/C++ pointers) and in database call-level interfaces (e.g. ODBC and JDBC), among others. The term dereferencing describes the act of obtaining a representation of a description of an entity via its URI.
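As a rough illustration of dereferencing over HTTP, the following Python sketch prepares a GET request for one of the URIs used later in this article. The function names and the default "text/html" Accept value are assumptions made for this example; the Accept header is simply one common way for a client to say which representation it would like.

```python
from urllib.request import Request, urlopen


def build_dereference_request(uri: str, accept: str = "text/html") -> Request:
    """Prepare (but do not send) an HTTP GET request for a representation of `uri`."""
    return Request(uri, headers={"Accept": accept}, method="GET")


def dereference(uri: str, accept: str = "text/html") -> bytes:
    """Send the request and return the response body. Requires network access."""
    with urlopen(build_dereference_request(uri, accept), timeout=10) as response:
        return response.read()


# Build the request only; nothing is sent over the network here.
request = build_dereference_request("http://dbpedia.org/resource/Berlin")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling dereference() with the same URI would perform the actual retrieval; what comes back depends on the server and is not shown here.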
In the Semantic Web realm, dereferenceable URIs offer the critical fabric that drives the Giant Global Graph of interconnected data popularly referred to as Linked Data, another term coined by Tim Berners-Lee in his Linked Data Design Note[1] and developed further in other articles such as "Cool URIs for the Semantic Web" by Sauermann and Cyganiak.[2]
Eventually everything will have its own dereferenceable URI,[3] but things that already have URIs and are described in an interoperable way at this moment include:
- People – defined in the FOAF vocabulary. For example, Tim Berners-Lee has the URI "http://www.w3.org/People/Berners-Lee/card#i".
- Organizations – defined in the FOAF vocabulary. For example, W3C has the URI "http://www.w3.org/data#W3C".
- Software projects – defined in the DOAP vocabulary. For example, Tabulator has the URI "http://dig.csail.mit.edu/2005/ajar/ajaw/data#Tabulator".
Formats
Dereferenceable URIs are constructed using one of two forms: hash or slash. The critical thing about either form is that it uses existing Web architecture to preserve the implicit identity (or pointer) function.
Examples of use with entities Berlin and Paris:
- Hash URI examples:
http://linkeddata.openlinksw.com/about#Berlin
or http://linkeddata.openlinksw.com/about#Paris
- Slash URI examples:
http://dbpedia.org/resource/Berlin
or http://dbpedia.org/resource/Paris
In URL syntax, /ID is part of the path and is preserved when the URL is sent to the server, whereas #ID is a fragment identifier that the client does not send to the server, so it plays no part in the server's HTTP redirections.
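The difference between the two forms can be sketched with Python's standard urllib.parse module. The helper names uri_form and request_target are hypothetical, and the classification rule (any URI with a fragment counts as a hash URI) is a simplifying assumption for this example.

```python
from urllib.parse import urlsplit


def uri_form(uri: str) -> str:
    """Classify a URI as 'hash' if it carries a fragment, otherwise 'slash'."""
    return "hash" if urlsplit(uri).fragment else "slash"


def request_target(uri: str) -> str:
    """Return the path (plus query, if any) a client would put in the HTTP request line.

    The fragment is deliberately left out, because clients do not send it.
    """
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


for uri in (
    "http://linkeddata.openlinksw.com/about#Berlin",
    "http://dbpedia.org/resource/Berlin",
):
    print(uri_form(uri), request_target(uri))
```

For the hash URI the server only ever sees /about, so every entity described in that document shares one request; for the slash URI the entity name is part of the path the server receives.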
Summary
In summary, we can establish the following facts:
- A dereferenceable URI is a kind of Uniform Resource Identifier, specifically one that is accessible via a protocol such as HTTP.
- A dereferenceable URI is a kind of reference (as found in existing computer science theory and practice).
References
- [1] Berners-Lee, Tim (2006), Design Note: Linked Data, W3C, retrieved 2008-07-21
- [2] Sauermann, Leo; Cyganiak, Richard (2008), Cool URIs for the Semantic Web, W3C, retrieved 2008-07-21
- [3] Berners-Lee, Tim (2006), Give yourself a URI, DIG, retrieved 2009-01-14
Further reading
- Hash vs Slash URI
- Berners-Lee, T.; Fielding, R.; Masinter, L. (2005), Uniform Resource Identifier, The Internet Society, retrieved 2008-07-21
- Lewis, Rhys (2007), Dereferencing HTTP URIs, W3C, retrieved 2008-07-25