uri crisis - what do URIs identify?
There are nifty approaches around that try to cure the identity crisis (like this one), but they all fall short because the problem runs much deeper.
I usually then write one email saying "uri crisis again" to point out that the problem is unsolved.
So, what is the Uri crisis about?
Basically, this text written by TimBL describes it best:
identify?
- A whale
- "Moby Dick or the Whale" by Herman Melville
- A web page on Amazon offering a book for sale
- A URI string
- All the above
Have you read the thing it identifies?
It is part of the article "What do HTTP URIs identify?"
So we don't have good URIs to identify people, concepts, books, etc., because a URI can have more than one meaning.
This is explained very well in David Booth's article
"Four Uses of a URL: Name, Concept, Web Location and Document Instance", which comes to this conclusion:
One point seems clear. In using URLs to identify concepts (such as "http://x.org/love"), we need conventions for denoting each of these four things: name, concept, Web location and document instance.
Then there was also an article by Steve Pepper on how the semantics of Topic Maps could help, titled "Curing the Web's Identity Crisis".
In his introduction Pepper writes:
In an important recent article on XML.com entitled "Identity Crisis" [Clark 2002], Kendall Clark addresses the issue of "identity" as it pertains to the World Wide Web. Clark quotes the description of the Web by the W3C's Technical Architecture Group (TAG) in Architecture of the World Wide Web [Jacobs 2002], as a "universe of resources", where "resource" is to be understood according to the definition given in [RFC 2396] as being "anything that has identity". Clark points out that the concept of "identity" itself is nowhere defined and moreover is severely problematic.
He cites the article "Identity Crisis" by Kendall Grant Clark. In his introduction to the problem, Clark says:
The Identity of Resources.
In the APW's view, the Web is a "universe of resources". So far, so good. But what is a resource? The APW adopts the definition of resource from RFC 2396, a definition which has always made me uneasy, though probably because I'm still more inclined to think of these things like a philosopher than like a programmer or software system architect.
So, it is a philosophical problem. Ah. Now we are getting somewhere. Sadly, every time leobard tried to get a philosopher on this track, saying things like "I think that URIs will change the way we identify abstract concepts, a change that is fundamental to our constructivistic worldview", the philosophers answered: you young nerd, read 10 kilos of philosophical books and come back. Sure - but I won't spend any time on that.
So - face it. The meaning of what a URI identifies is not defined. Hence, when TimBL announces he has a URI now and a FOAF file - what does this mean?
That we should identify the concept "Person named Tim Berners-Lee" using this URI?
Perhaps, and perhaps that's the way it works: you explicitly say that you identify the concepts you have in your mind using the URI you find most appropriate. When other humans copy your behaviour (and copy/paste your URI), URIs will identify concepts. Hm, perhaps.
So, next time I shout "You are facing the URI crisis", don't answer "I never heard there was a crisis" or "are we out of URIs?". Think of a solution instead.
leobard - 18. Nov, 11:12
Here is a solution: Use Public Names
So, I invented a semantic data model that is backward compatible with UTF-8 URI, called Public Names:
http://laurentszyster.be/blog/public-names/
and which can be used to identify unambiguously a resource's name, not just its location. For instance, your article's title, encoded as a Public Name, would yield the following:
15:6:crisis,3:uri,,30:4:URIs,2:do,8:identify,4:what,,
which represents unambiguously the articulated set of 8-bit byte strings:
(("uri", "crisis"), ("what", "do", "URIs", "identify"))
and captures most of the semantic relations expressed between the text strings that make up your title:
"uri" is related to the set ("uri", "crisis")
"crisis" is related to the set ("uri", "crisis")
("uri", "crisis") is related to the set (("uri", "crisis"), ("what", "do", "URIs", "identify"))
etc ...
in a much simpler and practical way than RDF/XML named graphs.
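The encoded title above reads like nested netstrings, where every item is written as `<length>:<bytes>,`. Here is a minimal Python sketch of that idea. It is inferred only from the title example in this comment, not from the Public Names specification, so treat the rules as assumptions: a group made only of words is sorted by byte value, any other group keeps the order it was given in, and lengths are counted in bytes. The names `netstring` and `encode` are made up for illustration.

```python
def netstring(payload: bytes) -> bytes:
    """Wrap a byte string as <length>:<payload>, (length counted in bytes)."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def encode(item) -> bytes:
    """Encode a word (str) or a group (tuple or list of words or groups).

    Assumption: a group that contains only words is sorted by byte value;
    a group that contains other groups keeps its given order.
    The real Public Names rules may differ.
    """
    if isinstance(item, str):
        return item.encode("utf-8")
    parts = [encode(x) for x in item]
    if all(isinstance(x, str) for x in item):
        parts.sort()
    return b"".join(netstring(part) for part in parts)


title = (("uri", "crisis"), ("what", "do", "URIs", "identify"))
public_name = encode(title).decode("utf-8")
print(public_name)
```

Under these assumptions the script prints the same string as quoted above for the article title.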
Regards,
URIs are symbols of a global language
Now let's relax and create a quick and easy social convention to mint URIs for things and web pages. We need them!
the same comment with links at:
http://xamde.blogspot.com/2005/11/uri-crisis-solved.html#c113234222460719044
remarks
jcw has suggested something that could scale, but you have to be sure that it will scale up to Wikipedia level. From what I know of Xamde's previous work, I think that Xamde is onto something in this direction. Basically, the idea is: Wikipedia is socially accepted, so we could use it to get URLs that identify concepts like love, etc.
Wikipedia is probably the only single-point solution for this.
So starting something new at http://laurentszyster.be/blog/public-names/ won't help, especially when you have this already, made by others and heavily used. To identify words, you would usually use the WordNet mapping.
http://xmlns.com/wordnet/1.6/dog
ok, also the comment by Xamde in his blog titled "we solved the uri crisis" does not solve the uri crisis as I see it. It just proposes something many (MANY) people have proposed before, but which didn't make it. The social thing has to be identical to the RDF thing, so different URIs in the browser and for RDF are somewhat hindering.
We do need something new
What I try to do with Public Names, Public RDF, the Public Name System and their reference implementation is to answer a tough question:
"how do you store a web of textual information?".
And as you pointed out, the hypertext links we use at the moment don't mean anything; they are arbitrary pointers, meaningless.
The semantic web lacks a model for simple computer systems that can express semantic relations in a URI.
So, what do we humans use to express semantic relations?
We articulate: we use white space and punctuation as we write, pauses and rhymes as we speak, etc. We articulate sets of words, like for instance this painting title, its English translation and the author's name:
Ceci n'est pas une pipe (This is not a pipe) - Magritte
Public Names is a simple protocol to encode articulated sets of 8-bit byte strings and validate them as unambiguous, semantically non dispersed.
For instance, the above short description of the painting can easily be articulated as the Public Name (CRLF and indentation added for readability):
44:
34:4:Ceci,5:n'est,3:pas,4:pipe,3:une,,
22:4:This,1:a,2:is,3:not,,
,
8:Magritte,
The benefit of using a URI like:
http://.../34:4:Ceci,5:n'est,3:pas,4:pipe,3:une,,8:Magritte
is the simplicity of its implementation for metadata producers and consumers, and all the possible applications of meaningful links.
The Public Names protocol implementation itself is trivial, and a simple stack of regular expressions can already do a lot to articulate a single language. For shorter text, like a name or an article title, an even simpler lexer may be practical.
Instead of a meaningless pointer, Public Names users have a well articulated set that expresses semantic relations between pieces of text. They have at last a meaningful pointer which can be validated as well articulated and fed into an index.
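To give an idea of what checking and unpacking such a name could look like on the consumer side, here is a hedged Python sketch of a nested netstring reader. It is not the Public Names reference implementation: the function names are hypothetical, and the rule "anything that does not parse cleanly as netstrings is a plain word" is an assumption, which means a word that itself looks like `3:abc,` would be misread as a group.

```python
def split_netstrings(data: bytes):
    """Return the netstring payloads found in data, or None if data is
    not a clean sequence of <length>:<payload>, items."""
    items, pos = [], 0
    while pos < len(data):
        colon = data.find(b":", pos)
        if colon == -1 or not data[pos:colon].isdigit():
            return None
        end = colon + 1 + int(data[pos:colon])
        if data[end:end + 1] != b",":
            return None
        items.append(data[colon + 1:end])
        pos = end + 1
    return items or None


def decode(data: bytes):
    """Turn an encoded name back into nested tuples of strings."""
    items = split_netstrings(data)
    if items is None:
        return data.decode("utf-8")  # assumed to be a plain word
    return tuple(decode(item) for item in items)


name = b"15:6:crisis,3:uri,,30:4:URIs,2:do,8:identify,4:what,,"
print(decode(name))
```

The nested tuples that come back could then be fed into whatever index is used; a name whose lengths or commas do not line up simply falls back to being one plain string.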
social thing? RDF thing?
What is an RDF thing? A resource, identified by a URIref and described with RDF properties? What is a social thing? A name with some shared understanding attached? What is a browser? A tool to render web pages for given URLs?
Given this, the social thing 'Dog' could be identified by a URI, e.g. 'http://en.wikipedia.org/wiki/Dog'. A browser will return a human-readable description of the concept Dog. Another social thing is the Wikipedia page about dogs. Again, one could suggest the URI 'http://en.wikipedia.org/wiki/Dog' for that. We humans can handle that ambiguity, but RDF can't. So in order to have a usable, unambiguous RDF model, we must use different URIs for different concepts. And we should choose them with care, so that humans who independently search for URIs have few name clashes and lots of overlap. As WordNet and Wikipedia are outstanding central repositories for socially constructed URLs, we would like to re-use them. That's how we came up with the idea to re-use these URIs and simply add a '#' for the concept.
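The '#' convention can be sketched in a few lines of Python. This is only an illustration of the idea described above; the helper names `concept_uri` and `describing_page` are made up, and the sketch assumes the page URL carries no fragment of its own.

```python
def concept_uri(page_url: str) -> str:
    """URI for the concept a page describes: the page URL plus '#'."""
    if "#" in page_url:
        raise ValueError("expected a page URL without a fragment")
    return page_url + "#"


def describing_page(uri: str) -> str:
    """URL of the page to fetch: everything before the '#'."""
    return uri.partition("#")[0]


dog_page = "http://en.wikipedia.org/wiki/Dog"
dog_concept = concept_uri(dog_page)
print(dog_concept)                   # http://en.wikipedia.org/wiki/Dog#
print(describing_page(dog_concept))  # http://en.wikipedia.org/wiki/Dog
```

The two strings differ, so RDF statements about the concept and about the page no longer collide, while a human can still paste either one into a browser and land on the same description.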
Book
I'm going to keep well clear of this discussion, except to say the word "crisis" may be a little strong - there's a lot that can be done with only a vague notion of what URIs actually mean. Anyhow, you may want to check out this book:
http://www.aber.ac.uk/media/Documents/S4B/the_book.html
"Solving" the URI crisis step by step
We need a QUICK and EASY and SHARED way to create and use Semantic Web REPRESENTATIVES for things and concepts of our world.
Wikipedia and WordNet already have what it takes to DEFINE and SHARE representatives of "nearly everything" on this planet. The important thing here is: Everybody KNOWS that information source! So, using Wikipedia is much better than creating your own Wiki and representing everything there (AGAIN!) - nobody would know about your Wiki and nobody would care about it (sorry) and nobody would commit to your concepts and your definitions. The most important thing here is that people SHARE and COMMIT to the URIs (here: URLs) you use!!
2. Now, let's talk about SEMANTICS.
That's often the point that ontology experts and philosophers are interested in. Alas, in most cases they are so interested in semantics that they kill every first working idea at first sight...
First of all, Wikipedia already delivers good semantics for HUMANS. IMHO, that is a good starting point to build up at least human-readable and -usable "data" or even "information" for the Semantic Web. We urgently need that build-up phase to awaken the Semantic Web! At least the use case of human-triggered but still lame "semantic" search should be possible as soon as possible.
3. SEMANTICS for MACHINES.
We do not want to, but we CAN wait until step three to care about MACHINE-enabled INFERENCING and high PHILOSOPHY. The advantage of RDF is that you can come up with multiple descriptions of one resource. So, why not deliver the FORMAL semantics of http://en.wikipedia.org/wiki/Semantic_web IN ADDITION to the English and German informal text?!
If a standardized inferencing language and service becomes available, we DO want to have all the machine-processable, formal semantics and inferences possible!! But then, we can already use the previously created RDF descriptions and relations. They won't change overnight! The formal descriptions are "just" an additional set of statements that are now available, too!
TimBL is right, every URI should better be a URL, too. That way, we can ALREADY use the HTTP protocol to retrieve human-readable descriptions TODAY and that way guarantee a SHARED UNDERSTANDING of the used concepts BETWEEN HUMANS.
As soon as formality and inferencing emerges, the machines can THEN also share and understand OUR concepts and resources of the human race ;-)
Have a nice day!
Sven Schwarz (BrettspielBrowser)
frustrated :(
actually, although i'm really thrilled by the vision of a semantic web and try to use pieces of it every now and then, i always stop working on them when i realize that RDF gives me the grammar, but lacks the words. it's even more frustrating to see a semantic web community happily discussing the big picture and getting all excited about it, while there's little me sitting there wondering whether, within this framework, one will ever be able to unambiguously identify anything.
it seems someone's building a house here and, while noticing that there's something wrong, keeps telling us "I'll decide about the bricks later, but look at those beautiful curtains! and wait until you've seen my china!"
My personal feeling is that all went wrong when they expanded the (more or less) well defined URL into the concept of URIs. I can see the idea behind it, i know what it was supposed to be, but it simply doesn't work i'm afraid. it's too generic. and it would have been good to clearly separate the concept of a URL, a retrievable network location, from anything else.
blue eyed question
Why don't we create - let's say - a wiki at, for example, www.abstract-concepts.org or something like that and put all the abstract concepts like love, friendship, war or whatever on pages there? You could write a little definition for each (and people could add something - hence the wiki) and everybody could use the URIs there. Perhaps we could even use Wikipedia for that?
I understand that this is a little bit different from the concept of "Person named Tim Berners-Lee", but it would at least give us something for the more general concepts, wouldn't it?