semantic weltbild 2.0: uri crisis - what do URIs identify?

Friday, 18. November 2005

We still don't use the Semantic Web broadly, and I think one problem is that we can't find the right URIs to identify ideas, people, and things: concepts from the real world. The discussion about the URI crisis does not happen at conferences or in articles, but every time somebody proposes a new URI scheme to identify books, life-science terms, etc. Then masses of people flame each other on mailing lists.

There are nifty approaches that try to cure the identity crisis (like this one), but they all fall short because the problem runs much deeper.

I then usually write a single email saying "URI crisis again" to point out that the problem is unsolved.

So, what is the URI crisis about?

Basically, this text written by TimBL describes it best:

What does "http://www.amazon.com/exec/obidos/ASIN/0679600108/qid=1027958807/sr=2-3/ref=sr_2_3/103-4363499-9407855"
identify?

A whale
"Moby Dick or the Whale" by Herman Melville
A web page on Amazon offering a book for sale
A URI string
All the above

When was the thing it identified last changed?
Have you read the thing it identifies?

It is part of the article "What do HTTP URIs Identify?"

So we don't have good URIs to identify people, concepts, books, etc., because a URI can have more than one meaning.

This is explained very well in David Booth's article "Four Uses of a URL: Name, Concept, Web Location and Document Instance", which comes to this conclusion:
One point seems clear. In using URLs to identify concepts (such as "http://x.org/love"), we need conventions for denoting each of these four things: name, concept, Web location and document instance.

Then there was also an article by Steve Pepper on how the semantics of Topic Maps could help, titled "Curing the Web's Identity Crisis".
In his introduction Pepper writes:

In an important recent article on XML.com entitled "Identity Crisis" [Clark 2002], Kendall Clark addresses the issue of "identity" as it pertains to the World Wide Web. Clark quotes the description of the Web by the W3C's Technical Architecture Group (TAG) in Architecture of the World Wide Web [Jacobs 2002], as a "universe of resources", where "resource" is to be understood according to the definition given in [RFC 2396] as being "anything that has identity". Clark points out that the concept of "identity" itself is nowhere defined and moreover is severely problematic.

He cites the article "Identity Crisis" by Kendall Grant Clark. In his introduction to the problem, Clark says:
The Identity of Resources.
In the APW's view, the Web is a "universe of resources". So far, so good. But what is a resource? The APW adopts the definition of resource from RFC 2396, a definition which has always made me uneasy, though probably because I'm still more inclined to think of these things like a philosopher than like a programmer or software system architect.

So, it is a philosophical problem. Ah. Now we are getting somewhere. Sadly, every time leobard tried to get a philosopher on this track, saying things like "I think that URIs will change the way we identify abstract concepts, a change that is fundamental to our constructivist worldview", the philosophers answered: you young nerd, read 10 kilos of philosophical books and come back. Sure, but I won't spend any time on that.

So, face it. The meaning of what a URI identifies is not defined. Hence, when TimBL announces that he now has a URI and a FOAF file, what does this mean?

That we should identify the concept "Person named Tim Berners-Lee" using this URI?

Perhaps, and perhaps that's the way it works: you explicitly say that you identify the concepts you have in your mind using the URI you find most appropriate. When other humans copy your behaviour (and copy/paste your URI), URIs will identify concepts. Hm, perhaps.

So, next time I shout "You are facing the URI crisis", don't answer "I never heard there was a crisis" or "are we out of URIs?"; think of a solution instead.

leobard - 18. Nov, 11:12

12 comments

jcw - 18. Nov, 13:20

blue eyed question

I'm pretty new to the semantic web and perhaps I'm missing something here, so please forgive me if this is a stupid question, but:

Why don't we create, let's say, a wiki at for example www.abstract-concepts.org or something like that and put all the abstract concepts like love, friendship, war or whatever on pages there? Then you could write a little definition for each (and people could add something, hence the wiki) and everybody could use the URIs there. Perhaps we could even use Wikipedia for that?

I understand that this is a little bit different from the concept of "Person named Tim Berners-Lee", but it would at least give us something for the more general concepts, wouldn't it?

laurentszyster - 18. Nov, 18:52

Here is a solution: Use Public Names

As Tim Berners-Lee himself noted, Uniform Resource Identifiers (URIs) do not carry any meaning. The fact is that URIs were designed to be ... Uniform Resource Locators (URLs), not identifiers.

So, I invented a semantic data model that is backward compatible with UTF-8 URI, called Public Names:

http://laurentszyster.be/blog/public-names/

and which can be used to identify unambiguously a resource's name, not just its location. For instance, your article's title, encoded as a Public Name, would yield the following:

15:6:crisis,3:uri,,30:4:URIs,2:do,8:identify,4:what,,

which represents unambiguously the articulated set of 8-bit byte strings:

(("uri", "crisis"), ("what", "do", "URIs", "identify"))

and captures most of the semantic relations expressed between the text strings that make up your title:

"uri" is related to the set ("uri", "crisis")

"crisis" is related to the set ("uri", "crisis")

("uri", "crisis") is related to the set (("uri", "crisis"), ("what", "do", "URIs", "identify"))

etc ...

in a much simpler and more practical way than RDF/XML named graphs.
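
The encoding in the example above can be read as nested netstrings (`<length>:<bytes>,`). The following is only a minimal sketch of that reading, inferred from the single example in this comment and not taken from the Public Names specification. It assumes that the words inside one set are sorted in byte order and that the sets themselves keep the order in which they are given; the function names are made up for illustration.

```python
def netstring(payload: bytes) -> bytes:
    """Wrap a byte string as <length>:<payload>, (a netstring)."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def encode_group(words: list[str]) -> bytes:
    """Encode one set of words: netstrings in byte order, concatenated.

    Sorting by byte order is an assumption inferred from the example.
    """
    encoded = sorted(word.encode("utf-8") for word in words)
    return b"".join(netstring(word) for word in encoded)


def encode_groups(groups: list[list[str]]) -> bytes:
    """Encode several sets, keeping the order in which they were given."""
    return b"".join(netstring(encode_group(group)) for group in groups)


title = [["uri", "crisis"], ["what", "do", "URIs", "identify"]]
print(encode_groups(title).decode("utf-8"))
# prints 15:6:crisis,3:uri,,30:4:URIs,2:do,8:identify,4:what,,
```

Under these assumptions the sketch reproduces the string quoted for the article title. Whether it matches the protocol in other cases is not something this one example can confirm.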

Regards,

xamde - 18. Nov, 20:29

URIs are symbols of a global language

While thinking about the right way to encode RDF generated from our Semantic Wikipedia, I happened to stumble into the URI crisis again. For Wikipedia, we solved it more or less, but not in general. The general solution is even simpler: URIs are symbols, RDF is the grammar (s-p-o). The meaning of URIs depends completely on the social process around them. If I create a URI for myself, then it might have meaning for me; if millions of people agree to use the same URI for e.g. Google, then the URI means Google. It depends on the people whether it means Google, the website, or Google, the company. It's not the choice of semantic web researchers. But (!) we can give good advice on what to use for what, in order to speed up the URI-consensus process. Ah, and then it's funny: some URIs, if you type them into your browser, show you a web page. That's about as funny as doing a Google query on a word spelled backwards and wondering what comes up. The meaning of a URI has no more relation to the web page that might come up than the one we assign to it.

Now let's relax and create a quick and easy social convention to mint URIs for things and web pages. We need them!

the same comment with links at:

http://xamde.blogspot.com/2005/11/uri-crisis-solved.html#c113234222460719044

leobard - 20. Nov, 09:57

remarks

Thank you for the comments.

jcw has suggested something that could scale, but you have to be clear that it has to scale up to Wikipedia level. From what I know of Xamde's previous work, I think that Xamde is up to something in this direction. Basically, the idea is: Wikipedia is socially accepted, so we could take it to have URLs that identify concepts like love, etc.

Wikipedia is probably the only single-point solution for this.

So starting something new at http://laurentszyster.be/blog/public-names/ won't help, especially when you have this already, made by others and heavily used. To identify words, you would usually use the WordNet mapping:
http://xmlns.com/wordnet/1.6/dog

OK, the comment by Xamde in his blog titled "we solved the uri crisis" also does not solve the URI crisis as I see it. It just proposes something many (MANY) people have proposed before, but which didn't make it. The social thing has to be identical to the RDF thing, so different URIs in the browser and for RDF are somewhat hindering.

Laurent Szyster (guest) - 21. Nov, 20:39

We do need something new

I'm not trying to map words, I'm not a taxonomist :-)

What I try to do with Public Names, Public RDF, the Public Name System and their reference implementation is to answer a tough question:

"how do you store a web of textual information?".

And as you pointed out, the hypertext links we use at the moment don't mean anything; they are arbitrary pointers, meaningless.

The semantic web does lack a model for a simple computer system that can express semantic relations in a URI.

So, what do we humans use to express semantic relations?

We articulate: we use white space and punctuation as we write, pauses and rhymes as we speak, etc. We articulate sets of words, like for instance this painting title, its English translation and the author's name:

Ceci n'est pas une pipe (This is not a pipe) - Magritte

Public Names is a simple protocol to encode articulated sets of 8-bit byte strings and validate them as unambiguous, semantically non dispersed.

For instance, the above short description of the painting can easily be articulated as the Public Name (CRLF and indentation added for readability):

44:
34:4:Ceci,5:n'est,3:pas,4:pipe,3:une,,
22:4:This,1:a,2:is,3:not,,
,
8:Magritte,

The benefit of using a URI like:

http://.../34:4:Ceci,5:n'est,3:pas,4:pipe,3:une,,8:Magritte

is the simplicity of its implementation for metadata producers and consumers and all the possible applications of meaningful links.

The Public Names protocol implementation itself is trivial, and a simple stack of Regular Expressions can already do a lot to articulate a single language. For shorter text, like a name or an article title, an even simpler lexer may be practical.

Instead of a meaningless pointer, Public Names users have a well articulated set that expresses semantic relations between text. They have at last a meaningful pointer which can be validated as well articulated and fed into an index.
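
To give an idea of how small a reader for such names could be, here is a hedged sketch of the reverse direction: splitting a string of netstring-style items (`<length>:<bytes>,`) back into its parts and rejecting input that is not well formed. It is not the Public Names reference implementation and does not perform its semantic validation. The function name is hypothetical, and the input is built from the first title group and the author item of the painting example, with the trailing comma that a netstring reading requires.

```python
def split_netstrings(data: bytes) -> list[bytes]:
    """Split a concatenation of <length>:<payload>, items into payloads."""
    items = []
    pos = 0
    while pos < len(data):
        colon = data.index(b":", pos)      # ValueError if no length prefix
        length = int(data[pos:colon])      # ValueError if not a number
        start, end = colon + 1, colon + 1 + length
        if data[end:end + 1] != b",":
            raise ValueError(f"item at byte {pos} is not terminated by ','")
        items.append(data[start:end])
        pos = end + 1
    return items


name = b"34:4:Ceci,5:n'est,3:pas,4:pipe,3:une,,8:Magritte,"
title_group, author = split_netstrings(name)
print(split_netstrings(title_group))  # the five words of the French title
print(author)                         # b'Magritte'
```

The same function is applied twice, once to the whole name and once to the title group, because the format itself does not say whether a payload is a plain word or a nested set; the caller has to know the expected depth.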

Max Völkel (guest) - 22. Nov, 14:11

social thing? RDF thing?

Leo said: 'The social thing has to be identical to the RDF thing, so different URIs in the browser and for RDF are somewhat hindering.'

What is an RDF thing? A resource, identified by a URIref and described with RDF properties? What is a social thing? A name with some shared understanding attached? What is a browser? A tool to render web pages for given URLs?

Given this, the social thing 'Dog' could be identified by a URI, e.g. 'http://en.wikipedia.org/wiki/Dog'. A browser will return a human-readable description of the concept Dog. Another social thing is the Wikipedia page about dogs. Again, one could suggest the URI 'http://en.wikipedia.org/wiki/Dog' for that. We can handle that, but RDF can't. So in order to have a usable, unambiguous RDF model, we must use different URIs for different concepts. And we should choose them with care, so that humans who independently search for URIs have few name clashes and lots of overlap. As WordNet and Wikipedia are outstanding central repositories for socially constructed URLs, we would like to re-use them. That's how we came up with the idea to re-use these URIs and simply add a '#' for the concept.
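
The '#' idea can be sketched in a few lines. This is only an illustration under assumptions: the comment does not say what, if anything, follows the '#', so the fragment name `concept` and both function names below are hypothetical. The point it shows is that the concept URI and the page URL stay distinct for RDF while the page remains recoverable by stripping the fragment.

```python
from urllib.parse import urldefrag

CONCEPT_FRAGMENT = "concept"  # hypothetical fragment name, not from the post


def concept_uri(page_url: str) -> str:
    """Derive a URI for the concept from the URL of the page describing it."""
    page, _ = urldefrag(page_url)  # drop any fragment already present
    return f"{page}#{CONCEPT_FRAGMENT}"


def page_url_of(uri: str) -> str:
    """Recover the retrievable page URL by stripping the fragment."""
    return urldefrag(uri).url


dog_page = "http://en.wikipedia.org/wiki/Dog"
dog_concept = concept_uri(dog_page)
print(dog_concept)               # http://en.wikipedia.org/wiki/Dog#concept
print(page_url_of(dog_concept))  # http://en.wikipedia.org/wiki/Dog
```

Because a browser does not send the fragment to the server, typing the concept URI would still bring up the human-readable Wikipedia page.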

danja - 20. Nov, 11:35

Book

Hi Leo,
I'm going to keep well clear of this discussion, except to say the word "crisis" may be a little strong - there's a lot that can be done with only a vague notion of what URIs actually mean. Anyhow, you may want to check out this book:

http://www.aber.ac.uk/media/Documents/S4B/the_book.html

BrettspielBrowser - 21. Nov, 10:41

"Solving" the URI crisis step by step

1. First of all the SYMBOLIC level: We need symbols (URIs) to describe, annotate, comment on things. For that, I agree heavily with xamde (and of course also with jcw and leobard), that is:

We need a QUICK and EASY and SHARED way to create and use Semantic Web REPRESENTATIVES for things and concepts of our world.

Wikipedia and Wordnet have already what it takes to DEFINE and SHARE representatives of "nearly everything" on this planet. The important thing here is: Everybody KNOWS that information source! So, using Wikipedia is much better than creating your own Wiki and representing everything there (AGAIN!) - nobody would know about your Wiki and nobody would care about it (sorry) and nobody would commit on your concepts and your definitions. The most important thing here is, that people SHARE and COMMIT on the URIs (here: URLs) you use!!

2. Now, let's talk about SEMANTICS.

That's often the point that ontology experts and philosophers are interested in. Alas, in most cases, they are so interested in semantics that they kill every first working idea at first sight...

First of all, Wikipedia already delivers good semantics for HUMANS. IMHO, that is a good starting point to build up at least human-readable and -usable "data" or even "information" for the Semantic Web. We urgently need that build-up phase to awaken the Semantic Web! At least the use case of human-triggered but still lame "semantic" search should be possible as soon as possible.

3. SEMANTICS for MACHINES.

We do not want to, but we CAN wait until step three to care about MACHINE-enabled INFERENCING and high PHILOSOPHICS. The advantage of RDF is that you can come up with multiple descriptions of one resource. So, why not deliver the FORMAL semantics of http://en.wikipedia.org/wiki/Semantic_web IN ADDITION to the English and German informal text?!

If a standardized inferencing language and service becomes available, we DO want to have all the machine-processable, formal semantics and inferences possible!! But then, we can already use the previously created RDF descriptions and relations. They won't change overnight! The formal descriptions are "just" an additional set of statements that are now available, too!

TimBL is right, every URI should better be a URL, too. That way, we can ALREADY use the HTTP protocol to retrieve human-readable descriptions TODAY and that way guarantee a SHARED UNDERSTANDING of the used concepts BETWEEN HUMANS.

As soon as formality and inferencing emerge, the machines can THEN also share and understand OUR concepts and resources of the human race ;-)

Have a nice day!
Sven Schwarz (BrettspielBrowser)

ossi1967 - 22. Nov, 19:29

frustrated :(

actually, although i'm really thrilled by the vision of a semantic web and try to use pieces of it every now and then, i always stop working on them when i realize that RDF gives me the grammar, but lacks the words. it's even more frustrating to see a semantic web community happily discussing the big picture and getting all excited about it, while there's little me sitting there wondering whether, within this framework, one will ever be able to unambiguously identify anything.

it seems someone's building a house here and, while noticing that there's something wrong, keeps telling us "I'll decide about the bricks later, but look at those beautiful curtains! and wait until you've seen my china!"

My personal feeling is that all went wrong when they expanded the (more or less) well defined URL into the concept of URIs. I can see the idea behind it, i know what it was supposed to be, but it simply doesn't work i'm afraid. it's too generic. and it would have been good to clearly separate the concept of a URL, a retrievable network location, from anything else.

NathalieD - 2. Jun, 23:20

I'm thrilled by the vision of a semantic web but it's a bit complicated for me:P
