semantic weltbild 2.0

I put a newly bought audio-cd "Dendemann-Die Pfütze des Eisbergs" into my iBook drive, to listen and rip it with iTunes. That is the main application of iTunes - right?

What happens:

iTunes blocks and then hangs and fucks up. Great!
ok, pressing F12 (the key to eject cd on iBooks) does nothing. So it fucked up the OS
closing iTunes via alt+apple-esc (=ctrlaltdelete). cd still eaten
ok, this machine is stuck, I'll reboot and blog on then...
oh! rebooting doesn't work because the f*** cd driver seems to be hacked directly into the kernel or however they did it, so I had to hard-reset the machine (basically, pulling the power cord)

So, after rebooting I wisely

close iTunes
put in the CD
wait till this event of putting the CD in the slot starts iTunes, then
iTunes asks me if I want to rip this CD, which I want
and then it works

great! everything's so easy today.

R U serious, I have to ask - or did you intentionally f*** it up so that I have to buy the music at the iTunes music store? (which is ok, but then I suddenly think about Audiograbber and Winamp...)

leobard - 23. Oct, 20:01


Friday, 20. October 2006

iconography gone astray

What's that icon about?

The one to the right. You surely know, or have a great idea what courageous people will find at the end of this road...

click picture, comment there!

leobard - 20. Oct, 00:11


Thursday, 19. October 2006

Talking about Semantic Desktop at ZGDV's Congress

Today I gave a talk in Darmstadt's ZGDV Institute, at the 3rd Semantic Web Congress. Hugo Kopanitsak organizes these events and managed to get an interesting round of speakers for this event.

Update: the slides are available for download here; Benjamin Nowack inspired me to put them online, thx.

Here is the homepage:
http://www.zgdv.de/zgdv/zgdv/Seminar/Darmstadt/Kongresse/3_SemWeb

I gave a talk about Semantic Desktop, and as I was the last speaker, I tried to keep it short because the previous speakers had together accumulated quite a few minutes of delay.

The audience was filled with people from industry and government, hungry for Semantic Web. Here are two pictures of my audience:

And here is Hans-Peter Schnurr from Ontoprise, a picture I had to "gimp" up a little (a coffee cup was in the lower left and the light had to be corrected for the projector vs Hans-Peter; luckily Sven Schwarz taught me how to do this on The Great Escape :-).

And Benjamin Nowack

Benjamin took more pics of my talk with his digicam, we will probably see them soon.

leobard - 19. Oct, 23:10


Wednesday, 18. October 2006

gave a talk on Semantic Desktop for e-learning, and another tomorrow

Yesterday I gave a talk on Semantic Desktop in e-learning scenarios,
at the "e-learning day der TU Kaiserslautern".

It was a short presentation and a little demo, and although I have a cold, Martin Memmel said it was a good talk. That was a nice thing to hear, because I never know if my talks are good or not. What kind of quality function can you use anyway?

He also took this photo of me:

Tomorrow I will give a talk on Semantic Desktop as such at a Semantic Web Congress at Darmstadt's ZGDV, and I am looking forward to doing this because the other presenters are quite famous. One hacker you might know is Benjamin Nowack, others are CEOs of SemWeb companies in Germany like Hans-Peter Schnurr or Holger Rath, and there are many interesting speakers about applied Semantic Web.

http://www.zgdv.de/zgdv/zgdv/Seminar/Darmstadt/Kongresse/3_SemWeb

leobard - 18. Oct, 16:12


Semantic Web Client Library

Recently I ran into the problem of embedding "dynamic" data from the semantic web into the semantic desktop, namely data that cannot be crawled efficiently.

Also, to annotate web resources in gnowsis, it is good to know as much about them as possible. A key to this vision is to respect the current best practices of publishing RDF data. Luckily Tim Berners-Lee has concentrated them all together in Tabulator.

And we can use this by building on a library that the witty Chris Bizer, Tobias Gauß, and Richard Cyganiak wrote:

The Semantic Web Client Library

sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/semwebclient/

The Semantic Web Client Library represents the complete Semantic Web as a single RDF graph. The library enables applications to query this global graph using SPARQL- and find(SPO) queries. To answer queries, the library dynamically retrieves information from the Semantic Web by dereferencing HTTP URIs and by following rdfs:seeAlso links. The library is written in Java and is based on the Jena framework.
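
To make the idea of "dereference, then follow rdfs:seeAlso" a bit more tangible, here is a minimal sketch in Python. It is not the library's API (the library is Java and I am not reproducing its calls here); the toy in-memory "web", the example.org URIs and the function names are all made up for illustration.

```python
from collections import deque

RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical stand-in for the web: URI -> triples published at that URI.
TOY_WEB = {
    "http://example.org/alice": [
        ("http://example.org/alice", FOAF + "name", "Alice"),
        ("http://example.org/alice", RDFS_SEE_ALSO, "http://example.org/alice/more"),
    ],
    "http://example.org/alice/more": [
        ("http://example.org/alice", FOAF + "knows", "http://example.org/bob"),
    ],
    "http://example.org/bob": [
        ("http://example.org/bob", FOAF + "name", "Bob"),
    ],
}


def dereference(uri):
    """Pretend HTTP GET: return the triples published at uri, or nothing."""
    return TOY_WEB.get(uri, [])


def retrieve(start_uri, max_documents=10):
    """Collect triples by dereferencing start_uri and following rdfs:seeAlso."""
    graph = set()
    seen = set()
    queue = deque([start_uri])
    while queue and len(seen) < max_documents:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.add(uri)
        for s, p, o in dereference(uri):
            graph.add((s, p, o))
            if p == RDFS_SEE_ALSO:
                queue.append(o)  # only seeAlso links are followed in this sketch
    return graph


def find(graph, s=None, p=None, o=None):
    """find(SPO)-style lookup where None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )
```

Calling `retrieve("http://example.org/alice")` pulls in Alice's document and the one it points to via rdfs:seeAlso, but not Bob's, because this sketch only follows seeAlso links and caps the number of documents it fetches.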

leobard - 18. Oct, 13:46


Tuesday, 17. October 2006

ms dewey

a useless search engine that looks pretty.

http://www.msdewey.com/

then again, I could just go to search for "axe feather".... and then three girls require my attention, Ms Dewey, the feather and the lady that beats me if I don't close the laptop pretty soon.

leobard - 17. Oct, 23:50


PhD step1: integrating data into the semantic desktop

I will be blogging about my Semantic Web PhD for the next months, until I am finished. First, you learn what I did and do, and perhaps you can copy something for your own thesis or point me to information I missed. Critique, positive and negative, is warmly welcome.

The first part of my dissertation will be about integrating data into the semantic desktop. The problem at hand is that we face data from different sources (files, e-mail, websites) and in different formats (pdf, e-mails, jpg, perhaps some RDF in a foaf file, or an iCalendar file), and these can change frequently or never. Faced with all this lovely data that can be of use to the user, we are eager to represent it as RDF. There are two main problems when transforming data to RDF:

find an RDF representation for the data (RDF(S), or OWL vocabulary)
find a URI identifying the data

I have found that the second question is far harder to solve. While it is quite easy to find an RDF(S) vocabulary for e-mails, MP3s, or People (and if you don't find one on schemaweb.info, btw the only website I never had to bookmark because it's so obvious, you make up the vocab yourself), finding the correct URI to identify the resource can be a longer task.

The trickiest thing is identifying files or things crawled from a bigger database like the Thunderbird address book. For files, there are some possibilities; all of them have been used by me or others.

You can skip this section about typical URIs for files, it's just an example of what implications the URI may have.

file://c:/document/leobard/documents/myfile.txt this is the easiest way, because it is conformant with all other desktop apps. Firefox will open this url, Java knows it, it's good. The problems are: what if you move the file? You lose the annotations, which can be fixed. Second, the URI is not world-unique. Two people can have the same file at the same place. Also, it is not possible to use this URI in the semantic web at large, because the server part is missing.
http://desktop.leobard.net/~leobard/data/myfile.txt Assume you have an HTTP daemon running on your machine, like Apple's OS X does, and assume you have the domain name leobard.net and register your desktop at the DNS entry desktop.leobard.net, then you could host your files at this address. Using access rights, you could block all access to the files, but still open some for friends. Great. But people usually don't run HTTP servers on their machines, nor do they own namespaces, and their desktops are usually not reachable on public IP addresses but sit behind NAT.
urn:file-id:243234-234234-2342342-234. Semantic Web researchers love this one. You use a hash or something else to identify the file, and then keep a link from the URI to the real location. Systems like kspaces.net used this scheme. It is ok for identifying files, but loses all the nicety of URLs, which can actually locate the file as well.
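
Here is a rough sketch of what the third option means in practice, assuming the "hash or something else" is a SHA-1 over the file content. The `urn:file-id:` prefix is just the example from above, not a registered scheme, and `content_urn`, `LocationIndex` and `demo` are names I made up for illustration.

```python
import hashlib
import os
import tempfile


def content_urn(path):
    """Build a URN from the SHA-1 of the file's bytes (illustrative scheme)."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return "urn:file-id:" + digest.hexdigest()


class LocationIndex:
    """Keeps the link from the URN to the place where the file currently lives."""

    def __init__(self):
        self._locations = {}

    def register(self, path):
        urn = content_urn(path)
        self._locations[urn] = os.path.abspath(path)
        return urn

    def locate(self, urn):
        return self._locations.get(urn)


def demo():
    """Move a file: the URN stays the same, only the location entry changes."""
    with tempfile.TemporaryDirectory() as folder:
        old = os.path.join(folder, "myfile.txt")
        new = os.path.join(folder, "renamed.txt")
        with open(old, "wb") as handle:
            handle.write(b"some content")
        index = LocationIndex()
        urn = index.register(old)
        os.rename(old, new)
        same_urn = index.register(new) == urn
        found_at_new_place = index.locate(urn) == os.path.abspath(new)
        return urn, same_urn, found_at_new_place
```

The sketch also shows the price: the URN alone tells you nothing about where the file is, so somebody has to maintain the index. And with a content hash, editing the file changes its identifier, which is probably why "or something else" is sometimes the better choice.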

So, after this excursion we know that it's not straightforward to identify files with a URI. We tried the first two approaches, but I am not happy with them; perhaps I will blog the latest findings regarding URIs sometime.

On with metadata integration. Four years ago I needed a way to extract metadata from MP3s, Microsoft Outlook and other files, so I created something called "File Adapters". They worked very elegantly: you post a query for " ?x" and get the answer "Numb". This was done by analysing the subject URI (file://...) and then invoking the right adapter. The adapter looked at the predicate and extracted only that, very neat.

BUT after two years, around 2004, I realised that I needed an index of all data anyway to do cool SPARQL queries, because the question "?x mp3:artist 'U2'" was not possible. For such queries you need a central index like Google Desktop or Mac's Quicksilver (ahh, I mean Spotlight) has. The adapters are still usable for this, because they can extract the triples bit by bit. But if you crawl anyway, you can simplify the whole thing drastically. That's what we found out the hard way by implementing it: interested students who helped out had many problems with the complicated adapter stuff, but were quite quick at writing crawlers. We wrote this up in a paper called "Leo Sauermann, Sven Schwarz: Gnowsis Adapter Framework: Treating Structured Data Sources as Virtual RDF Graphs. In Proceedings of the ISWC 2005." (bibtex here).

Shortly after finishing this paper (May 2005?), I came to the conclusion that writing these crawlers is a problem that many other people have, so I asked the people from x-friend if they wanted to do this together with me, but they didn't answer. I then contacted the Aduna people, who do Autofocus, and, even better for us, they agreed to cooperate on writing adapters and suggested calling the project Aperture. We looked at what we had done before and then merged our approaches, basically using the code Aduna had before and putting it into Aperture.

What we have now is an experiment that showed me that accessing the data live was slower and more complicated than using an index, and the easiest way to fill the index is crawling.
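
The difference between the two styles fits in a few lines. This is only a toy sketch, not the gnowsis or Aperture code: the file table, the `mp3:title` predicate and all function names are assumptions for illustration (only `mp3:artist` and the answer "Numb" come from the text above).

```python
from collections import defaultdict

# Hypothetical desktop: file URI -> metadata an adapter could read from the file.
FILES = {
    "file:///music/a.mp3": {"mp3:artist": "U2", "mp3:title": "Numb"},
    "file:///music/b.mp3": {"mp3:artist": "U2", "mp3:title": "Track B"},
    "file:///music/c.mp3": {"mp3:artist": "Dendemann", "mp3:title": "Track C"},
}


def mp3_adapter(uri, predicate):
    """Live access: extract exactly one property of one known file."""
    return FILES[uri].get(predicate)


ADAPTERS = {".mp3": mp3_adapter}


def query_live(subject_uri, predicate):
    """Answer (subject, predicate, ?x) by choosing an adapter from the URI."""
    for suffix, adapter in ADAPTERS.items():
        if subject_uri.endswith(suffix):
            return adapter(subject_uri, predicate)
    return None


def crawl():
    """Visit every file once and yield all of its triples."""
    for uri, metadata in FILES.items():
        for predicate, value in metadata.items():
            yield (uri, predicate, value)


def build_index(triples):
    """Index by (predicate, object) so that subjects can be looked up."""
    index = defaultdict(set)
    for s, p, o in triples:
        index[(p, o)].add(s)
    return index


def query_index(index, predicate, obj):
    """Answer (?x, predicate, object), which the live adapters cannot do."""
    return sorted(index.get((predicate, obj), set()))
```

`query_live("file:///music/a.mp3", "mp3:title")` works because the subject URI tells us which adapter to call. For "?x mp3:artist 'U2'" there is no subject to start from, so you need `build_index(crawl())` first and then `query_index(index, "mp3:artist", "U2")`.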

The problem that is still unsolved is that the Semantic Web is not designed to be crawled. It should consist of distributed information sources that are accessed through technologies like SPARQL. So, at some point in the future we will have to rethink what information should be crawled and what not, because it is already available through a SPARQL endpoint. And then, it is going to be tricky to distribute the SPARQL queries amongst many such endpoints, but that will surely be solved by our Semantic Web services friends.

leobard - 17. Oct, 21:03

5 comments

Bruce D'Arcus (guest) - 17. Oct, 23:06

identifying files

Richard Cyganiak (guest) - 18. Oct, 14:17

Magnet URIs

The hash-based URN thing reminded me of Magnet URIs.

Bernhard (guest) - 23. Oct, 10:06

it does not matter

I tend to prefer the third option you posted. I believe that it is important to strictly separate the identification (i.e. the URI) and all descriptive metadata. The question of mapping this URI to "real" filenames can be formulated again as RDF metadata by formulating statements like e.g. "urn:file-id:blahblah mapping:incarnates http://myhost/some/file" stating that this file can be retrieved via HTTP.

leobard - 29. Oct, 12:49

and it does matter

Your proposed solution doesn't work because you cannot automatically detect where to find the resources that are the identifiers for the HTTP file:

* how to identify http://myhost/some/file - how to get from the HTTP URI to the URI urn:file-id:blahblah?

so, Bernhard, welcome to the URI crisis.

george22 - 1. Dec, 08:09

I was interested in how to identify files too, as part of my work on the OpenDocument metadata subcommittee. One of the nice things I was thinking about when using a urn (say uuid) was that you could then do cool stuff like have editors automatically add properties that allow tracking of isVersionOf and isPartOf relations.
