semantic weltbild 2.0: 30 November 2005

In this article I am going to point out how the experience gained with Piggy-Bank and gnowsis could fit together, and how all this code could be made compatible to build a better Semantic Desktop and a new semantic experience. First I will explain what inspired this article and give some technical details, then I will describe some plans for the future.

I met Eric Miller from the SIMILE team at ISWC and he suggested that Libby, I and others install Piggy-Bank. That surprised me, as I had been looking at this piece of software for a long time but had never dived deep into it.

So, when I first had a look at it, I found many interesting points. Then at ISWC we tried to annotate "this talk had a bad typo in the powerpoints": I wanted to make a comment or enter this information somehow, but not as a tag. As Piggy-Bank does not support easy entry of arbitrary RDF, I first blogged the typo and then combined the blog post URI with some RDF from a file, and it worked. So the starting point is that Piggy-Bank is primarily for scraping RDF from web pages and browsing this RDF. Editing options are limited, and even Eric and I had to do some mind-bending things before I reached my goal: stating the funny typo in RDF.

Over the last few weeks I exchanged mail with Stefano Mazzocchi from the SIMILE team and also Apache, talking primarily about Aperture. Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems (file systems, web sites, mail boxes, ...) and the file formats (documents, images, ...) occurring in these systems.

So, editing any RDF is something we have tried to do again and again in gnowsis, and we have two major approaches to this task: first the linker, a user interface that allows linking two resources with each other, and second Enquire2, a user interface with more editing options (adding wiki text, editing any RDF using the ThingDialog).

I have now taken a deeper look at Piggy-Bank and started thinking about how we can join Piggy-Bank and gnowsis, so I decided to publish some of my thoughts, following the good principle that I try to keep in mind: "Building the Semantic Web is easier together".

So, some facts:

Piggy-Bank offers great options to easily browse bigger amounts of RDF
Piggy-Bank is separated into several sub-projects, like longwell for browsing.
Piggy-Bank team wants to extract more information from PDFs, etc.
Piggy-Bank combines Java with Mozilla
gnowsis extracts data from Mozilla.
gnowsis has some plugins for Mozilla Firefox and Thunderbird
Gnowsis is written in Java against Jena
Aperture is written in Java against Sesame (and planned for Jena)
Gnowsis should move to Sesame2 as data store; I already entered a ticket for this
Piggy-Bank uses Velocity for displaying its information as HTML. This is a nice framework anyway, but fattens the distribution by a few kilobytes (no problem).
Both use Jetty for their internal web servers.
Piggy-Bank stores its triples in your Firefox profile dir. There you find your native-SAIL Sesame data. Example on my Mac: ~/Library/Application Support/Firefox/Profiles/xxxyyy.default/piggy-bank/my-piggy-bank
Gnowsis stores its data in ~/.gnowsis/data using Jena

Gnowsis and Piggy-Bank are - from a high-level view - close to each other and in principle compatible. Both are based on RDF and Java, are licensed under BSD and integrate with Mozilla. The main difference is that gnowsis is programmed against Jena while Piggy-Bank keeps its data in Sesame, but that's not a problem in principle.

So what are the possible benefits and pay-offs? Some wild ideas:

Piggybank may profit from the thingdialog and other editing functions of gnowsis.
piggy-bank and gnowsis can both integrate aperture to extract RDF from data sources and applications like address books
gnowsis could grab much piggybank code for page-scraping
gnowsis could use longwell as RDF browsing thing
both could move on to the long-sought "Semantic Bus" architecture (more about this below). Basically, they would share their data stores.
piggy-bank has a nice development tutorial that gives much info on how it works inside.
gnowsis has a development wiki that shows how to get started with gnowsis, and some outdated documentation that is still interesting.
gnowsis documentation could be improved

But hey - what is it worth taking code from Piggy-Bank and trying to make it run in gnowsis (and vice versa) when both projects could live together in parallel? Wouldn't it be the right thing to make connections between semantic applications, making them all talk to each other? Together they would provide a semantic desktop architecture that is open and based on collaboration and web communication: distributed applications working hand in hand for a better user experience.

The Semantic Bus
So to bring these two desktop applications closer to each other, they could both use a Semantic Bus, a buzzword also known as Semantic Desktop or seen on some TimBL slides recently.
Basically, a Semantic Bus is a system that connects several services running on one machine: database, GUI, data sources and adapters, web browser and applications are all connected via this bus and use snippets of RDF and some kind of RESTful API to talk to each other.

The bus is a fabled thing that will probably take a little longer to materialize, but it's part of our upcoming NEPOMUK plans anyway, and we will try to implement something and then standardize and align with others. The SIMILE project is aiming at the same ideas, and Patrick Stickler also had an idea in the direction of a semantic desktop integration bus.

For the piggy-bank VS gnowsis question, the Semantic Bus is important for Storage and Adapters (Aperture), and I think we have to move to a more federated architecture here.

Personal Semantic Storage
At the moment, the storage of Piggy-Bank is far from what we want: it's hidden inside an obscure directory and can only be accessed when Firefox runs. That's bad. It's the same with gnowsis: it's slow because of Jena and its store is available only when gnowsis runs (which is often, as gnowsis is designed as a server).

For our mutual goals, we would have to agree on a kind of neutral "personal sesame" edition and installer that we ship with both Piggy-Bank and gnowsis0.9. It can be a requirement for both and is a kind of "basic" architecture. And it is the first cornerstone of a Semantic Bus.

So, Personal Semantic Storage using Sesame2 as server in a small Jetty bag would serve great as first cornerstone of the Personal Semantic Desktop / Semantic Bus. *yeah* *buzzwordsgalore*

Why is this plan so good? Because it speaks the language of two major Semantic Web projects - piggy-bank and gnowsis. They both say that Sesame is a useful storage. When we keep talking about Sesame we can also say that it contains a RESTful web server and a simple web-API for it. All running stand-alone inside Jetty, if you want.

So, if piggy-bank and gnowsis could be configured to use an external personal semantic storage - sesame2Personal - we have begun.
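
A rough sketch of what "configured to use an external personal semantic storage" could mean in practice: each application checks for a shared store first and falls back to its own embedded data directory otherwise. Everything here is hypothetical, including the settings keys, the server URL, the port and the `/repositories/<name>` layout; it is not how either project is configured today.

```python
# Hypothetical configuration lookup; the keys, the URL layout and the
# function name are assumptions made for this illustration only.
from urllib.parse import quote


def resolve_store(settings, app_name):
    """Pick the store an application should use.

    Returns ("remote", url) when an external personal store is configured,
    otherwise ("embedded", path) pointing at the application's own data dir.
    """
    server = settings.get("personal_store_url")
    if server:
        # One named repository per application on the shared server.
        repository = quote(app_name, safe="")
        return ("remote", server.rstrip("/") + "/repositories/" + repository)
    default_dir = "~/." + app_name + "/data"
    return ("embedded", settings.get("local_data_dir", default_dir))


shared = {"personal_store_url": "http://localhost:8080/sesame2Personal/"}
standalone = {}

gnowsis_store = resolve_store(shared, "gnowsis")
piggy_store = resolve_store(shared, "piggy-bank")
fallback = resolve_store(standalone, "gnowsis")
```

With the shared settings both applications end up on the same server, each with its own repository name; without them, gnowsis falls back to a local directory such as the ~/.gnowsis/data it uses today.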

Problems
From my perspective, what are the problems we face here?

Sesame2 moves on to Java 1.5. The installation of Piggy-Bank is quite tricky already; under Mac OS X, dealing with Java 1.5 will be interesting, but solvable. If Piggy-Bank wants to use Aperture and Sesame2, they will probably have to move to Java 1.5. Gnowsis has these plans too, and enough problems of its own.
Can the developers of both projects get together with Aduna and design a Personal Semantic Database as first cornerstone of the Personal Semantic Desktop? The task is to find a common standard for how to deploy RDF applications on desktop computers and how to name those repositories: where to put Piggy-Bank treasures, where to put gnowsis links, where to store all this buffered RDF from Aperture?
Adapters to local data sources. The Aperture project gives easy access to local data stores and is under development at the moment. Can this be the project to integrate various local and remote data sources into the storage?

The Vision for a Semantic Desktop
So - what is the vision of a Semantic Desktop that emerges here? You install your personal Sesame2 server. You install Aperture and gnowsis, configure some data sources and let it crawl. You can now use Piggy-Bank and open its faceted browsing interface on these vast data sources. Without storing anything from the web, your local gnowsis already contains around 400,000 triples, extracted from your own files, emails, address books, BibTeX files, etc. These desktop triples are ripe for Piggy-Bank or Autofocus.

You gather new data from the web using Piggy-Bank and its page scrapers written in JavaScript. Here you can do all the nice things of Piggy-Bank. Store weblogs' RSS feeds. Tag all data related to a friend of yours. See the last hiking trip on Google Maps via Piggy-Bank. You view the stored info from your Piggy-Bank data using gnowsis and have the possibility to edit each literal or value. You create new instances of any class in gnowsis and store them in your personal semantic web store. For example, you want to create a new recipe in the style of epicurious. You have already piggy-banked some epicurious recipes from the web, and when you click on "new recipe" in Enquire2006 (the upcoming user interface of gnowsis0.9) you can create such a recipe for yourself. So, based on the existing recipes (a kind of ontology) you make up your new recipe and store it. Done. At the end, using Piggy-Bank, you could upload your recipe to a public Piggy-Bank and others can facet-search it there, tag it and be happy.

Also Patrick Stickler's URIQA approach would fit into this vision. Any application could use the personal semantic storage to quickly create concise bounded descriptions, or use Aperture to get fresh data from various sources.

2006 will be the year in which we get closer to contact - contact with the original use case of the Semantic Web: sharing cooking recipes!

Summary and Outlook
I had a look at two current Semantic Web projects, gnowsis and Piggy-Bank, and played with ideas on how they might mix together. Gnowsis and Piggy-Bank have distinct features and distinct use cases. But they both rely on storage like Sesame, and next year both will probably be based on Sesame2 native storage and Lucene for indexing. So the two projects could integrate very well if they both contact a Personal Semantic Desktop Data Server, a Sesame2 thing running in the background (I have to find a name for that).

Piggy-Bank has perfected the integration of data from web sites, and gnowsis has a tradition of contacting local sources like IMAP or Microsoft Outlook. If both applications could agree on a common Sesame2 storage in the middle, with a little framework around it for configuration etc., the Semantic Desktop at large would come nearer. Both might suffer reduced performance (from adding another communication layer in the middle), but this can be overcome by pushing some of the Velocity work directly to the server. I also wrote a little vision of how using Piggy-Bank and gnowsis could look at the end of 2006.

leobard - 30. Nov, 21:40


Steve Dunham (guest) - 2. Dec, 20:29

I'm currently using piggy bank / semantic bank to store bookmarks, contacts, and recipes. I've written an epicurious scraper and a wikibooks recipe scraper, with more to come in the near future. (I also need to add GRDDL to a rails/postgres-based recipe application that I'm developing.)

I use piggy bank to collect data, and I use a copy of Semantic Bank on a public IP as my Personal Semantic Storage. My web browser on each machine I use at home and work is linked to a semantic bank account on that server.

When I find data on the web that I want to keep, I "persist" it to the server. If I want to bookmark a page, I press "\", enter some keywords, and tell it to publish to the server.

Functionality I'd like to see in a semantic desktop application:

Ability to edit data.
Ability to enter new data (Contacts, projects, and add home pages/blogs/nicks for existing contacts).
Ability to smush contacts.
Better support for calendaring.
Support for scuttering (fetch seeAlso's, regularly fetch rss feeds, rescraping software project pages).
Ability to gather data from my web browser, sync with my address book/calendar, gather data from email.
- It'd be nice if the web browser/email app could say: this looks like a phone number, address, etc. and let you piece together a contact from that information. Or pull dates out of an airline confirmation page and let you piece together a calendar item.
- I'd eventually like to write some code that uses text classification to deconstruct any recipe on any page. (figure out what looks like an ingredient list, what looks like instructions, and let the user fixup the result.)
Have a firefox sidebar with a summary of the semantic information in the current page and a button to save the data to your personal store.

Data I want to keep in my personal semantic storage:

Recipes
Contact Info
Calendar information
Free software project information (download page, repository location, bug tracking system).
Catalog of books, cds, movies I own or have seen/read.
Relations between contacts. (I've written RDF export for OSX's address book. I'd like to upload it to my bank, and then add relationship information).
Restaurant information. (Reviews, location, etc.)
Wine/cheese/beer/etc. tasting information.
Annotations for my photo album.

I haven't looked at gnowsis yet. I got as far as the "windows" thing, and moved on. OSX and Linux tend to be my primary desktops. I'll try to get it running on OSX, if I can find time.


sparqling the opera community

as suggested on the swig scratchpad, we should search for examples.

ok, what we have is a SPARQL-conformant query interface to 2 million triples that come from a community website (users, forums, photos, etc.)

I took a few blind shots, not very amusing. By mistake, I once did a full table scan on all instances (* type *) and got a 100 MB file with URIs to play with. DESCRIBE does not work for any of the URIs (argh, why isn't CBD in SPARQL yet).

so, this for example doesn't work:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
DESCRIBE <http://my.opera.com/butchevans/homes/albums/10149/PA140033.JPG>

typical uris and types I found:
http://my.opera.com/_1/xml/foaf#_1
http://my.opera.com/Fietsy/xml/foaf#Fietsy
- these are of type foaf:Person

http://my.opera.com/masterpdc/homes/albums/2158/IMG_1557.JPG
http://my.opera.com/jessegoodwin/homes/albums/2156/DSCN0322.JPG
- these are of type foaf:Image

http://my.opera.com/shawncl/albums/show.dml?id=11201
this has got type gallery:
http://my.opera.com/community/xmlns/2005/gallery#Gallery

these are blogs:
http://my.opera.com/matthewserg/blog/
type: http://purl.org/dc/elements/1.1/Collection

a document:
http://my.opera.com/olli/blog/show.dml/2235
type: http://xmlns.com/foaf/0.1/Document

groups:
http://my.opera.com/groops/xml/foaf#groops
http://my.opera.com/iran%20sayeha/xml/foaf#iran_sayeha
type: http://xmlns.com/foaf/0.1/Group

ok, looking at the opera community site we see that there is a guy named Words who is the member of the week, meaning he is very active. So sparqling for Words looks like this:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?p ?o WHERE
{<http://my.opera.com/Words/xml/foaf#Words> ?p ?o.}

Well, this brought us his foaf:knows relations and his weblog URI. cool. Let's make a backlink search:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?s ?p WHERE
{?s ?p <http://my.opera.com/Words/xml/foaf#Words>.}

this brought some pictures he created, connected via dc:creator uri: http://purl.org/dc/elements/1.1/creator.
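
For anyone who wants to script these two lookups, here is a small sketch that builds the forward and the backlink query for any URI and packs each one into a GET request URL. The `query` parameter is what the SPARQL protocol defines for sending a query over GET; the endpoint address below is only a placeholder, since it has to be replaced by the real query interface, and the function names are made up for this example.

```python
from urllib.parse import urlencode

# Placeholder endpoint: replace it with the address of the real query interface.
ENDPOINT = "http://example.org/sparql"


def _check_uri(uri):
    """Reject strings that would break out of the <...> brackets."""
    if any(ch in uri for ch in "<> \t\n\""):
        raise ValueError("not a plain URI: %r" % uri)
    return uri


def outgoing_query(uri):
    """All properties and values of the resource (the forward lookup)."""
    return "SELECT ?p ?o WHERE { <%s> ?p ?o . }" % _check_uri(uri)


def incoming_query(uri):
    """All resources pointing at the resource (the backlink search)."""
    return "SELECT ?s ?p WHERE { ?s ?p <%s> . }" % _check_uri(uri)


def request_url(endpoint, query):
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


words = "http://my.opera.com/Words/xml/foaf#Words"
forward_url = request_url(ENDPOINT, outgoing_query(words))
backlink_url = request_url(ENDPOINT, incoming_query(words))
```

Fetching both URLs and merging the results gives a poor man's description of a resource while DESCRIBE is not available. The PREFIX lines are left out because neither query uses a prefixed name.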

and much more....

ok, this should be enough to get people on track to build GUIs and web 2.0 stuff on top of the opera community.

thanks to opera: the users' data is the users' data now, not theirs. cool move.

leobard - 30. Nov, 13:43


Kjetil Kjernsmo (guest) - 8. Dec, 15:10

dc:creator -> foaf:maker

Hey leobard and thanks for testing and the nice examples. Just a quick note: I'm rebuilding the model now to use foaf:maker instead of dc:creator, since the FOAF spec suggests that dc:creator should be used for literal names, whereas foaf:maker should be used for references to foaf:Person records, which is what we have in this case.
