piggybank VS gnowsis - towards a better Semantic Desktop
In this article I point out how the experience gained with Piggy-Bank and gnowsis could fit together, and how all this code could be made compatible, moving towards a better Semantic Desktop and a new semantic experience. First I describe what inspired this article and give some technical details, then I describe some plans for the future.
I met Eric Miller from the SIMILE team at ISWC, and he suggested that Libby, I and others install Piggy-Bank. This surprised me, as I had been looking at this piece of software for a long time but had never dived deep into it.
So, when I first had a look at it, I found many interesting points. Then at ISWC we tried to annotate "this talk had a bad typo in the powerpoints", and I wanted to somehow make a comment or enter this information, but not as a tag. As Piggy-Bank does not support easy entering of arbitrary RDF, I first blogged the typo and then combined the blog post URI with some RDF from a file, and it worked. So the starting point is that Piggy-Bank is primarily for scraping RDF from web pages and browsing this RDF. Editing options are limited, and even Eric and I had to do some mind-bending things before I could reach my goal - stating the funny typo in RDF.
Over the last few weeks I exchanged emails with Stefano Mazzocchi, from the SIMILE team and also Apache, talking primarily about Aperture. Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems (file systems, web sites, mail boxes, ...) and the file formats (documents, images, ...) occurring in these systems.
Editing arbitrary RDF is something we have tried to do again and again in gnowsis, and we have two major approaches to this task: first, the linker, a user interface that allows linking two resources with each other, and second, Enquire2, a user interface with more editing options (adding wiki text, editing any RDF using the ThingDialog).
I have now taken a deeper look at Piggy-Bank, started thinking about how we can join Piggy-Bank and gnowsis, and decided to publish some of my thoughts, following the good principle that I try to keep in mind: "Building the Semantic Web is easier together".
So, some facts:
- Piggy-Bank offers great options for easily browsing larger amounts of RDF.
- Piggy-Bank is separated into several sub-projects, like Longwell for browsing.
- The Piggy-Bank team wants to extract more information from PDFs, etc.
- Piggy-Bank combines Java with Mozilla.
- gnowsis extracts data from Mozilla.
- gnowsis has some plugins for Mozilla Firefox and Thunderbird.
- gnowsis is written in Java against Jena.
- Aperture is written in Java against Sesame (and planned for Jena).
- gnowsis should move to Sesame2 as its data store; I have already entered a ticket for this.
- Piggy-Bank uses Velocity to display its information as HTML. This is a nice framework anyway, but it fattens the distribution by a few kilobytes (no problem).
- Both use Jetty for their internal web servers.
- Piggy-Bank stores its triples in your Firefox profile directory. There you find your native-SAIL Sesame data. Example on my Mac: ~/Library/Application Support/Firefox/Profiles/xxxyyy.default/piggy-bank/my-piggy-bank
- gnowsis stores its data in ~/.gnowsis/data using Jena.
So what are the possible benefits, pay-offs and wild ideas:
- Piggy-Bank may profit from the ThingDialog and other editing functions of gnowsis.
- Piggy-Bank and gnowsis can both integrate Aperture to extract RDF from data sources and applications like address books.
- gnowsis could grab much Piggy-Bank code for page scraping.
- gnowsis could use Longwell as its RDF browser.
- Both could move on to the long-sought "Semantic Bus" architecture, more about this below. Basically, they would share their data stores.
- Piggy-Bank has a nice development tutorial that gives much information on how it works inside.
- gnowsis has a development wiki that shows how to get started with gnowsis, and some outdated documentation that is still interesting.
- The gnowsis documentation could be improved.
The Semantic Bus
So to bring these two desktop applications nearer to each other, they could both use a Semantic Bus, a buzzword also known as Semantic Desktop, or seen on some TimBL slides recently.
Basically, a Semantic Bus is a system that connects several services running on one machine: database, GUI, data sources and adapters, web browser and applications are all connected via this Semantic Bus and use snippets of RDF and some kind of RESTful API to talk to each other.
The bus is a fabled thing that will probably take a little longer to materialize, but it is part of our upcoming NEPOMUK plans anyway, and we will try to implement something and then standardize and align with others. The SIMILE project is also aiming at the same ideas, and Patrick Stickler also had an idea in the direction of a semantic desktop integration bus.
For the Piggy-Bank vs. gnowsis question, the Semantic Bus is important for storage and adapters (Aperture), and I think we have to move to a more federated architecture here.
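To make the bus idea a bit more tangible, here is a minimal sketch in plain Java. It is only an in-process stand-in: the bus described above would talk over some RESTful API, and nothing like this exists in Piggy-Bank, gnowsis or NEPOMUK. The class name `SemanticBusSketch`, the `Triple` record, the topic names and the `urn:example:` identifiers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical in-process stand-in for a Semantic Bus; all names are illustrative. */
class SemanticBusSketch {

    /** One RDF statement, kept as plain strings to avoid depending on Jena or Sesame. */
    record Triple(String subject, String predicate, String object) {}

    /** Services subscribed per topic, e.g. "storage" or "adapters" (assumed topic names). */
    private final Map<String, List<Consumer<List<Triple>>>> subscribers = new LinkedHashMap<>();

    /** Register a service that wants to receive RDF snippets sent to a topic. */
    void subscribe(String topic, Consumer<List<Triple>> service) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(service);
    }

    /** Deliver a snippet of RDF to every service on the topic; returns how many received it. */
    int publish(String topic, List<Triple> snippet) {
        List<Consumer<List<Triple>>> services = subscribers.getOrDefault(topic, List.of());
        for (Consumer<List<Triple>> service : services) {
            service.accept(snippet);
        }
        return services.size();
    }

    public static void main(String[] args) {
        SemanticBusSketch bus = new SemanticBusSketch();

        // A storage service that simply collects whatever arrives on the bus.
        List<Triple> store = new ArrayList<>();
        bus.subscribe("storage", store::addAll);

        // Any application (browser plugin, editor, crawler) can now send a snippet.
        bus.publish("storage", List.of(
                new Triple("urn:example:talk", "urn:example:comment", "\"funny typo\"")));
        System.out.println(store.size() + " triple(s) stored");
    }
}
```

The point of the sketch is only the shape of the interaction: the sender does not know which service stores the snippet, and more services (an indexer, an editor) could subscribe to the same topic without the sender changing.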
Personal Semantic Storage
At the moment, the storage of Piggy-Bank is far from what we want: it is hidden inside an obscure directory and can only be accessed while Firefox runs. That's bad. The same goes for gnowsis: it is slow because of Jena, and its storage is available only while gnowsis runs (which is often, as gnowsis is designed as a server).
For our mutual goals, we would have to decide on a kind of neutral "personal Sesame" edition and installer that we ship with both Piggy-Bank and gnowsis 0.9. It can be a requirement for both and is a kind of "basic" architecture. And it is the first cornerstone of a Semantic Bus.
So, Personal Semantic Storage using Sesame2 as a server in a small Jetty bag would serve well as the first cornerstone of the Personal Semantic Desktop / Semantic Bus. *yeah* *buzzwordsgalore*
Why is this plan so good? Because it speaks the language of two major Semantic Web projects - Piggy-Bank and gnowsis. They both say that Sesame is a useful store. While we are talking about Sesame, we can also say that it contains a RESTful web server and a simple web API for it. All running stand-alone inside Jetty, if you want.
So, if Piggy-Bank and gnowsis could be configured to use an external personal semantic storage - sesame2Personal - we would have made a start.
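What "configured to use an external storage" could mean in practice is sketched below in plain Java. This is an assumption-laden illustration, not how either project is configured: the property name `semanticdesktop.store.url`, the class `StoreLocatorSketch`, the placeholder URL and the fallback labels are all invented here.

```java
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper that decides which store an application should talk to. */
class StoreLocatorSketch {

    /** Assumed property name; neither Piggy-Bank nor gnowsis defines it. */
    static final String SHARED_STORE_KEY = "semanticdesktop.store.url";

    /** Where an application keeps its triples, and whether that place is shared. */
    record StoreLocation(String location, boolean shared) {}

    /**
     * Prefer the shared personal store when one is configured; otherwise
     * fall back to the application's own embedded store.
     */
    static StoreLocation locate(Properties config, String embeddedFallback) {
        return Optional.ofNullable(config.getProperty(SHARED_STORE_KEY))
                .map(String::trim)
                .filter(url -> !url.isEmpty())
                .map(url -> new StoreLocation(url, true))
                .orElseGet(() -> new StoreLocation(embeddedFallback, false));
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        // Placeholder URL for a locally running store; not a real Sesame2 endpoint.
        config.setProperty(SHARED_STORE_KEY, "http://localhost:8080/personal-store");

        // Both applications read the same setting and end up at the same store.
        System.out.println(locate(config, "piggy-bank embedded store"));
        System.out.println(locate(config, "gnowsis embedded store"));

        // Without the setting, each application keeps using its own store.
        System.out.println(locate(new Properties(), "gnowsis embedded store"));
    }
}
```

The fallback matters for the transition: an installation without the personal store keeps working exactly as before, and switching to the shared store is a matter of one setting that both applications read.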
Problems
From my perspective, these are the problems we face here:
- Sesame2 moves on to Java 1.5. The installation of Piggy-Bank is quite tricky already; under Mac OS X, dealing with Java 1.5 will be interesting, but solvable. If Piggy-Bank wants to use Aperture and Sesame2, it will probably have to move to Java 1.5. gnowsis has these plans too, and enough problems of its own.
- Can the developers of both projects get together with Aduna and design a Personal Semantic Database as the first cornerstone of the Personal Semantic Desktop? The task is to find a common standard for how to deploy RDF applications on desktop computers and how to name those repositories - where to put Piggy-Bank treasures, where to put gnowsis links, where to store all this buffered RDF from Aperture?
- Adapters to local data sources. The Aperture project gives easy access to local data stores; development is under way at the moment. Can this be the project that integrates various local and remote data sources into the storage?
So - what is the vision of a Semantic Desktop that emerges here? You install your personal Sesame2 server. You install Aperture and gnowsis, configure some data sources and let it crawl. You can now use Piggy-Bank and open its faceted browsing interface on these vast data sources. Without storing anything from the web, your local gnowsis already contains around 400,000 triples, extracted from your own files, emails, address books, BibTeX files, etc. These desktop triples are ripe for Piggy-Bank or Autofocus.
You gather new data from the web using Piggy-Bank and its page scrapers written in JavaScript. Here you can do all the nice things Piggy-Bank offers. Store weblogs' RSS feeds. Tag all data related to a friend of yours. See your last hiking trip on Google Maps via Piggy-Bank. You view the stored information from your Piggy-Bank data using gnowsis and have the possibility to edit each literal or value. You create new instances of any class in gnowsis and store them in your personal semantic web store. For example, you want to create a new recipe in the style of Epicurious. You have already piggy-banked some Epicurious recipes from the web, and when you click on "new recipe" in Enquire2006 (the upcoming user interface of gnowsis 0.9) you can create such a recipe for yourself. So, based on the existing recipes (a kind of ontology), you make up your new recipe and store it. Done. At the end, using Piggy-Bank, you could upload your recipe to a public Piggy-Bank, and others can facet-search it there, tag it and be happy.
Patrick Stickler's URIQA approach would also fit into this vision. Any application could use the personal semantic storage to quickly create concise bounded descriptions, or use Aperture to get fresh data from various sources.
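For readers who have not met concise bounded descriptions, the following plain-Java sketch shows the core of the idea over an in-memory list of triples: take every statement about a resource, then follow blank-node objects and take the statements about those too. It is a simplification (the full definition also covers reified statements, which are left out here), and the `CbdSketch` class, the `Triple` record, the `_:` blank-node convention and the `urn:example:` data are assumptions made for this example, not part of URIQA or of any store's API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Simplified concise bounded description over an in-memory list of triples. */
class CbdSketch {

    /** Plain-string triple; blank nodes are marked by a "_:" prefix (a convention of this sketch). */
    record Triple(String subject, String predicate, String object) {}

    static boolean isBlank(String node) {
        return node.startsWith("_:");
    }

    /**
     * Collect every statement whose subject is the start node, then follow
     * blank-node objects and collect the statements about them as well.
     * Reified statements are not handled in this sketch.
     */
    static Set<Triple> describe(String start, List<Triple> graph) {
        Set<Triple> description = new LinkedHashSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(start);
        while (!pending.isEmpty()) {
            String node = pending.poll();
            if (!visited.add(node)) {
                continue; // already expanded; guards against blank-node cycles
            }
            for (Triple t : graph) {
                if (t.subject().equals(node)) {
                    description.add(t);
                    if (isBlank(t.object())) {
                        pending.add(t.object());
                    }
                }
            }
        }
        return description;
    }

    public static void main(String[] args) {
        // Made-up recipe data, in the spirit of the example above.
        List<Triple> graph = List.of(
                new Triple("urn:example:recipe1", "urn:example:title", "\"Pancakes\""),
                new Triple("urn:example:recipe1", "urn:example:ingredient", "_:b1"),
                new Triple("_:b1", "urn:example:name", "\"Flour\""),
                new Triple("urn:example:recipe2", "urn:example:title", "\"Soup\""));
        describe("urn:example:recipe1", graph).forEach(System.out::println);
    }
}
```

In the example, the description of the first recipe includes its title, its ingredient link and the statement about the blank-node ingredient, but nothing about the second recipe. A real implementation would run this as queries against the store rather than scanning a list.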
2006 will be the year in which we come closer to contact - contact with the original use case of the Semantic Web: sharing cooking recipes!
Summary and Outlook
I had a look at two current Semantic Web projects, gnowsis and Piggy-Bank, and played with ideas of how they might mix together. gnowsis and Piggy-Bank have distinct features and distinct use cases. But they both rely on storage like Sesame, and next year both will probably be based on Sesame2 native storage and Lucene for indexing. So the two projects could integrate very well if they both contact a Personal Semantic Desktop Data Server, a Sesame2 thing running in the background (I have to find a name for that).
Piggy-Bank has perfected the integration of data from web sites, and gnowsis has a tradition of contacting local sources like IMAP or Microsoft Outlook. If both applications could decide on a common Sesame2 storage in the middle, with a little framework around it for configuration etc., the Semantic Desktop at large would come nearer. Both might see reduced performance (from adding another communication layer in the middle), but this can be overcome by pushing some of the Velocity work to the server directly. I also sketched a little vision of what using Piggy-Bank and gnowsis could look like at the end of 2006.
leobard - 30. Nov, 21:40
I use piggy bank to collect data, and I use a copy of Semantic Bank on a public IP as my Personal Semantic Storage. My web browser on each machine I use at home and work is linked to a semantic bank account on that server.
When I find data on the web that I want to keep, I "persist" it to the server. If I want to bookmark a page, I press "\", enter some keywords, and tell it to publish to the server.
Functionality I'd like to see in a semantic desktop application: