It has been an exciting 1.5 months so far working with Apache Stanbol in my GSoC project. I have learnt a lot about how semantic content management and enhancement are done in Stanbol. Thanks to my mentors Rupert, Rafa and Andreas, I have successfully passed my mid-term evaluation in GSoC 2013. This post gives an overview of what I have been doing for the last 1.5 months.
Apache Stanbol is a semantic content management system. The Enhancement Engines, Entityhub, ContentHub, Reasoners and CMS adapters are the main components of Stanbol. Stanbol evolved from the 'Interactive Knowledge Stack' (IKS) research project, which was launched to develop a technology platform for content and knowledge management in European IT organizations.
Apache Stanbol's enhancement engines are the components that actually enhance a given piece of content by identifying the entities mentioned in it and providing references to those entities. When identifying entities, several semantic technologies are used, such as natural language processing, part-of-speech identification and entity linking. Different entities can sometimes be referred to by the same name or the same description, which leads to ambiguity in entity linking. This is where entity disambiguation techniques come into play. At the moment Apache Stanbol has several entity disambiguation engines, including a SolrMLT-based engine and a DBpedia Spotlight engine. My project aims to introduce a new disambiguation engine that uses FOAF (friend-of-a-friend) data as the source and FOAF co-reference based techniques for disambiguation.
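To make the ambiguity problem concrete, here is a toy Python sketch of one way social connections could help pick between two people with the same name. It is only an illustration under my own assumptions, not the engine's actual code or Stanbol's API: the `Candidate` class, the `disambiguate` function, the `example.org` URIs and the scoring rule (count how many of a candidate's `foaf:knows` contacts are also mentioned in the text) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A hypothetical entity candidate built from FOAF data (toy structure)."""
    uri: str
    name: str
    # Names of people this candidate is linked to via foaf:knows (toy data).
    knows: set = field(default_factory=set)


def disambiguate(candidates, context_names):
    """Rank candidates by how many of their foaf:knows contacts appear in the text.

    This scoring rule is an assumption made for illustration only.
    """
    context = {name.lower() for name in context_names}
    scored = []
    for candidate in candidates:
        contacts = {name.lower() for name in candidate.knows}
        scored.append((len(contacts & context), candidate))
    # Highest overlap first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(candidate.uri, score) for score, candidate in scored]


# Two different people share the name "John Smith".
candidates = [
    Candidate("http://example.org/people/john-smith-1", "John Smith",
              {"Alice Brown", "Bob Lee"}),
    Candidate("http://example.org/people/john-smith-2", "John Smith",
              {"Carol White"}),
]

# Names found in the same piece of content as the ambiguous mention.
ranking = disambiguate(candidates, ["John Smith", "Carol White"])
```

In this toy case the second candidate ranks first, because "Carol White" appears both in the text and among that candidate's contacts. A real engine would work on entity URIs and richer FOAF properties rather than on plain name strings.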
By the mid-term I have successfully completed 3 milestones.
1. Select a sufficient FOAF dataset and integrate it as a Stanbol Reference Site.
2. Configure the Stanbol Entityhub engine to use the FOAF data and link entities.
3. Design a disambiguation workflow that uses FOAF data elements effectively.
I have written a guide on how to integrate a FOAF dataset in Apache Stanbol and use it with an EntityhubLinking engine. Please refer to my GitHub project, which contains all the project resources, and to its wiki for a step-by-step guide to integrating FOAF data in Stanbol.
