Monday, January 25, 2016

Enhancing documents in Apache ManifoldCF using Apache Stanbol

Introduction


Apache ManifoldCF is an open source document ingestion framework. Using ManifoldCF (MCF) you can:

  • Crawl documents from different content repositories such as Alfresco, Microsoft SharePoint etc. MCF repository connectors are used for crawling documents. 
  • Transform documents by adjusting existing metadata and adding new metadata using technologies like Apache Tika. This is done using MCF transformation connectors.
  • Preserve source-repository security policies and document ACLs when indexing documents, by using MCF authority connectors.
  • Finally, index the documents in search indexes such as Apache Solr, OpenSearchServer or Elasticsearch via output connectors.

Apache ManifoldCF can be effectively used as a document ingestion engine to build federated search applications. There are several open source enterprise search solutions developed using Apache ManifoldCF as the document ingestion engine.

This post is about enhancing documents indexed by ManifoldCF with semantic enhancements. Semantic enhancements enrich documents with additional contextual knowledge about the real-world entities mentioned in them. 

Semantic enhancements may include real-world entities/concepts such as people, organizations and places. By tagging documents with the entities mentioned in them, documents can be connected to external knowledge bases describing those entities; this concept is called linked data. Semantic data added to documents enables semantic search, where users can search documents by concepts (a person, a location, an organization etc.) rather than just keywords. 

To enhance documents in ManifoldCF, we are going to use Apache Stanbol.
Apache Stanbol is a framework for semantic content management. Using Stanbol, a traditional content management process can be enhanced with semantic knowledge by linking documents to external knowledge bases like DBpedia, Freebase or even a custom-built knowledge base. Stanbol integrates components for language detection, natural language processing, named entity recognition and entity linking to external and custom knowledge bases. By combining these components in an enhancement chain, Stanbol can perform semantic tagging of content.

Adding semantic enhancements to documents in ManifoldCF

In this post I will explain how to enhance ManifoldCF documents using Apache Stanbol as the semantic enhancement engine. We have developed a transformation connector for ManifoldCF which connects to Apache Stanbol and enhances documents by adding entity properties as document fields. 

Following is the high-level design of the Stanbol connector chain for ManifoldCF.

Figure 1 : Stanbol connector chain architecture
Prerequisites


To configure the Stanbol connector with ManifoldCF, you first need to build the connector from source and configure it in the ManifoldCF connectors.xml. Follow the steps below to get the connector configured in ManifoldCF.

1. Build ManifoldCF 2.3 from source, as the connector has dependencies on ManifoldCF components.
git clone https://github.com/apache/manifoldcf.git
cd manifoldcf/
git checkout release-2.3-branch 
mvn clean install

2. Build the Apache Stanbol client, which is used as a dependency of the Stanbol connector.
git clone https://github.com/zaizi/apache-stanbol-client.git
cd apache-stanbol-client
git checkout jaxrs-1.0
mvn clean install -DskipTests=true

3. Check out the source code of the Stanbol connector for ManifoldCF from the git project here: 
https://github.com/zaizi/sensefy-connectors/tree/feature/SENSEFY-1453-modify-stanbol-connector/transformation/mcf-stanbol-connector

4. Build the Stanbol connector using Maven: 
mvn clean install

5. Copy the mcf-stanbol-connector-2.3-jar-with-dependencies.jar to MANIFOLDCF_INSTALL_DIR/connectors-lib

6. Configure the Stanbol connector in the connectors.xml
<transformationconnector name="Stanbol enhancer" class="org.zaizi.manifoldcf.agents.transformation.stanbol.StanbolEnhancer"/>

7. You need a running Stanbol server for the Stanbol connector to work. You can build and start a Stanbol server by following the instructions in the Stanbol project documentation.
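For reference, a typical build-and-run sequence looks like the following (a sketch; the full-launcher jar name depends on the Stanbol version you check out):

git clone https://github.com/apache/stanbol.git
cd stanbol
mvn clean install -DskipTests
# the full launcher starts Stanbol on http://localhost:8080/ by default
java -Xmx1g -jar launchers/full/target/org.apache.stanbol.launchers.full-*.jar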

Configuring a ManifoldCF Job with Stanbol connector

You need a repository connector and an output connector configured prior to configuring the Stanbol connector. We have configured a file system repository connector and a Solr output connector for demo purposes.

Following is the ManifoldCF job configuration with 3 connectors.
  1. FileSystemRepo : File-system repository connector to ingest text documents in a folder.
  2. StanbolEnhancer : Stanbol transformation connector enhancing documents by adding semantic metadata to the document as fields
  3. solrOutput : Solr output connector to index the final documents in a Solr server
Figure 2 : ManifoldCF Job Connection

Stanbol Connector Configurations

Section 1 : Stanbol server connection configurations

In the first section of the Stanbol connector configuration, you need to provide the server URL and the enhancement chain name to use for enhancements.

The default values are:
  1. Stanbol server URL : http://localhost:8080/
  2. Stanbol enhancement chain : default
Figure 3 : Stanbol server connection configurations


You can configure the Stanbol connector to use either dereference fields or an LDPath program to define the entity properties you want to extract from the entity RDF data and add to the document as semantic data. 

Section 2 : Dereference fields configurations

The dereference fields configuration of the Stanbol connector lets you define entity properties that you want to extract from the Stanbol entities and add to the ManifoldCF document as fields.

Common entity properties that can be extracted from an entity are;
  • http://www.w3.org/2000/01/rdf-schema#label 
  • http://www.w3.org/2000/01/rdf-schema#comment 
  • http://www.w3.org/1999/02/22-rdf-syntax-ns#type 
These properties will vary based on the entity dataset used for entity linking in Stanbol. The default enhancement chain uses the DBpedia dataset.

Figure 4 : Dereference fields
LDPath program configurations

In this section, the user can define an LDPath program to select which properties to extract from the entities. The user needs to define the LDPath prefixes and the LDPath fields. The connector generates the LDPath program from the given prefix and field definitions, and sends the enhancement request with the LDPath program to Stanbol.

In this example we have defined the following LDPath prefix and field definitions.

Prefix definitions

Prefix : zaizi
Namespace URI : http://zaizi.com/custom


LDPath Field definitions

field name:  zaizi:label
definition : rdfs:label[@en] :: xsd:string

field name:  zaizi:comment
definition : rdfs:comment[@en] :: xsd:string
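
With these definitions, the LDPath program the connector generates and sends to Stanbol would look roughly like below (a sketch in standard LDPath syntax; note the trailing slash on the namespace so that zaizi:label expands to http://zaizi.com/custom/label):

@prefix zaizi : <http://zaizi.com/custom/> ;
zaizi:label = rdfs:label[@en] :: xsd:string ;
zaizi:comment = rdfs:comment[@en] :: xsd:string ;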

Figure 5 : LDPath Program configurations


Section 3 : Final document field mappings configurations

In this section, the user can map entity properties to final document fields. The same mapping can be done using a metadata-adjuster connector; we have added the mapping configuration to the Stanbol connector for the user's convenience.

eg : 
entity property : http://zaizi.com/custom/comment
destination field : comments

entity property : http://zaizi.com/custom/label
destination field : entity_names

Instead of defining field mappings, the user can also choose to keep all entity properties as semantic fields in the document. 

Figure 6 : Final document field mappings 

Semantically enhanced Solr document

After running the ManifoldCF job with the Stanbol connector, you can see the relevant semantic fields added to the final Solr document as below:
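For illustration, an enhanced document could carry fields like the following (field names taken from the mapping example above; the values are hypothetical):

{
  "id": "file:///demo/docs/sample.txt",
  "entity_names": ["Paris", "France"],
  "comments": ["Paris is the capital and most populous city of France."]
}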

Figure 7 : Final Solr document with semantic fields

Monday, August 5, 2013

[GSoC] FOAF Co-reference based Entity Disambiguation for Apache Stanbol



It's been an exciting 1.5 months so far with Apache Stanbol in my GSoC project. I have learnt a lot about how semantic content management and enhancement are done in Stanbol. Thanks to my mentors Rupert, Rafa and Andreas, I have successfully passed my mid-term evaluation in GSoC 2013. This post gives you an overview of what I have been doing for the last 1.5 months.

Apache Stanbol is a semantic content management system. The Enhancement Engine, Entityhub, Contenthub, Reasoners and CMS Adapters are its main components. Stanbol evolved from the 'Interactive Knowledge Stack' (IKS) research project, launched to develop a technology platform for content and knowledge management in European IT organizations.

Apache Stanbol's enhancement engines are the components that actually enhance given content by identifying entities mentioned in it and adding entity references. When identifying entities, semantic technologies such as natural language processing, part-of-speech tagging and entity linking are used. Different entities can sometimes be referred to by the same name or description, leading to ambiguity in entity linking. This is where entity disambiguation techniques come into play. At the moment Apache Stanbol has several entity disambiguation engines, including a SolrMLT based engine and a DBpedia Spotlight engine. My project aims to introduce a new disambiguation engine that uses FOAF (friend-of-a-friend) data as the source and FOAF co-reference as the disambiguation technique.

At mid-term I have successfully delivered 3 milestones:
1. Selected a sufficient FOAF dataset and integrated it as a Stanbol Reference Site.
2. Configured the Stanbol Entityhub engine to use the FOAF data and link entities.
3. Designed a disambiguation workflow using FOAF data elements in an effective manner.

I have written a guide discussing how to integrate a FOAF dataset in Apache Stanbol as an EntityhubLinking engine. Please refer to my GitHub project, which contains all the project resources, and its wiki for a step-by-step guide to integrating FOAF data in Stanbol.



Monday, July 8, 2013

Research Life...

It's been a while.. :) A lot of things have changed during the past few months, and I have started a new page in my life as a researcher. I'm now a full-time research student working on my MSc by Research degree at the Computer Science Department, UOM. I felt I needed a solid research background, hence started this, and I hope for the best :)

My research topic is "Collaborative Reputation Management in Email Networks". The initial email architecture didn't account for the reputation of users. Maybe the inventors never thought more than 60% of email exchanged today would be spam! Over time, much research has been carried out in the field of reputation management in email networks and associated social networks. My aim is to study such projects and develop a sustainable, user-friendly and effective model for reputation management in email. The benefits of such a project would be enormous: spam filtering, social profiles and email categorization are a few good examples. I will be writing more in the coming days about the progress of my research project... Stay tuned..

By the way..
I'm also doing a Google Summer of Code project for Apache Stanbol this year. Apache Stanbol is a semantic enhancement engine for content management. It integrates many open source projects such as Apache Solr, Jena, Clerezza and OpenNLP under one umbrella in a component-based architecture. I find the project amazing, and it has many application areas. My project is about developing an "Entity Disambiguation Engine using FOAF Co-reference". You can find my project proposal here. I will write about my adventures in this project too :)
  

Monday, March 4, 2013

Multiple P2 Profile Support in Carbon

In this blog post I'm going to discuss the proposed multiple P2 profile support in Carbon. To get a better understanding of this effort, we first need to understand the basics of P2.
P2 is the provisioning framework used by the WSO2 Carbon platform to install features. For an overall understanding of P2 concepts, please refer to the Eclipse wiki.

The main areas to discuss here are the P2 profile, how we generate P2 data using the P2 Director, and how we manage the Eclipse runtime by passing the relevant configurations to start an Eclipse-based product.

A P2 profile represents the state of all the installation units (IUs) that constitute an Eclipse-based product. The P2 profile keeps track of all installed, uninstalled and updated IUs in the product.
In the Carbon platform the IUs are features, and a Carbon product is constituted from the Carbon kernel plus a set of features. 

P2 Director is a command line tool for installing additional software into, or uninstalling IUs from, an Eclipse-based product. This is the tool we use underneath the carbon-p2-plugin to build and provision features into our products. 

Until now, Carbon products have been using a single P2 profile: WSO2CarbonProfile. In all our products we have been using WSO2CarbonProfile to install different features and constitute different Carbon products.
Now we will look at how we can support multiple P2 profiles in a single product distribution. This makes it possible to have multiple profiles pre-installed in a single product and to select the profile to start, which loads a separate product at runtime. 

Let me explain how we generate our product distributions using carbon-p2-plugin to have a multi-profile installation.

1. How we generate product distributions  

To see how we can generate the new structure to support multiple profiles, we should first look at how the product distributions are currently created via the carbon-p2-plugin, using the default WSO2CarbonProfile.

1. In Kernel distribution 

We first constitute the carbon product using the carbon product configuration file: carbon.product 

The carbon.product file has the initial configuration of the OSGi bundles to start and other configurations necessary to generate config.ini (the main configuration file for an Eclipse product).

It also defines the minimal Equinox runtime feature to be installed first into the default WSO2CarbonProfile.



So in the first 2 phases of the carbon-p2-plugin (publish-product, materialize-product) we invoke the p2-director to use the above carbon.product and materialize the default WSO2CarbonProfile with a minimal runtime.

Then in the profile-gen phase we install the carbon-core feature into the WSO2CarbonProfile.



2. In product distributions

We extract the carbon-core distribution and install additional features into the default WSO2CarbonProfile in the profile-gen phase.

So in the final target output, the product distribution structure with a single profile looks like this:



When supporting multiple profiles, we first have to materialize a profile before installing features into it.

This can be achieved using the same carbon.product configuration file, as all our products use the same minimal runtime and install features into the profile afterwards. 

This means that for each new P2 profile we have to do the initial materializing using the carbon.product configuration file and then install features on top of it.
For this, we'll have to repeat the materialize-product and p2-profile-gen goals for each profile we generate. 
Please refer to this pom from the POC. In the p2-profile-gen module of the carbon product I'm creating 2 profiles: WSO2CarbonProfile and WSO2SampleProfile.

So the new product distribution structure now looks like below:
Each installed profile has a separate configuration directory, and all the profiles share the same P2 data area to store P2 data.
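
A rough sketch of the resulting layout (profile names taken from the POC; exact contents depend on the installed features):

repository/components/
├── WSO2CarbonProfile/
│   └── configuration/
├── WSO2SampleProfile/
│   └── configuration/
├── p2/          (shared P2 data area)
├── features/    (shared bundle pool)
└── plugins/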




Now let's look at how we can generate the above structure using the carbon-p2-plugin. Underneath the p2-plugin we invoke the p2-director to generate this structure.
We do this by passing the arguments below, to generate a separate profile configuration folder (-destination) and share the same P2 data location (-shared) among all the profiles.

Arguments used to invoke the P2 Director:

-metadataRepository : metadataRepository.toExternalForm() 
-artifactRepository : metadataRepository.toExternalForm()
-installIU : installIUs
-profileProperties :  org.eclipse.update.install.features=true 
-destination : ${destination}/${profile}
-bundlepool : ${destination}
-shared : ${destination}/p2
-profile :  ${profile}
-roaming

To get a clear understanding of what each of the above arguments is used for, please refer to the p2-director documentation.

eg: 
-nosplash -application org.eclipse.equinox.p2.director -metadataRepository file:/dileepa/kernel/trunk/distribution/product/modules/p2-profile-gen/target/p2-repo -artifactRepository file:/dileepa/kernel/trunk/distribution/product/modules/p2-profile-gen/target/p2-repo -profileProperties org.eclipse.update.install.features=true -installIU org.wso2.carbon.core.feature.group/4.1.0.SNAPSHOT,org.wso2.carbon.tryit.feature.group/4.1.0.SNAPSHOT,org.wso2.carbon.soaptracer.feature.group/4.1.0.SNAPSHOT,org.wso2.carbon.styles.feature.group/4.1.0.SNAPSHOT, -bundlepool /dileepa/kernel/trunk/distribution/product/modules/p2-profile-gen/target/wso2carbon-core-4.1.0-SNAPSHOT/repository/components -shared /dileepa/kernel/trunk/distribution/product/modules/p2-profile-gen/target/wso2carbon-core-4.1.0-SNAPSHOT/repository/components/p2 -destination /dileepa/kernel/trunk/distribution/product/modules/p2-profile-gen/target/wso2carbon-core-4.1.0-SNAPSHOT/repository/components/WSO2SampleProfile -profile WSO2SampleProfile -roaming

In the POC done for this, I have modified the MaterializeProductMojo and ProfileGenMojo classes in the carbon-p2-plugin to pass the above arguments to the p2-director, supporting multi-profile installations with a shared P2 area.

What I basically did was change the values of the parameters below in the addArguments() method (sketched after the list):
"-destination", destination + File.separator + profile
"-shared" , destination + File.separator + "p2"

With -destination set like above, a separate directory is created to hold the Eclipse configuration directory for the profile (eg: ${carbon.home}/repository/components/WSO2CarbonProfile/configuration, ${carbon.home}/repository/components/WSO2SampleProfile/configuration).

In this separate configuration folder we have the config.ini and the org.eclipse.equinox.simpleconfigurator/bundles.info file, which lists all the bundles to start for the particular runtime represented by the P2 profile. 

Note: 
Since we are changing the product distribution structure by changing the install root to point to /repository/components/${Profile}, the relative paths for copying files during feature installation must be changed accordingly in the platform features. We copy various configuration files during feature installation, and this is done using P2 touchpoint instructions (the org.eclipse.equinox.p2.touchpoint.natives.copy touchpoint is used here).
In Carbon features we have a p2.inf file to define these instructions, and the new relative paths from the install directory should be changed as below:

eg:
org.eclipse.equinox.p2.touchpoint.natives.copy(source:${installFolder}/../features/org.wso2.carbon.logging.mgt.server_${feature.version}/conf/logging-config.xml,target:${installFolder}/../../conf/etc/logging-config.xml,overwrite:true);  

After changing the features that copy files at installation time and creating the product distribution as above, the next thing to look at is how we launch the Equinox runtime pointing to different P2 profiles at Carbon startup.

2. How to manage the Multi-Profile Carbon runtime

Equinox finds the necessary files and directory locations to start and load the Equinox runtime using a few parameters.

Some of the important parameters are: 

1. osgi.install.area (the root of an Eclipse product installation)
2. osgi.configuration.area (the configuration folder of an Eclipse product)
3. eclipse.p2.data.area (the P2 data area of the profile)

So when we select a particular profile at startup by passing a parameter (eg: -Dprofile=WSO2SampleProfile), we need to set the above parameters based on the selected profile. Basically we need to point Equinox to the relevant profile root and configuration directories at startup. If a profile is not given at startup, Carbon should start with the default WSO2CarbonProfile. 
Please note that the default WSO2CarbonProfile is set as a system property in the carbon.server.Main class if no profile is given at startup. 
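
For example, starting the sample profile could then look like below (assuming the stock wso2server.sh startup script, which forwards system properties to the JVM):

sh bin/wso2server.sh -Dprofile=WSO2SampleProfile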

So the parameter values that change are:
osgi.install.area : ${carbon.home}/repository/components/${profile}
osgi.configuration.area : ${carbon.home}/repository/components/${profile}/configuration
Please see how these parameters are passed in the POC's modified carbon.server CarbonLauncher class.
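
A minimal sketch of that logic (assuming the property and directory names used above; the actual CarbonLauncher code may differ):

public class ProfileLocations {
    // Points Equinox at the selected profile's root and configuration
    // directories; falls back to the default WSO2CarbonProfile.
    static void applyProfileLocations() {
        String profile = System.getProperty("profile", "WSO2CarbonProfile");
        String componentsDir = System.getProperty("carbon.home")
                + "/repository/components/" + profile;
        System.setProperty("osgi.install.area", componentsDir);
        System.setProperty("osgi.configuration.area", componentsDir + "/configuration");
    }
}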

So with above changes we can successfully start different Carbon profiles from a pre-installed multi-profile distribution.   

Things to improve and fix 

1. Feature installation with copying resources at runtime

Feature installation at runtime has issues with features that have files to copy during installation. The problem is again with the relative paths used to find the /features directory.
If we generate profiles with -roaming enabled (which allows an Eclipse product to be moved), at runtime P2 updates the bundle-pool location to point to the same directory as the p2.installFolder. Because of this, when we install features at runtime, the /features and /plugins directories are created inside ${carbon.home}/repository/components/${profile}.

In P2 the location of the installFolder is defined by org.eclipse.equinox.p2.installFolder, and the location of the bundle pool is calculated using the org.eclipse.equinox.p2.cache property. At runtime the P2 engine updates the p2.cache property to point to the installFolder, and the /features and /plugins directories are extracted there. 

eg: 
At build time, when we invoke the p2-director, the bundle-pool locations are created at:
${carbon.home}/repository/components/features
${carbon.home}/repository/components/plugins

But at runtime, when we install features, they are extracted under the installFolder as below:
${carbon.home}/repository/components/WSO2SampleProfile/features
${carbon.home}/repository/components/WSO2SampleProfile/plugins

So the same file paths we changed in p2.inf above become invalid, and feature installation fails during copy-instruction execution.
To fix this issue, we need to either find a way to override the p2.cache location at runtime to point to the bundle pool at ${carbon.home}/repository/components/features and /plugins,
or use some other way to derive the file paths in p2.inf for copying files during feature installation.

2. Improve the carbon-p2-plugin to have a new goal that merges the product-materializing and profile-gen goals, so that new profiles can be generated in a single execution without iterating the process of materializing the product and installing features afterwards.


Sunday, February 3, 2013

Running Carbon 4.0.2 in Geronimo as a web-application

In this post I'll explain how to deploy Carbon 4.0.2 as a web application in Apache Geronimo V3. I'm using the Geronimo Java EE distribution with Tomcat 7 integration for the Linux platform, downloaded from here.

Please note: 
The WSO2 Carbon platform is a fully fledged cloud-native middleware platform with its own set of complexities, which are greater than those of a typical web application or other deployable artifact.
Running WSO2 products in web-app mode, or any other deployable form, in third-party app servers can greatly limit the capabilities of the platform and can introduce new complexities when combined with the host app server's environment.
Therefore running WSO2 products in other app servers is not encouraged and is subject to specific support conditions only.


The Steps 

The basic steps are similar to what Pradeep explains in his blog on running Carbon 4.0 as a web-app in Apache Tomcat, so I suggest reading his blog to understand the basics of running Carbon-based products in web-app mode on any other web server.

Here I will explain the steps to deploy the Carbon web-app in Geronimo, and the tweaks and changes we need to make in Geronimo to run it.

1. Download Carbon 4.0.2 from the previous releases page here


2. Extract carbon-4.0.2 in the file-system. 

The extracted directory will be considered $CARBON_HOME from here onwards.

3. The web-app directory for Carbon is the $CARBON_HOME/webapp-mode directory. Inside it you will find the WEB-INF folder, which contains the web-application descriptor and the libraries required to run the Carbon web application. The structure of the WEB-INF directory is as below:



WEB-INF/
├── eclipse
│   └── launch.ini
├── lib
│   └── org.wso2.carbon.bridge-4.0.0.jar
└── web.xml

The web.xml is the web-application descriptor that includes details about the Carbon web-app. Please note that the web-app ID is taken as the web-context root if you don't include a geronimo-web.xml descriptor specifying the web-context information under your web application. So in this case the Carbon web-app ID is: WebApp.
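
If you want to set the context root explicitly instead, a minimal geronimo-web.xml would look roughly like this (a sketch; the namespace version may vary between Geronimo releases):

<web-app xmlns="http://geronimo.apache.org/xml/ns/j2ee/web-2.0.1">
    <!-- serve the Carbon web-app at /WebApp -->
    <context-root>/WebApp</context-root>
</web-app>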

4. Create a .war archive with the above WEB-INF content so that we can deploy it in Geronimo. I'm creating the archive as carbon.war.

eg:  jar -cvf carbon.war WEB-INF/


5. Now we need to configure the carbon repository to be used by our Carbon web application. For this, create a directory named carbon_repo and copy the repository sub-directory under $CARBON_HOME into the carbon_repo directory.

We'll call this directory $CARBON_REPO from here onwards.
For me this location is at: /home/dileepa/work/carbon_repo


carbon_repo/
└── repository
    ├── components
    ├── conf
    ├── data
    ├── database
    ├── deployment
    ├── logs
    ├── README.txt
    ├── resources
    └── tenants

Now:
  • Remove all the Tomcat-related jars from the '$CARBON_REPO/repository/components/plugins' directory. As Geronimo provides a Tomcat servlet container by default, we don't need these bundles at runtime.
           org.wso2.carbon.tomcat_4.0.1.jar 
           org.wso2.carbon.tomcat.ext_4.0.2.jar
           org.wso2.carbon.tomcat.fragment.dummy_4.0.0.jar
           org.wso2.carbon.tomcat.patch_4.0.1.jar
           tomcat-ha_7.0.28.wso2v1.jar
           tomcat_7.0.28.wso2v1.jar

  • Copy the jars found under $CARBON_HOME/webapp-mode/bundles, namely org.wso2.carbon.http.bridge-4.0.0.jar and org.wso2.carbon.servletbridge-4.0.0.jar, to the '$CARBON_REPO/repository/components/plugins' directory.
  • Now we need to configure bundles.info to add these 2 new bundles to the runtime. Add the below 2 lines to $CARBON_REPO/repository/components/configuration/org.eclipse.equinox.simpleconfigurator/bundles.info:
org.wso2.carbon.http.bridge,4.0.0,plugins/org.wso2.carbon.http.bridge-4.0.0.jar,4,true
org.wso2.carbon.servletbridge,4.0.0,plugins/org.wso2.carbon.servletbridge-4.0.0.jar,4,true

6. We now need to set up the HTTPS connector configuration for Tomcat in Geronimo, to handle connections to the Carbon web-app.

To do this, add an HTTPS connector configuration to $GERONIMO_HOME/var/catalina/server.xml.

I'm configuring a new HTTPS connector on port 8445 in Geronimo's Tomcat server, using the default carbon keystore, as below.
eg:
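A sketch of such a connector definition (assuming Carbon's default wso2carbon.jks keystore, found under the $CARBON_REPO used in this post):

<Connector port="8445" protocol="HTTP/1.1" SSLEnabled="true"
           scheme="https" secure="true" sslProtocol="TLS"
           clientAuth="false"
           keystoreFile="/home/dileepa/work/carbon_repo/repository/resources/security/wso2carbon.jks"
           keystorePass="wso2carbon"/>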

Then configure the Axis2 transport ports in $CARBON_REPO/repository/conf/axis2.xml, keeping HTTP on 8080 and moving HTTPS to 8445:

<transportReceiver name="http" class="org.wso2.carbon.core.transports.http.HttpTransportListener">
    <parameter name="port">8080</parameter>
</transportReceiver>

<transportReceiver name="https" class="org.wso2.carbon.core.transports.http.HttpsTransportListener">
    <parameter name="port">8445</parameter>
</transportReceiver>
Now configure the ServerURL and the web-context root (the web-context root for our scenario is /WebApp) in carbon.xml:

<WebContextRoot>/WebApp</WebContextRoot>
<ServerURL>https://localhost:8445/WebApp/services/</ServerURL>



Also change the JMX ports in carbon.xml to other available port numbers, to avoid address conflicts with the host Geronimo's RMI ports:

<RMIRegistryPort>9998</RMIRegistryPort>
<RMIServerPort>11112</RMIServerPort>



7. Endorsing JVM provided classes
Copy the endorsed lib jars found under $CARBON_HOME/lib/endorsed to $GERONIMO_HOME/lib/endorsed to provide the endorsed libraries required for XML parsing/API etc. 

8. Configuring the Carbon datasource in Geronimo.
We need to configure a Carbon datasource in Geronimo for the Registry and User Manager in Carbon. For this purpose, please follow the blog post on how to create an embedded Derby Carbon datasource in Geronimo. Similarly, you can create the datasource using any database type and configure master-datasources.xml to use it.

Now use the JNDI name and the URL of the created datasource with the relevant database driver in $CARBON_REPO/repository/conf/datasources/master-datasources.xml

eg: 

<datasource>
    <name>WSO2_CARBON_DB</name>
    <description>The datasource used for registry and user manager</description>
    <jndiConfig>
        <name>jca:/console.dbpool/WSO2CarbonDB/JCAConnectionManager/WSO2CarbonDB</name>
    </jndiConfig>
    <definition type="RDBMS">
        <configuration>
            <url>jdbc:derby:/home/dileepa/work/geronimo-tomcat7-javaee6-web-3.0.0/var/derby/WSO2_CARBON_DB;DB_CLOSE_ON_EXIT=FALSE;LOCK_TIMEOUT=60000</url>
            <username>wso2carbon</username>
            <password>wso2carbon</password>
            <driverClassName>org.apache.derby.jdbc.EmbeddedDriver</driverClassName>
            <maxActive>50</maxActive>
            <maxWait>60000</maxWait>
            <testOnBorrow>true</testOnBorrow>
            <validationQuery>SELECT 1</validationQuery>
            <validationInterval>30000</validationInterval>
        </configuration>
    </definition>
</datasource>

To support the embedded Derby database connection, you need to add the Derby embedded driver bundle to $CARBON_REPO/repository/components/plugins and add the below entry to bundles.info:

derby,1.0.0,plugins/derby_1.0.0.jar,4,true
You can download the derby_1.0.0 bundle from the hosted location here.

To enable Carbon logging, you need to make the log4j properties file available to the Carbon logging bundle in $CARBON_REPO/repository/components/plugins. In standalone Carbon 4 based products this is achieved by bundling the log4j properties file as a fragment bundle and deploying it on the fly from dropins. But as we are deploying the Carbon runtime in bridge mode we cannot use dropins, so we have to manually copy it to the $CARBON_REPO/repository/components/plugins directory and add the relevant entry to bundles.info as below:

org.wso2.carbon.logging.propfile,1.0.0,plugins/org.wso2.carbon.logging.propfile_1.0.0.jar,4,true 
You can download the org.wso2.carbon.logging.propfile_1.0.0.jar bundle from the hosted location here.

9. Now configure $CARBON_REPO/repository/conf/user-mgt.xml and registry.xml using the JNDI name of the datasource created above.
  • In user-mgt.xml change the dataSource property as below:
<Property name="dataSource">jca:/console.dbpool/WSO2CarbonDB/JCAConnectionManager/WSO2CarbonDB</Property>
  • In registry.xml change the dataSource as below:
<dataSource>jca:/console.dbpool/WSO2CarbonDB/JCAConnectionManager/WSO2CarbonDB</dataSource>

10. Now the $CARBON_REPO is configured. 
Next you need to set the carbon.home system property required to run Carbon.
Set the carbon.home system property in $GERONIMO_HOME/etc/system.properties as below:

carbon.home=/home/dileepa/work/carbon_repo

11. Now the environment is set for deploying the Carbon web-app in Geronimo, so start the Geronimo server by executing the below command:

$GERONIMO_HOME/bin$ ./geronimo run

To deploy the carbon.war we created in step 4, go to the Deployer section in the Geronimo console and install the carbon.war archive.


Now when deploying carbon.war, if you see the below error, it's due to an aries-jndi bug in version 0.3.0, discussed in ARIES-554:
ERROR [DataSourceRepository] Error in registering data source: WSO2_CARBON_DB - Error in creating JNDI subcontext 'javax.naming.InitialContext@2702c693/jca:: Unable to determine caller's BundleContext
org.wso2.carbon.ndatasource.common.DataSourceException: Error in creating JNDI subcontext 'javax.naming.InitialContext@2702c693/jca:: Unable to determine caller's BundleContext
 at org.wso2.carbon.ndatasource.core.DataSourceRepository.checkAndCreateJNDISubContexts(DataSourceRepository.java:232)
 at org.wso2.carbon.ndatasource.core.DataSourceRepository.registerJNDI(DataSourceRepository.java:257)
 at org.wso2.carbon.ndatasource.core.DataSourceRepository.registerDataSource(DataSourceRepository.java:360)
 at org.wso2.carbon.ndatasource.core.DataSourceRepository.addDataSource(DataSourceRepository.java:474)
 at org.wso2.carbon.ndatasource.core.DataSourceManager.initSystemDataSource(DataSourceManager.java:182)
 at org.wso2.carbon.ndatasource.core.DataSourceManager.initSystemDataSources(DataSourceManager.java:154)
 at org.wso2.carbon.ndatasource.core.internal.DataSourceServiceComponent.initSystemDataSources(DataSourceServiceComponent.java:166)
 at org.wso2.carbon.ndatasource.core.internal.DataSourceServiceComponent.setSecretCallbackHandlerService(DataSourceServiceComponent.java:152)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Geronimo V3 comes with aries-jndi-0.3.0 installed.


To fix the JNDI issue you need to update the aries-jndi bundle and its dependencies.
To do that, please follow the below steps:


  • Go to the OSGi bundles section in the console and search for the aries.jndi bundle.



  • Now you need to uninstall this bundle and install the latest available org.apache.aries.jndi.1.0.0 and its required dependencies: org.apache.aries.util.1.0.0 and org.apache.aries.proxy.1.0.0.
  • You can download these aries bundles from the official Aries download site. After installing the 3 new aries bundles and starting them, the JNDI error you got before is rectified.
12. Now you can restart the Geronimo server and access the Carbon management console at: https://localhost:8445/WebApp/carbon/




You can telnet to the OSGi console of the Carbon server as below:
telnet localhost 19444