What is WoDQA (Web of Data Query Analyzer)
WoDQA is a federated query tool for executing queries over the Web of Data. It relies on VoID descriptions to learn about datasets and the links between them. Using these dataset descriptions, WoDQA discovers every dataset that may be relevant to a query and transforms the initial query into a federated query that is executed over the distributed datasets. Queries can therefore be written without knowing in advance which datasets will answer them. You can try the tool at the following link: WoDQA SPARQL Form
Releases of WoDQA
Released! (05/07/2012) A new version of the WoDQA Query Engine has been released. In this version, analysis performance has been improved, ASK queries have been incorporated into the analysis process, and an initial version of the query optimizer, which applies heuristic optimization, has been implemented.
Released! (10/04/2012) A new version of the WoDQA Query Engine has been released. In this version, simple queries can be transformed into federated queries by using the available VoID documents, and they are executed over the datasets that those documents define.
Released! (10/04/2013) A new version of the WoDQA Query Engine has been released. In this version, the rules have been revised, analysis performance has been improved, and query reorganization has been optimized to minimize execution cost. In addition, ARQ has been extended, and a bound join is used in the execution phase instead of ARQ's nested loop join.
Documentation of WoDQA
1) WoDQA Internal Architecture
WoDQA consists of 3 main modules:
Dataset Analyzer: It analyzes every triple pattern in a query against the dataset and linkset descriptions in all available VoID documents, and eliminates the datasets that are irrelevant to each triple pattern.
Query Reorganizer: This module rewrites the initial query in federated form, using the analyzer's results and optimizing the query. As a result, the dataset(s) to be queried are stated explicitly for every triple pattern.
WoDQA Engine: This part of the tool is built on the Jena ARQ query engine. It sends subqueries to the endpoints of the datasets and merges the query results received from them.
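To make the division of labor between the first two modules more concrete, here is a minimal sketch in plain Java. It is not WoDQA's code: the classes `Dataset` and `TriplePattern`, the methods `relevantDatasets` and `federate`, the example.org endpoints, and the decision to match datasets by vocabulary namespace only are all assumptions made for illustration. Combining several relevant endpoints with UNION is one possible rewriting, not necessarily the one WoDQA produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Conceptual sketch only: these classes are hypothetical and are not WoDQA's API.
final class FederationSketch {

    /** A dataset as the analyzer sees it: an endpoint plus its vocabulary namespaces. */
    static final class Dataset {
        final String endpoint;
        final List<String> vocabularies;

        Dataset(String endpoint, String... vocabularies) {
            this.endpoint = endpoint;
            this.vocabularies = Arrays.asList(vocabularies);
        }
    }

    /** A triple pattern kept as SPARQL text, together with its predicate IRI. */
    static final class TriplePattern {
        final String text;
        final String predicateIri;

        TriplePattern(String text, String predicateIri) {
            this.text = text;
            this.predicateIri = predicateIri;
        }
    }

    /** Analyzer step: drop every dataset whose vocabularies do not cover the predicate. */
    static List<Dataset> relevantDatasets(TriplePattern pattern, List<Dataset> datasets) {
        List<Dataset> relevant = new ArrayList<>();
        for (Dataset dataset : datasets) {
            for (String namespace : dataset.vocabularies) {
                if (pattern.predicateIri.startsWith(namespace)) {
                    relevant.add(dataset);
                    break;
                }
            }
        }
        return relevant;
    }

    /** Reorganizer step: wrap each pattern in one SERVICE block per relevant endpoint. */
    static String federate(List<TriplePattern> patterns, List<Dataset> datasets) {
        StringBuilder query = new StringBuilder("SELECT * WHERE {\n");
        for (TriplePattern pattern : patterns) {
            List<Dataset> relevant = relevantDatasets(pattern, datasets);
            if (relevant.isEmpty()) {
                throw new IllegalArgumentException("No dataset for: " + pattern.text);
            }
            for (int i = 0; i < relevant.size(); i++) {
                if (i > 0) {
                    query.append("  UNION\n");
                }
                query.append("  { SERVICE <").append(relevant.get(i).endpoint)
                     .append("> { ").append(pattern.text).append(" } }\n");
            }
        }
        return query.append("}").toString();
    }

    public static void main(String[] args) {
        List<Dataset> datasets = Arrays.asList(
            new Dataset("http://example.org/a/sparql", "http://example.org/vocabA/"),
            new Dataset("http://example.org/b/sparql", "http://example.org/vocabB/"));
        List<TriplePattern> patterns = Arrays.asList(
            new TriplePattern("?city <http://example.org/vocabA/name> ?name .",
                              "http://example.org/vocabA/name"),
            new TriplePattern("?airport <http://example.org/vocabB/near> ?city .",
                              "http://example.org/vocabB/near"));
        System.out.println(federate(patterns, datasets));
    }
}
```

In this sketch the third module's job would start where `federate` ends: the resulting query text is what a SPARQL engine with SERVICE support would execute against the listed endpoints.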
2.1) Adding the Maven dependency
WoDQA can be used by adding the code below to your pom.xml file:
<dependencies>
  <dependency>
    <groupId>Seagent</groupId>
    <artifactId>wodqa</artifactId>
    <version>0.0.3-20130410</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>seagent</id>
    <name>Seagent Repository</name>
    <url>http://seagent.ege.edu.tr/etmen/snapshots</url>
  </repository>
</repositories>
2.2) Constructing VoID Documents
WoDQA requires VoID documents for its analysis. VoID stands for Vocabulary of Interlinked Datasets; WoDQA learns about datasets and the relationships between them from VoID documents. Each dataset is described by a VoID document, in which the dataset basically has the properties below:
- void:vocabulary: It describes the predicates and classes used in a dataset.
- void:uriSpace: It describes the URI space(s) that the resource URIs of a dataset start with.
- void:sparqlEndpoint: It describes the web endpoint address at which a dataset can be queried.
- void:linkset: It describes the links (relationships) between a dataset and other datasets.
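The properties listed above can be pictured as a small data model. The following plain-Java sketch is an illustration under assumptions, not part of WoDQA: the classes `DatasetDescription` and `Linkset`, the helper methods `ownerOf` and `objectsDatasetOf`, and the example.org addresses are hypothetical. The linkset fields follow the VoID terms `void:linkPredicate`, `void:subjectsTarget`, and `void:objectsTarget`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory view of VoID descriptions; not WoDQA's data model.
final class VoidDescriptionSketch {

    static final class DatasetDescription {
        final String vocabulary;      // void:vocabulary
        final String uriSpace;        // void:uriSpace
        final String sparqlEndpoint;  // void:sparqlEndpoint

        DatasetDescription(String vocabulary, String uriSpace, String sparqlEndpoint) {
            this.vocabulary = vocabulary;
            this.uriSpace = uriSpace;
            this.sparqlEndpoint = sparqlEndpoint;
        }
    }

    static final class Linkset {
        final String linkPredicate;               // void:linkPredicate
        final DatasetDescription subjectsTarget;  // void:subjectsTarget
        final DatasetDescription objectsTarget;   // void:objectsTarget

        Linkset(String linkPredicate, DatasetDescription subjectsTarget,
                DatasetDescription objectsTarget) {
            this.linkPredicate = linkPredicate;
            this.subjectsTarget = subjectsTarget;
            this.objectsTarget = objectsTarget;
        }
    }

    /** Finds the first dataset whose URI space is a prefix of the given resource IRI. */
    static Optional<DatasetDescription> ownerOf(String resourceIri,
                                                List<DatasetDescription> datasets) {
        return datasets.stream()
                       .filter(d -> resourceIri.startsWith(d.uriSpace))
                       .findFirst();
    }

    /** Finds the dataset that holds the objects of a link predicate used in 'source'. */
    static Optional<DatasetDescription> objectsDatasetOf(String predicateIri,
                                                         DatasetDescription source,
                                                         List<Linkset> linksets) {
        return linksets.stream()
                       .filter(l -> l.subjectsTarget == source
                                 && l.linkPredicate.equals(predicateIri))
                       .map(l -> l.objectsTarget)
                       .findFirst();
    }

    public static void main(String[] args) {
        DatasetDescription movies = new DatasetDescription(
            "http://example.org/vocab/movie/", "http://example.org/movies/",
            "http://example.org/movies/sparql");
        DatasetDescription people = new DatasetDescription(
            "http://example.org/vocab/person/", "http://example.org/people/",
            "http://example.org/people/sparql");
        Linkset links = new Linkset("http://www.w3.org/2002/07/owl#sameAs", movies, people);

        // A resource IRI is attributed to the dataset whose URI space it starts with.
        System.out.println(ownerOf("http://example.org/people/p42",
                                   Arrays.asList(movies, people)).get().sparqlEndpoint);
        // The linkset says where the objects of owl:sameAs links from 'movies' live.
        System.out.println(objectsDatasetOf("http://www.w3.org/2002/07/owl#sameAs",
                                            movies, Arrays.asList(links)).get().sparqlEndpoint);
    }
}
```

Checks of this kind are what allow an analyzer to rule datasets in or out for a triple pattern before any endpoint is contacted.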
A sample VoID description for the LinkedMDB dataset in Turtle format could look as follows:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix void: <http://rdfs.org/ns/void#> .
<http://example.org/dataset/linkedMdb> void:vocabulary "http://data.linkedmdb.org/resource/movie/" .
<http://example.org/dataset/linkedMdb> void:subjectsTarget <http://example.org/dataset/linkedMdb> .
Queries can be executed through the WoDQA engine in a few steps. Pass the raw query (QUERY), a model that contains all the VoID information (voidModel), and the ASK optimization flag (askOptimization) to this module, and it returns a query execution. You can then continue the execution according to the type of the initial query.
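// prepare the query execution using the WoDQA engine module
QueryExecution execution = new WodqaEngine().prepareToExecute(voidModel, QUERY, askOptimization);
// if the query is a SELECT query
ResultSet resultSet = execution.execSelect();
// if the query is a CONSTRUCT query
Model model = execution.execConstruct();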
If you want to use a nested loop join instead of a bound join, use the WodqaEngine constructor that takes a boolean nestedLoop flag.
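As background for choosing between the two join strategies, the following plain-Java sketch contrasts how many remote subqueries each one would send. It is a conceptual illustration, not WoDQA's or ARQ's implementation: the class `JoinSketch`, its methods, the block size, and the use of a SPARQL 1.1 VALUES clause to ship a block of bindings are all assumptions; a real engine may encode the bindings differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Conceptual comparison of join strategies; hypothetical, not WoDQA's or ARQ's code.
final class JoinSketch {

    /** Nested loop join: one remote subquery per intermediate binding. */
    static List<String> nestedLoopQueries(String pattern, String variable, List<String> iris) {
        List<String> queries = new ArrayList<>();
        for (String iri : iris) {
            // Naive textual substitution of the variable; enough for this sketch.
            String bound = pattern.replace("?" + variable, "<" + iri + ">");
            queries.add("SELECT * WHERE { " + bound + " }");
        }
        return queries;
    }

    /** Bound join: one remote subquery per block of bindings, shipped in a VALUES clause. */
    static List<String> boundJoinQueries(String pattern, String variable,
                                         List<String> iris, int blockSize) {
        List<String> queries = new ArrayList<>();
        for (int start = 0; start < iris.size(); start += blockSize) {
            int end = Math.min(start + blockSize, iris.size());
            StringBuilder values = new StringBuilder();
            for (String iri : iris.subList(start, end)) {
                values.append("<").append(iri).append("> ");
            }
            queries.add("SELECT * WHERE { VALUES ?" + variable + " { " + values + "} "
                        + pattern + " }");
        }
        return queries;
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList(
            "http://example.org/city/1", "http://example.org/city/2",
            "http://example.org/city/3", "http://example.org/city/4",
            "http://example.org/city/5");
        String pattern = "?airport <http://example.org/vocab/near> ?city .";

        System.out.println(nestedLoopQueries(pattern, "city", cities).size()
                           + " subqueries with nested loop join");
        System.out.println(boundJoinQueries(pattern, "city", cities, 2).size()
                           + " subqueries with bound join");
    }
}
```

With five intermediate bindings and a block size of two, the nested loop variant builds five subqueries and the bound variant builds three, which is the kind of reduction in remote requests that motivates a bound join.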
The WoDQA engine can also return only the federated query, without creating a query execution.
String federatedQuery = new WodqaEngine().federateQuery(voidModel, QUERY, askOptimization);
3) Query Example Execution Project
You can download the example project and import it into your Eclipse workspace to see the query execution steps of WoDQA. The query asks for the nearest airports to Edinburgh. WoDQA distributes the query to DBpedia and LinkedGeoData, whose VoID documents are prepared and passed to WoDQA in the example.
Developers of WoDQA
Tayfun Gökmen Halaç, Ziya Akar