Semantic Web – A Technical Introduction

Introduction

The previous document introduced the Semantic Web conceptually, describing what it aims to achieve and the key concepts it is built on. This document discusses Semantic Web technologies from a technical perspective, with code snippets written in Java using the Jena framework.

Jena

Jena is a Java framework for building Semantic Web applications. It includes:
1) an API for reading, processing and writing RDF data in RDF/XML, N-Triples and Turtle formats;
2) an API for creating and processing OWL and RDFS ontologies;
3) an API for reasoning and inference over RDF and OWL data sets;
4) an API for running SPARQL queries on RDF data sets.

RDF

RDF is a framework for modelling information as a directed, labelled graph. Every node, together with all its links to other nodes and literals in the graph, is known as a Resource. A link in the graph is known as a Predicate (an attribute of the node from which it starts); it joins either two resources or a resource and a literal. A literal can be a string, integer, float, etc.; in general, anything that does not have a URI.

 
The above RDF model has two nodes (drawn as ellipses), two links and one literal (drawn as a rectangle). Every node and link has a URI.
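
To make the node/link/literal structure concrete, here is a minimal plain-Java sketch, independent of Jena, that stores such a graph as a list of statements. The Triple class, the sampleGraph method and the http://example.org/ URIs are hypothetical names chosen only for illustration; they are not taken from the model described above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: Jena's own Model, Resource and Statement classes are not used here.
class TripleSketch {

    /** One RDF statement: subject and predicate are URIs; the object is a URI or a literal. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;
        final boolean objectIsLiteral;

        Triple(String subject, String predicate, String object, boolean objectIsLiteral) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.objectIsLiteral = objectIsLiteral;
        }

        @Override
        public String toString() {
            // N-Triples-like rendering: URIs in angle brackets, literals in quotes.
            String o = objectIsLiteral ? "\"" + object + "\"" : "<" + object + ">";
            return "<" + subject + "> <" + predicate + "> " + o + " .";
        }
    }

    /** Builds a graph with two nodes, two links and one literal (all URIs are hypothetical). */
    static List<Triple> sampleGraph() {
        String ex = "http://example.org/";
        List<Triple> graph = new ArrayList<>();
        // First link: joins two resources.
        graph.add(new Triple(ex + "ram", ex + "friend", ex + "shyam", false));
        // Second link: joins a resource and a literal.
        graph.add(new Triple(ex + "ram", ex + "firstName", "Ram", true));
        return graph;
    }

    public static void main(String[] args) {
        for (Triple t : sampleGraph()) {
            System.out.println(t);
        }
    }
}
```

Each element of the list is one subject, predicate, object statement, which is also the unit that Jena reads and writes.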

Blank Node

When the identity of a node (its URI) is not known, the node is represented as a blank node in the RDF model. For example, in the information ‘Ram has a friend whose car is a Ferrari’, the unnamed friend is modelled as a blank node.

RDFS/OWL

An ontology is a specification of a conceptualization. Just as UML is a means of visualizing class definitions, their relations and their hierarchy for humans, RDFS/OWL is a means of expressing the same in a form that machines can understand and process. In simpler words, an ontology defines the vocabulary with which RDF statements are created. For example, in Java a class would look like:
class Person {
    private String firstName;
    private String secondName;
    private String email_id;
    private int age;
}
Everything in RDFS/OWL is defined as RDF statements. The RDF resource for this class is defined as follows [1]:

    
[RDF/XML listing for the class labelled "Person"; the markup is not preserved here]
The type of this class is given by ‘rdf:type’ and its URI by ‘rdf:about’. ‘owl:sameAs’ states that this class is the same as the Person class defined in foaf [2], so a Person described using either ontology, foaf or talentica, is treated as a Person.
Each attribute of the class is also defined as an RDF resource; for example, the property firstName is defined as:
The type of this resource is given by ‘rdf:resource’, which here is DatatypeProperty, meaning that the value of this property is a literal. ObjectProperty is used if the value of the attribute is another RDF node. ‘rdfs:domain’ states that this property belongs to the class ‘Person’, and ‘rdfs:range’ states the type of the attribute value, in this case string.
A class ‘Employee’ that derives from the class ‘Person’ would be defined as:
    
[RDF/XML listing for the class labelled "Employee"; the markup is not preserved here]
Note the XML tag ‘rdfs:subClassOf’, which defines this relationship.
Java code using Jena APIs to define Person and Employee respectively:
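Person:
   // Package names assume Apache Jena 3 or later. `model` is an OntModel; URI, FULLNAME,
   // FIRSTNAME, SECONDNAME, FOAF and label are values defined elsewhere in the project.
   import org.apache.jena.ontology.DatatypeProperty;
   import org.apache.jena.ontology.OntClass;
   import org.apache.jena.vocabulary.VCARD;
   import org.apache.jena.vocabulary.XSD;

   // Declare the Person class in the ontology model.
   OntClass ontPerson = model.createClass(URI);

   // fullName: a string-valued property of Person, declared the same as vCard FN.
   DatatypeProperty fullName = model.createDatatypeProperty(FULLNAME);
   fullName.addDomain(ontPerson);
   fullName.addRange(XSD.xstring);
   fullName.addSameAs(VCARD.FN);

   // firstName: a string-valued property of Person, declared the same as vCard Given.
   DatatypeProperty firstName = model.createDatatypeProperty(FIRSTNAME);
   firstName.addDomain(ontPerson);
   firstName.addRange(XSD.xstring);
   firstName.addSameAs(VCARD.Given);

   // secondName: a string-valued property of Person, declared the same as vCard Family.
   DatatypeProperty secondName = model.createDatatypeProperty(SECONDNAME);
   secondName.addDomain(ontPerson);
   secondName.addRange(XSD.xstring);
   secondName.addSameAs(VCARD.Family);

   // Label the class and declare it the same as the foaf class with that label.
   // OntologyFactory.getFoafModel() is assumed to be a project helper that loads the foaf ontology.
   ontPerson.addLabel(label, null);
   ontPerson.addSameAs(OntologyFactory.getFoafModel().getOntClass(FOAF + label));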
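Employee:
   // Declare the Employee class; `model` is the same OntModel as above.
   OntClass ontEmployee = model.createClass(URI);

   // designation: a string-valued property of Employee.
   DatatypeProperty designation = model.createDatatypeProperty(DESIGNATION);
   designation.addDomain(ontEmployee);
   designation.addRange(XSD.xstring);

   // Label the class and make it a subclass of Person (rdfs:subClassOf).
   ontEmployee.addLabel(label, null);
   ontEmployee.addSuperClass(model.getResource(Person.URI));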

Inference

Given a vocabulary and a set of RDF statements, further RDF statements can be generated. RDFS/OWL has some built-in inference capability based on the vocabulary defined by ontologies.

SameAs inference:
The ‘owl:sameAs’ property declares that two RDF resources are the same. Using the ontology together with any RDF data, a reasoning engine can infer additional RDF statements about that data. Consider an RDF model for Person that uses the talentica ontology (‘https://www.talentica.com/ontology’):

and another that uses the foaf ontology (‘http://xmlns.com/foaf/0.1/’):
Since the Person ontology defined in the example above declares the ‘talentica:Person’ class to be the same as ‘foaf:Person’, the two RDF models can be merged into one:

The code snippet below shows how to set this up with the Jena API.

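// `ontology` is a Model holding the ontology (schema) statements.
Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
// Bind the ontology so that the reasoner applies its definitions to the data.
reasoner = reasoner.bindSchema(ontology);
// Statements added to `data` are seen, together with inferred ones, through `infmodel`.
Model data = ModelFactory.createDefaultModel();
InfModel infmodel = ModelFactory.createInfModel(reasoner, data);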
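
The effect of such a merge can be illustrated without a reasoner. The following is a toy plain-Java sketch, not Jena's algorithm: it rewrites every term to a single representative using a sameAs table and lets a set discard the duplicates. The prefixed names (ex:ram, tal:FN, foaf:nick) and the sample values are assumptions made for the example, and a real OWL reasoner adds entailed statements rather than rewriting terms.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy illustration of merging statements under sameAs; this is not how Jena's reasoner works.
class SameAsSketch {

    /** Follows sameAs links until a term with no further alias is reached (table must be acyclic). */
    static String canonical(String term, Map<String, String> sameAs) {
        String current = term;
        while (sameAs.containsKey(current)) {
            current = sameAs.get(current);
        }
        return current;
    }

    /** Rewrites every term to its canonical form; the set drops statements that become identical. */
    static Set<String> merge(String[][] triples, Map<String, String> sameAs) {
        Set<String> merged = new LinkedHashSet<>();
        for (String[] t : triples) {
            merged.add(canonical(t[0], sameAs) + " "
                    + canonical(t[1], sameAs) + " "
                    + canonical(t[2], sameAs));
        }
        return merged;
    }

    /** Hypothetical data: one person described with both the talentica and the foaf vocabulary. */
    static Set<String> demo() {
        Map<String, String> sameAs = new HashMap<>();
        sameAs.put("tal:Person", "foaf:Person");
        String[][] triples = {
            {"ex:ram", "rdf:type", "tal:Person"},
            {"ex:ram", "tal:FN", "\"Ram\""},
            {"ex:ram", "rdf:type", "foaf:Person"},
            {"ex:ram", "foaf:nick", "\"Ramu\""},
        };
        return merge(triples, sameAs);
    }

    public static void main(String[] args) {
        for (String statement : demo()) {
            System.out.println(statement);
        }
    }
}
```

The two type statements collapse into one, leaving a single description of the person that carries properties from both vocabularies.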

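Hierarchy inference:
Inference engines use ‘rdfs:subClassOf’ (and, for properties, ‘rdfs:subPropertyOf’) to infer further statements from a given set of statements. For example, if Employee is a subclass of Person, every resource of type Employee is also inferred to be of type Person.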
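
As a rough illustration of hierarchy inference, the plain-Java sketch below walks the subclass links upward to collect every class that a resource of a given class also belongs to. It assumes each class has at most one direct superclass and uses hypothetical prefixed names; it does not use Jena's reasoner.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy illustration of subclass inference; assumes a single direct superclass per class.
class SubClassSketch {

    /** Returns the class itself plus every ancestor reachable through subClassOf links. */
    static Set<String> withSuperClasses(String cls, Map<String, String> subClassOf) {
        Set<String> result = new LinkedHashSet<>();
        String current = cls;
        // Stop at the root, or when a class repeats (guards against cyclic declarations).
        while (current != null && result.add(current)) {
            current = subClassOf.get(current);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> subClassOf = new HashMap<>();
        subClassOf.put("tal:Employee", "tal:Person"); // Employee rdfs:subClassOf Person
        // A resource typed tal:Employee is inferred to have all of these types.
        System.out.println(withSuperClasses("tal:Employee", subClassOf));
    }
}
```
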
Rule-based inference:
Consider the blank node example given earlier:

With the rules:
1.    There exists only one Ferrari in the world
2.    Car has a range of Person
3.    Friend has domain Person
the statement below can be inferred:

SPARQL

SPARQL is a query language for RDF data stores, much as SQL is for an RDBMS. Because an RDF model consists of statements, a query is also structured as a statement pattern. Query structure:
PREFIX <ontology_prefix>: <ontology_url>
SELECT <variables> FROM <graph_name> WHERE { <subject> <predicate> <object> . }
Variable names begin with a question mark (?) or a dollar sign ($); a hash (#) starts a comment. Consider RDF data (person.rdf) containing Person data that conforms to the talentica (https://www.talentica.com/ontology) ontology.
 
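The queries below assume the prefix declaration PREFIX tal: <https://www.talentica.com/ontology/Person#>.
Find all statements about persons in the model:
SELECT * FROM <person.rdf> WHERE { ?s ?p ?o }
Find all persons whose first name is ‘Ram’:
SELECT * FROM <person.rdf> WHERE { ?s tal:FN "Ram" }
Find the last names of all persons:
SELECT ?lastname FROM <person.rdf> WHERE { ?s tal:LN ?lastname }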
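
Conceptually, each of these queries matches a statement pattern against every statement in the data and binds the variables wherever the pattern fits. A minimal plain-Java sketch of that matching step follows. It is not how Jena's query engine is implemented, it treats URIs and literals alike as plain strings, and the sample data (ex:p1, Kumar and so on) is invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of matching one triple pattern; not Jena's query engine.
class PatternMatchSketch {

    /** Matches one pattern against the data; pattern terms starting with '?' are variables. */
    static List<Map<String, String>> match(String[] pattern, List<String[]> data) {
        List<Map<String, String>> solutions = new ArrayList<>();
        for (String[] triple : data) {
            Map<String, String> binding = new LinkedHashMap<>();
            boolean ok = true;
            for (int i = 0; i < 3 && ok; i++) {
                if (pattern[i].startsWith("?")) {
                    // A variable binds to the term; a repeated variable must bind to the same term.
                    String previous = binding.put(pattern[i], triple[i]);
                    ok = previous == null || previous.equals(triple[i]);
                } else {
                    // A constant must be equal to the term in the statement.
                    ok = pattern[i].equals(triple[i]);
                }
            }
            if (ok) {
                solutions.add(binding);
            }
        }
        return solutions;
    }

    /** Hypothetical person data using the tal: prefix from the queries above. */
    static List<String[]> sampleData() {
        List<String[]> data = new ArrayList<>();
        data.add(new String[] {"ex:p1", "tal:FN", "Ram"});
        data.add(new String[] {"ex:p1", "tal:LN", "Kumar"});
        data.add(new String[] {"ex:p2", "tal:FN", "Shyam"});
        return data;
    }

    public static void main(String[] args) {
        // Corresponds to the pattern { ?s tal:FN "Ram" }.
        System.out.println(match(new String[] {"?s", "tal:FN", "Ram"}, sampleData()));
    }
}
```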
String Matching
String matching lets you supply regular expressions against which values are matched. For example:
Find all first names containing ‘s’ (ignoring case):
PREFIX tal: <https://www.talentica.com/ontology/Person#>

SELECT ?firstname FROM <person.rdf> WHERE {
    ?y tal:FN ?firstname .
    FILTER regex(?firstname, "s", "i")
}
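
The behaviour of this FILTER can be approximated in plain Java with java.util.regex, where the "i" flag corresponds to Pattern.CASE_INSENSITIVE and regex(...) corresponds to a find() anywhere in the value. This is only a sketch, under the assumption that the pattern is simple enough to mean the same in both regular expression dialects; the class name and the sample names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Toy illustration of FILTER regex(?firstname, "s", "i") applied to already-bound values.
class RegexFilterSketch {

    /** Keeps the values in which the pattern occurs anywhere, ignoring case. */
    static List<String> filter(List<String> values, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        List<String> kept = new ArrayList<>();
        for (String value : values) {
            // find() looks for a match anywhere in the value, not a match of the whole value.
            if (pattern.matcher(value).find()) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> firstNames = Arrays.asList("Ram", "Shyam", "Suresh", "Mohan");
        System.out.println(filter(firstNames, "s"));
    }
}
```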

Optionals

Just as NoSQL stores have no fixed schema, RDF has none either. Some resources may lack an attribute that others have. OPTIONAL can be used in such scenarios. Example query:
PREFIX tal: <https://www.talentica.com/ontology/Person#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?firstname ?nickname FROM <person.rdf> WHERE { ?y tal:FN ?firstname . OPTIONAL { ?y foaf:nick ?nickname } }
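
OPTIONAL behaves like a left join: every solution of the required pattern is kept, and the optional variable is simply left unbound when nothing matches. The plain-Java sketch below imitates this with two maps keyed by subject. It assumes one first name and at most one nickname per subject, and the class, method and data are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of OPTIONAL as a left join; null stands for an unbound variable.
class OptionalSketch {

    /** Returns one {firstname, nickname} row per subject that has a first name. */
    static List<String[]> firstNamesWithOptionalNick(Map<String, String> firstNames,
                                                     Map<String, String> nicks) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, String> entry : firstNames.entrySet()) {
            // The row is kept even when no nickname exists; get() then returns null.
            rows.add(new String[] {entry.getValue(), nicks.get(entry.getKey())});
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> firstNames = new LinkedHashMap<>();
        firstNames.put("ex:p1", "Ram");
        firstNames.put("ex:p2", "Shyam");
        Map<String, String> nicks = new HashMap<>();
        nicks.put("ex:p1", "Ramu"); // only the first person has a nickname
        for (String[] row : firstNamesWithOptionalNick(firstNames, nicks)) {
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```

Without OPTIONAL, the second person would drop out of the result entirely, because the nickname pattern would then be required.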

Index

1. Serialization can also be done in other formats such as N-Triples and Turtle.
2. foaf is an open ontology containing class definitions, their properties and relations in the social networking domain.
