(1) A detailed explanation of RDF, RDFS, and OWL for knowledge graph ontology modeling
1. Semantic Web System
The knowledge graph was introduced by Google in 2012. It is not a new concept; it derives from the semantic network. A semantic network consists of interconnected nodes and edges: nodes represent concepts or objects, and edges represent the relationships between them. RDF, RDFS, and OWL are knowledge representation frameworks of the Semantic Web. They constrain what nodes and edges may be and establish unified standards, which makes it easier to merge data from multiple sources. RDF, RDFS, and OWL all belong to the Semantic Web technology stack, and they allow the Semantic Web to overcome the shortcomings of semantic networks. The Semantic Web technology stack is shown below.
2.RDF expression
RDF (Resource Description Framework) is essentially a data model.
Specifically:
Resource: anything identified by a URI, such as a page, a picture, or a video.
Description: the attributes and characteristics of resources and the relationships between resources.
Framework: the model, language, and syntax for these descriptions.
RDF provides a unified standard for describing entities/resources. Formally, RDF data is a set of SPO triples (subject, predicate, object); a triple is sometimes called a statement. As shown in the figure below, nodes represent entities/resources and attribute values, and edges represent relationships between entities as well as relationships between entities and their attributes.
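The triple model described above can be sketched in a few lines of Python. This is only an illustration, not a real RDF library: the `Triple` type, the `describe` helper, and the `http://www.kg.com/` namespace are names chosen for this example, and a plain string stands in for both IRIs and literal values.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object (SPO)."""
    subject: str
    predicate: str
    obj: str  # either another resource's IRI or a literal value


KG = "http://www.kg.com/"  # illustrative namespace, assumed for this sketch

triples = [
    # entity -> attribute value (the object is a literal)
    Triple(KG + "person/1", KG + "ontology/career", "Football player"),
    # entity -> entity (the object is another resource)
    Triple(KG + "person/1", KG + "ontology/hasBirthPlace", KG + "place/10086"),
    Triple(KG + "place/10086", KG + "ontology/address", "Rio de Janeiro"),
]


def describe(subject, graph):
    """Collect {predicate: [objects]} for every statement about `subject`."""
    description = {}
    for triple in graph:
        if triple.subject == subject:
            description.setdefault(triple.predicate, []).append(triple.obj)
    return description


about_person = describe(KG + "person/1", triples)
for predicate, objects in about_person.items():
    print(predicate, "->", objects)
```

Each tuple is one edge of the graph: the subject and object are nodes, and the predicate labels the edge between them.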
3.RDF serialization method
To transmit or store RDF data, you need to serialize it. Current RDF serialization formats include RDF/XML, N-Triples, Turtle, RDFa, and JSON-LD.
- RDF/XML represents RDF data in XML. This format was proposed because XML technology is mature and there are many ready-made tools for storing and parsing XML. For RDF, however, XML is verbose and hard to read, so it is usually not the format of choice for working with RDF data.
- N-Triples represents an RDF data set as a list of triples, which is the most intuitive representation. Each line of the file holds one triple, which makes machine parsing and processing easy. The open-domain knowledge graph DBpedia usually publishes its data in this format.
- Turtle is probably the most widely used RDF serialization format. It is more compact than RDF/XML and more readable than N-Triples.
- RDFa, short for "Resource Description Framework in Attributes", extends HTML5 with additional attributes. Without changing how a page is displayed, it lets site builders mark up the entities on the page, such as people, places, times, and comments. In other words, by embedding RDF data in web pages, it lets search engines better analyze unstructured pages and extract useful structured information. To get a feel for RDFa, readers can compare the page as an ordinary user sees it, the page as the browser sees it, and the structured information a search engine parses out of it.
- JSON-LD, short for "JSON for Linking Data", stores RDF data as key-value pairs.
For example, Ronaldo's knowledge graph can be represented in N-Triples and Turtle as follows:
Example 1 N-Triples:
<http://www.kg.com/person/1> <http://www.kg.com/ontology/chineseName> "Ronaldo Luis Nazario de Lima"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://www.kg.com/person/1> <http://www.kg.com/ontology/career> "Football player"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://www.kg.com/person/1> <http://www.kg.com/ontology/fullName> "Ronaldo Luís Nazário de Lima"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://www.kg.com/person/1> <http://www.kg.com/ontology/birthDate> "1976-09-18"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://www.kg.com/person/1> <http://www.kg.com/ontology/height> "180"^^<http://www.w3.org/2001/XMLSchema#int> .
<http://www.kg.com/person/1> <http://www.kg.com/ontology/weight> "98"^^<http://www.w3.org/2001/XMLSchema#int> .
<http://www.kg.com/person/1> <http://www.kg.com/ontology/nationality> "Brazil"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://www.kg.com/person/1> <http://www.kg.com/ontology/hasBirthPlace> <http://www.kg.com/place/10086> .
<http://www.kg.com/place/10086> <http://www.kg.com/ontology/address> "Rio de Janeiro"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://www.kg.com/place/10086> <http://www.kg.com/ontology/coordinate> "-22.908333, -43.196389"^^<http://www.w3.org/2001/XMLSchema#string> .
In Turtle, prefixes are usually declared so that the IRIs in the RDF data can be abbreviated.
Example 2 Turtle:
@prefix person: <http://www.kg.com/person/> .
@prefix place: <http://www.kg.com/place/> .
@prefix : <http://www.kg.com/ontology/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
person:1 :chineseName "Ronaldo Luis Nazario de Lima"^^xsd:string .
person:1 :career "Football player"^^xsd:string .
person:1 :fullName "Ronaldo Luís Nazário de Lima"^^xsd:string .
person:1 :birthDate "1976-09-18"^^xsd:date .
person:1 :height "180"^^xsd:int .
person:1 :weight "98"^^xsd:int .
person:1 :nationality "Brazil"^^xsd:string .
person:1 :hasBirthPlace place:10086 .
place:10086 :address "Rio de Janeiro"^^xsd:string .
place:10086 :coordinate "-22.908333, -43.196389"^^xsd:string .
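To make the difference between the two formats concrete, the following Python sketch writes the same two triples once as N-Triples and once as Turtle. It is a simplified illustration under stated assumptions: `to_ntriples`, `to_turtle`, and the other helpers are hypothetical names, a literal is modeled as a `(value, datatype IRI)` pair, and only backslashes and double quotes are escaped. Real data should go through a proper RDF library.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
PREFIXES = {
    "person": "http://www.kg.com/person/",
    "place": "http://www.kg.com/place/",
    "": "http://www.kg.com/ontology/",
    "xsd": XSD,
}

# (subject IRI, predicate IRI, object); a literal object is a (value, datatype IRI) pair.
triples = [
    ("http://www.kg.com/person/1", "http://www.kg.com/ontology/height", ("180", XSD + "int")),
    ("http://www.kg.com/person/1", "http://www.kg.com/ontology/hasBirthPlace",
     "http://www.kg.com/place/10086"),
]


def escape(value):
    """Escape backslashes and double quotes (a deliberately minimal subset)."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def ntriples_term(term):
    """N-Triples spells out every IRI in full, including datatype IRIs."""
    if isinstance(term, tuple):
        value, datatype = term
        return f'"{escape(value)}"^^<{datatype}>'
    return f"<{term}>"


def to_ntriples(graph):
    return "\n".join(
        f"{ntriples_term(s)} {ntriples_term(p)} {ntriples_term(o)} ." for s, p, o in graph
    )


def shorten(iri):
    """Abbreviate an IRI with the longest matching prefix, else keep it in <...>."""
    for prefix, namespace in sorted(PREFIXES.items(), key=lambda kv: -len(kv[1])):
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return f"<{iri}>"


def turtle_term(term):
    if isinstance(term, tuple):
        value, datatype = term
        return f'"{escape(value)}"^^{shorten(datatype)}'
    return shorten(term)


def to_turtle(graph):
    header = [f"@prefix {prefix}: <{namespace}> ." for prefix, namespace in PREFIXES.items()]
    body = [f"{turtle_term(s)} {turtle_term(p)} {turtle_term(o)} ." for s, p, o in graph]
    return "\n".join(header + [""] + body)


print(to_ntriples(triples))
print(to_turtle(triples))
```

The N-Triples output repeats every IRI in full on each line, while the Turtle output pays for the prefix declarations once and then stays short, which is why Turtle is the easier of the two to read.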
4.RDFS - supplement to RDF
RDF has limited expressive power: it cannot distinguish classes from individuals, and it cannot define or describe the relationships/attributes of classes. RDF describes specific things; it lacks the means of abstraction needed to define and describe a whole category of things, so a schema has to be introduced. RDFS (Resource Description Framework Schema) supplements RDF and addresses this limitation.
There are several important and commonly used terms in RDFS:
- rdfs:Class. Used to define a class.
- rdfs:domain. Indicates which class a property belongs to.
- rdfs:range. Describes the value type of a property.
- rdfs:subClassOf. Declares the parent class of a class. For example, we can define an athlete class and declare that it is a subclass of the person class.
- rdfs:subPropertyOf. Declares the parent property of a property. For example, we can define a name property and declare that Chinese name and full name are subproperties of name.
RDFS is serialized in the same way as RDF; RDF/XML and Turtle are the common choices.
The following Turtle snippet describes the RDF data at the conceptual (schema) level only:
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://www.kg.com/ontology/> .
:Person rdf:type rdfs:Class .
:Place rdf:type rdfs:Class .
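What a schema buys you is inference. The sketch below applies three standard RDFS entailment patterns (subclass, domain, and range) to a tiny graph in plain Python. It is a minimal illustration rather than a complete RDFS reasoner; the `:Athlete` class and the domain and range declared for `:hasBirthPlace` are assumptions added for this example and are not part of the schema above.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

schema = {
    (":Person", RDF_TYPE, "rdfs:Class"),
    (":Place", RDF_TYPE, "rdfs:Class"),
    (":Athlete", SUBCLASS_OF, ":Person"),    # assumed for illustration
    (":hasBirthPlace", DOMAIN, ":Person"),   # assumed for illustration
    (":hasBirthPlace", RANGE, ":Place"),     # assumed for illustration
}

data = {
    ("person:1", RDF_TYPE, ":Athlete"),
    ("person:1", ":hasBirthPlace", "place:10086"),
}


def rdfs_closure(triples):
    """Apply the subclass, domain, and range rules until nothing new is derived."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            for s2, p2, o2 in graph:
                if p == SUBCLASS_OF and p2 == SUBCLASS_OF and s2 == o:
                    new.add((s, SUBCLASS_OF, o2))   # subClassOf is transitive
                if p == SUBCLASS_OF and p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))      # instances inherit the parent class
                if p == DOMAIN and p2 == s:
                    new.add((s2, RDF_TYPE, o))      # the subject gets the domain class
                if p == RANGE and p2 == s:
                    new.add((o2, RDF_TYPE, o))      # the object gets the range class
        if new <= graph:
            return graph
        graph |= new


inferred = rdfs_closure(schema | data)
for triple in sorted(inferred - schema - data):
    print(triple)
```

Under these assumptions, the closure adds that `person:1` is a `:Person` (derivable both from the subclass declaration and from the domain of `:hasBirthPlace`) and that `place:10086` is a `:Place`, even though neither statement was asserted in the data.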
5.OWL - further extension of RDFS
RDF(S) can express some simple semantics, but in more complex scenarios its expressive power is too weak, and it lacks many commonly needed features. These include locally scoped range restrictions on properties; equivalence of classes, properties, and individuals; disjoint classes; cardinality constraints; and property characteristics. The W3C therefore proposed OWL, which extends RDF(S), as the recommended language for representing ontologies on the Semantic Web. OWL can be seen as an extension of RDFS that adds further predefined vocabulary.
Compared with RDFS, OWL introduces Boolean operators (intersection, union, complement) that can be applied recursively to build complex classes. It can also express existential restrictions (some values from), universal restrictions (all values from), and cardinality restrictions on property values. In addition, OWL can declare property characteristics such as transitivity, symmetry, and functionality. It can also state that two classes are equivalent or disjoint, that two properties are equivalent or inverses of each other, and that two individuals are the same or different, and it can define enumerated classes.
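Property characteristics are the easiest of these features to illustrate. The Python sketch below derives new facts from `owl:TransitiveProperty`, `owl:SymmetricProperty`, and `owl:inverseOf` declarations. It covers only these three constructs and is not an OWL reasoner; apart from `:hasBirthPlace`, the properties and individuals (`:locatedIn`, `:hasFriend`, `:isBirthPlaceOf`, `place:brazil`, `place:southAmerica`, `person:2`) are hypothetical names made up for the example.

```python
RDF_TYPE = "rdf:type"
TRANSITIVE = "owl:TransitiveProperty"
SYMMETRIC = "owl:SymmetricProperty"
INVERSE_OF = "owl:inverseOf"

# Hypothetical ontology statements, used only for this illustration.
ontology = {
    (":locatedIn", RDF_TYPE, TRANSITIVE),
    (":hasFriend", RDF_TYPE, SYMMETRIC),
    (":hasBirthPlace", INVERSE_OF, ":isBirthPlaceOf"),
}

facts = {
    ("place:10086", ":locatedIn", "place:brazil"),
    ("place:brazil", ":locatedIn", "place:southAmerica"),
    ("person:1", ":hasFriend", "person:2"),
    ("person:1", ":hasBirthPlace", "place:10086"),
}


def owl_closure(ontology, facts):
    """Expand `facts` using the declared property characteristics."""
    transitive = {s for s, p, o in ontology if p == RDF_TYPE and o == TRANSITIVE}
    symmetric = {s for s, p, o in ontology if p == RDF_TYPE and o == SYMMETRIC}
    inverse = {}
    for s, p, o in ontology:
        if p == INVERSE_OF:
            inverse[s] = o
            inverse[o] = s  # inverseOf holds in both directions

    graph = set(facts)
    while True:
        new = set()
        for s, p, o in graph:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse:
                new.add((o, inverse[p], s))
            if p in transitive:
                for s2, p2, o2 in graph:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= graph:
            return graph
        graph |= new


inferred = owl_closure(ontology, facts)
for triple in sorted(inferred - facts):
    print(triple)
```

None of the three derived facts (the transitive `:locatedIn` link, the reversed `:hasFriend` link, and the `:isBirthPlaceOf` link) can be obtained with RDFS vocabulary alone, which is the kind of gap OWL is meant to fill.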
6.SPARQL
Relational databases are generally queried with SQL. RDF data is stored as a graph, and its query language is SPARQL.
The core of a SPARQL query is a description of a set of variables and the relationships among them, which together form a graph pattern with variables. As with SQL, a SPARQL query can return multiple results. Each result contains a binding for each of these variables, that is, a mapping from the variable to an RDF term; RDF terms can be URIs, blank nodes, or literals. For each result, replacing the variables in the query's graph pattern with the RDF terms they are bound to must yield a subgraph of the RDF graph being queried; this is what it means for the graph to match the pattern. An RDF query language mainly defines the syntax and semantics for expressing such graph patterns.
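The matching process described above can be sketched in plain Python. This is a simplified model of basic graph pattern matching, not a SPARQL engine: `query` and `match_pattern` are hypothetical helper names, terms are plain strings, and any term starting with `?` is treated as a variable. The comment at the top shows the roughly equivalent SPARQL query.

```python
# Roughly equivalent SPARQL:
#   PREFIX : <http://www.kg.com/ontology/>
#   SELECT ?p ?place ?address
#   WHERE { ?p :hasBirthPlace ?place . ?place :address ?address . }

graph = [
    ("person:1", ":fullName", "Ronaldo Luís Nazário de Lima"),
    ("person:1", ":hasBirthPlace", "place:10086"),
    ("place:10086", ":address", "Rio de Janeiro"),
]

patterns = [
    ("?p", ":hasBirthPlace", "?place"),
    ("?place", ":address", "?address"),
]


def is_var(term):
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None if impossible."""
    result = dict(binding)
    for pattern_term, value in zip(pattern, triple):
        if is_var(pattern_term):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(pattern_term, value) != value:
                return None
        elif pattern_term != value:
            return None
    return result


def query(graph, patterns):
    """Return every binding under which all patterns match triples in the graph."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


results = query(graph, patterns)
for row in results:
    print(row)
```

Substituting a result's bindings back into the two patterns yields two triples that are both present in the graph, which is exactly the subgraph condition stated above. The shared variable `?place` is what joins the two patterns.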