This session is about managing RDF data programmatically. We will set up an RDF database (also called a triplestore), convert existing non-RDF data into RDF programmatically, then load it into the triplestore.
These instructions assume that you are programming in Java, preferably with Eclipse, using the Apache Jena libraries. You may also use RDF4J in Java, RDFLib in Python, the Redland RDF library in C, dotNetRDF in C♯, EasyRDF for PHP, N3.js for JavaScript, Ruby RDF for Ruby, the SWI-Prolog Semantic Web Library, etc.
These operations should get you started with Apache Jena and Eclipse. With a different Java IDE, the only difference will be the initial settings for a Maven project. If you are using a different library, refer to its documentation.
File -> New -> Java Project....
Next >.
Next >.
fr.emse.master. In the Artifact's Artifact Id, write semweb. Click Finish.
pom.xml. Double click on this file.
pom.xml. If you used a different groupId or artifactId, change it accordingly.

Now you will generate RDF data from non-RDF sources. Read the Jena tutorial to familiarise yourself with the API and learn how to generate an RDF graph programmatically. Once you are done with the tutorial, follow the instructions below.
stops.txt in the dataset you downloaded. It just describes the names and locations of train stations. The relevant columns are stop_id, stop_name, stop_lat and stop_lon.
Each stop should be an instance (rdf:type) of the class http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing, usually abbreviated as geo:SpatialThing. The WGS84 Geo Positioning vocabulary also provides RDF properties for latitude (geo:lat) and longitude (geo:long). Generate an IRI for each stop based on its stop_id.

@prefix ex: <http://www.example.com/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:StopArea:OCE80194035 a geo:SpatialThing;
    rdfs:label "gare de Neustadt (Weinstr) Hbf"@fr;
    geo:lat "49.35006155"^^xsd:decimal;
    geo:long "8.14067588"^^xsd:decimal .
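To make the mapping from one stops.txt row to the Turtle above concrete, here is a minimal sketch that deliberately does not use Jena: it only formats the four relevant values as Turtle text. The class name StopToTurtle, its methods and the EX constant are hypothetical, and the sketch assumes that stop_id contains only characters that are legal in an IRI and that the name is in French, as in the example. With Jena, the same values would go into a Model instead of a string.

```java
import java.math.BigDecimal;

/** Hypothetical helper: formats one row of stops.txt as a Turtle description. */
class StopToTurtle {
    // Assumed namespace, taken from the ex: prefix of the example above.
    static final String EX = "http://www.example.com/";

    /** Escapes backslashes and double quotes for a Turtle string literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /**
     * Returns the Turtle for one stop. The caller is expected to emit the
     * @prefix declarations for geo:, rdfs: and xsd: once, before the stops.
     */
    static String toTurtle(String stopId, String stopName, String lat, String lon) {
        // new BigDecimal(...) throws NumberFormatException on invalid input;
        // toPlainString() avoids exponents, which xsd:decimal does not allow.
        String latitude = new BigDecimal(lat.trim()).toPlainString();
        String longitude = new BigDecimal(lon.trim()).toPlainString();
        return "<" + EX + stopId + "> a geo:SpatialThing ;\n"
             + "    rdfs:label \"" + escape(stopName) + "\"@fr ;\n"
             + "    geo:lat \"" + latitude + "\"^^xsd:decimal ;\n"
             + "    geo:long \"" + longitude + "\"^^xsd:decimal .\n";
    }
}
```

The full IRI is written between angle brackets rather than as a prefixed name, so that no escaping rules for local names have to be considered. Reading the rows themselves is left out: stops.txt is a CSV file whose fields may be quoted, so a real CSV parser is safer than splitting on commas.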
There are many triplestores. The simplest to set up is probably Fuseki.
fuseki-server.bat for Windows systems, fuseki-server for Unix-based systems. Execute it. The server will be running in the background.
http://localhost:3030. This interface allows you to manage your data.
sncf and make it persistent.

In the exercise of the first part, you can generate all the data at once in a large Jena Model and serialise it as RDF, or you can fill a triplestore little by little. If you want to add data to a triplestore such as Jena Fuseki, you can send update queries like this:
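import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdfconnection.RDFConnection;
import org.apache.jena.rdfconnection.RDFConnectionFactory;

Model model = ModelFactory.createDefaultModel();
// ... build the model
String datasetURL = "http://localhost:3030/sncf";
String sparqlEndpoint = datasetURL + "/sparql"; // SPARQL query service
String sparqlUpdate = datasetURL + "/update";   // SPARQL update service
String graphStore = datasetURL + "/data";       // Graph Store Protocol service
RDFConnection conn = RDFConnectionFactory.connect(sparqlEndpoint, sparqlUpdate, graphStore);
conn.load(model); // add the content of model to the triplestore
conn.update("INSERT DATA { <test> a <TestClass> }"); // add the triple to the triplestore
conn.close(); // release the connection when you are done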
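Filling the triplestore "little by little" can be done by grouping triples into several INSERT DATA requests instead of one very large one. The following is a minimal sketch of that idea using only the Java standard library. UpdateBatcher and toUpdates are hypothetical names, and the sketch assumes that each triple is already written in N-Triples style (full IRIs, terminated by " ."). Each returned string could then be passed to conn.update(...).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups triples into several SPARQL INSERT DATA requests. */
class UpdateBatcher {
    /**
     * Splits the triples into consecutive batches of at most batchSize
     * and wraps each batch in one INSERT DATA { ... } update string.
     */
    static List<String> toUpdates(List<String> triples, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> updates = new ArrayList<>();
        for (int start = 0; start < triples.size(); start += batchSize) {
            int end = Math.min(start + batchSize, triples.size());
            StringBuilder sb = new StringBuilder("INSERT DATA {\n");
            for (String triple : triples.subList(start, end)) {
                sb.append("  ").append(triple).append('\n');
            }
            updates.add(sb.append("}").toString());
        }
        return updates;
    }
}
```

The batch size is a trade-off: small batches mean more HTTP round trips, large ones mean bigger requests. A suitable value depends on your data and server.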
In this part, you will link your new dataset to Wikidata. To do that, you will query the SPARQL endpoint of Wikidata in your program. In Jena, you can send queries like this:
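import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.ResultSet;

String wdEndpoint = "https://query.wikidata.org/sparql";
RDFConnection wdConn = RDFConnectionFactory.connect(wdEndpoint);
QueryExecution qe = wdConn.query("SELECT ?s WHERE { ?s a <TestClass> }");
ResultSet res = qe.execSelect(); // iterate with res.hasNext() and res.next()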
Refer to the Jena tutorial on RDF connections for more details.
https://query.wikidata.org/sparql.
geo:SpatialThing) you generated in previous parts. With SPARQL, compare the label of geographical entities with the label of French communes.
ex:StopArea:OCE80143503 wdt:location wd:Q2833.

As before, you can store the collected information in a Jena Model (and then serialize it in Turtle) or insert it little by little in a local triplestore.
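One possible way to compare labels with SPARQL is to ask Wikidata for the commune that carries a given French label. The sketch below only builds the query string. CommuneQuery and forLabel are hypothetical names, and it rests on two assumptions: that communes are found through the direct "instance of" property (wdt:P31) with the class wd:Q484170 (commune of France), and that the label contains no line breaks. The prefixes are declared explicitly because Jena parses the query locally before sending it.

```java
/** Hypothetical helper: builds a Wikidata query for a French commune label. */
class CommuneQuery {
    static final String PREFIXES =
          "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        + "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        + "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n";

    /** Escapes backslashes and double quotes for a SPARQL string literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Returns a SELECT query for communes whose French label equals the given label. */
    static String forLabel(String label) {
        return PREFIXES
             + "SELECT ?commune WHERE {\n"
             + "  ?commune wdt:P31 wd:Q484170 ;\n"          // assumption: direct instances only
             + "           rdfs:label \"" + escape(label) + "\"@fr .\n"
             + "}";
    }
}
```

The resulting string can be passed to wdConn.query(...). Station names such as "gare de Neustadt (Weinstr) Hbf" will rarely equal a commune label exactly, so some normalisation of the name before the lookup is still up to you.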
To interact programmatically with a Linked Data Platform (as in practical session 3), you need to rely on an HTTP library in your programming language. You may use the Apache HTTP Client in Java (which is also a Jena dependency), or urllib in Python, etc. Instead of using cURL, you can send POST requests with appropriate Turtle payloads through the library's programming interface.
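As an illustration, here is a minimal sketch of such a POST request. It uses the java.net.http client that ships with Java 11 and later rather than the Apache HTTP Client, simply to avoid an extra dependency. LdpPost and turtlePost are hypothetical names, and the container URL is whatever your platform exposes. The Slug header, which suggests a name for the new resource, is an assumption about what your platform accepts.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: prepares a POST request carrying a Turtle payload. */
class LdpPost {
    /**
     * Builds (but does not send) a POST request to the given container URL.
     * The slug is only a naming hint; the server decides the final IRI.
     */
    static HttpRequest turtlePost(String containerUrl, String slug, String turtle) {
        return HttpRequest.newBuilder(URI.create(containerUrl))
                .header("Content-Type", "text/turtle")
                .header("Slug", slug)
                .POST(HttpRequest.BodyPublishers.ofString(turtle, StandardCharsets.UTF_8))
                .build();
    }
}
```

To send the request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and inspect the status code and the Location header of the response.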
Write a program that reproduces the steps of practical session 3 (Publishing data on a Linked Data Platform).