This session is about publishing structured data on the Web.
In this part, you will use a command line tool called cURL. It is a tool that every computer scientist should have and know how to use. If you are already familiar with cURL, you can jump to the next section.
If you do not have it already, download cURL and put it in a folder you will remember. On Linux, cURL is available as a package in most distributions. Use your package manager (such as apt on Debian-based distributions, or Homebrew on macOS) to get it.
You may need to update the PATH variable in your system environment configuration. On MS Windows, press Windows key + R, type SystemPropertiesAdvanced, then click Environment Variables... Then find the variable Path or PATH in the user or system variables. Edit it and add the path to the folder where you put cURL.
We will first learn the basics of cURL, then use it to understand how Linked Data principles and best practices are implemented.
1. Type http://mines-stetienne.fr in the address bar of your browser. Notice what is happening. We are going to compare this to what cURL does.
2. Run curl -V to check that cURL is working. If not, go back to the previous steps.
3. Run curl http://mines-stetienne.fr and look at the result. cURL displays the payload (that is, the “body”) of the HTTP response. In this case, it is an HTML document saying that the document was moved to https://mines-stetienne.fr.
4. Run curl https://mines-stetienne.fr and look at the result. It should be empty. We need to figure out what is happening.
5. Run curl -I https://mines-stetienne.fr and look at the result. -I sends a HEAD request, so cURL displays only the headers of the HTTP response, not the payload. We see that the resource with URI https://mines-stetienne.fr was found at another location, and the Location header tells us where to find it.
6. Run curl https://www.mines-stetienne.fr/ and look at the result. This time, you get a web page. This is the HTML code of the page you see in your browser.

The HTTP response codes 301 Moved Permanently and 302 Found are commonly called “redirects”. Your browser directly displays the Web page because it is “following” the redirects. You can check that the URL in the address bar of your browser is https://www.mines-stetienne.fr/. The browser stops following redirects when it receives a 200 OK: it means that the resource you requested (namely, https://www.mines-stetienne.fr/) has been found and is this file, which is an information resource.
Information resources are text documents, images, hypermedia documents or any other resource that can be fully represented in a digital format. Non-information resources, such as humans, physical objects and abstract concepts, cannot have a direct representation on the Web, even though they can be identified by a URL. Redirection is an important mechanism to find information (through a 200 OK response) about non-information resources.
You can follow redirects with cURL, using the option -L. Check this: curl -L http://mines-stetienne.fr. You can also see what the server is responding at each step of the negotiation by adding -I. You can get even more details about the requests and responses by further adding -v or --verbose.
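To make the redirect chain easier to read, the output of curl -I -L can be filtered down to status lines and Location headers. The following is a minimal sketch; show_redirects is a hypothetical helper name introduced here for illustration, not a cURL feature.

```shell
# Keep only the status lines and Location headers from HTTP response
# headers read on standard input.
# show_redirects is a hypothetical helper name, not part of cURL.
show_redirects() {
  # HTTP headers end with CRLF; drop the CR before filtering.
  tr -d '\r' | grep -i -E '^(HTTP/|location:)'
}
```

With network access, it could be used as: curl -sIL http://mines-stetienne.fr | show_redirects (the -s option only hides the progress meter).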
We will use cURL on DBpedia to see how Linked Data can be accessed via HTTP.
1. Look up the URI http://dbpedia.org/resource/Tim_Berners-Lee. This operation is called dereferencing. Use cURL and see what URIs must be requested, in order, to reach a final representation. You may have to use the option -k when requesting https URIs, depending on your system configuration. Take note of all the URIs in a text file that you will send at the end of the session.
2. Run curl -H "Accept: text/turtle" http://dbpedia.org/resource/Tim_Berners-Lee. Use -H "Accept: text/turtle" on all necessary requests to reach a 200 OK and get some data. Write the list of URIs that have been requested along the way.
3. […] http://dbpedia.org/resource/Tim_Berners-Lee and write it in your answer file.

If a server adheres to the fourth Linked Data principle, it should include hyperlinks from one data set to another, containing at least the URI of that (unknown) data set and, optionally, a relation type to characterize the link. Every RDF triple whose subject and object are URI nodes is in fact a hyperlink, which is why RDF is the default Linked Data format.
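Once the individual hops have been inspected by hand, content negotiation and redirect following can be combined in one command. The sketch below assumes that cURL resends the custom Accept header on each redirected request when -L is used; deref_turtle is a hypothetical helper name.

```shell
# Dereference a URI, asking for a Turtle representation and following
# redirects until a final response is reached.
# deref_turtle is a hypothetical helper name, not part of cURL.
deref_turtle() {
  curl -sL -H "Accept: text/turtle" "$1"
}
```

For example, deref_turtle http://dbpedia.org/resource/Tim_Berners-Lee should print the Turtle document, if the server supports that media type.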
A tool that can help you navigate through RDF data is RDF Browser, a Firefox extension that shows RDF in the browser whenever it is available by content negotiation. If you have Firefox, you can install and try this extension.
As an alternative, you can also use Postman. Postman is comparable to cURL, except that it has a graphical interface that facilitates navigation (among other things). All links that appear in a response body will be clickable. You will have to manually add Accept headers for content negotiation, though.
In DBpedia, Mines Saint-Étienne's data are linked to many other data sets. Using the RDF Browser, Postman or cURL (in this order of preference), find a path starting from https://dbpedia.org/resource/%C3%89cole_nationale_sup%C3%A9rieure_des_mines_de_Saint-%C3%89tienne and leading to Jean Monnet University. Hint: the shortest path between the two goes through the DBpedia data set about Saint-Étienne (the city). In total, how many HTTP requests did you have to send? Write down the answer in the text file.
Your goal in this part is to publish the RDF data you have modeled in previous assignments. You will publish it on a platform, following Linked Data principles, such that you will be able to link your data to other people's data. The prerequisite for this part is that you have an RDF graph written in some Turtle file. This file will be referred to as data.ttl in the following; adapt instructions if needed.
To communicate with the platform, you will need cURL or Postman. The main cURL options you should know are the following:
Option | Description
-X | to indicate the request method, e.g. GET, POST, PUT or DELETE
-H | to add headers, such as Accept, to the request
-i | to include the response headers in the output
--data-binary @<filename> | to specify the body of the request (beware of the 'double dash' and 'at' characters)
-u <username>:<password> | to specify the username and password for websites that are password-protected (replace <username> and <password> with the username and password given in class)
Postman's interface should be intuitive enough for those who have used cURL once. Once your communication tool is set up, you are ready to publish your RDF data.
Send a GET request to https://semweb-emse.solidcommunity.net/public/sandbox/. Make sure you obtain an RDF representation and not an HTML representation.
The resource you are seeing is a container; it contains other resources, in the same way a directory contains other files. Note the ldp:contains predicate.
This container follows the Linked Data Platform (LDP) standard. After having read section 1 of the LDP primer, look at what resources are contained in the container. List them in the text file.
To add resources to a container, the LDP standard prescribes sending a POST request to it.
The body of the request must include an RDF graph, in one of the known RDF formats (e.g. Turtle).
Optionally, you can suggest a name for the created resource by adding a Slug header (use Slug: <your name>-data in today's assignment).
Use cURL to add data.ttl to the container.
Make sure your request includes the appropriate headers: Content-Type: text/turtle and (optionally) Slug: <your name>-data.
Then, look at the response status code:
- 201: the operation has succeeded. The exact URI of the created resource is included in the server's response via the Location header. Write in the text file the URI of your new resource.
- 400: your request was not appropriate. Try with different arguments. The server provides you with an error message that may help you understand the problem.
- 500: something went wrong on the server’s side. Use your acting skills to get the lecturer’s attention.
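The POST request described above can be sketched as a small shell function. This is only an illustration under assumptions: publish_turtle is a hypothetical helper name, and it assumes the sandbox accepts the credentials given in class through -u.

```shell
# publish_turtle FILE SLUG USER:PASSWORD
# POST a Turtle file to the sandbox container and print the response
# headers (-i), so the status code and Location header are visible.
# publish_turtle is a hypothetical helper name.
publish_turtle() {
  curl -i -X POST \
    -u "$3" \
    -H "Content-Type: text/turtle" \
    -H "Slug: $2" \
    --data-binary "@$1" \
    https://semweb-emse.solidcommunity.net/public/sandbox/
}
```

A call would look like publish_turtle data.ttl "<your name>-data" "<username>:<password>", with the placeholders replaced by your own values.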
Navigate back to the resource container: https://semweb-emse.solidcommunity.net/public/sandbox/. You should now see a link pointing to the resource you created (as an RDF triple with the ldp:contains predicate).
The LDP standard also specifies how to update and remove a resource from a container with PUT and DELETE requests, respectively.
The body of the PUT request must include an RDF graph.
The existing RDF representation of the resource will then be replaced by the provided RDF graph.
The DELETE request has no body.
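The two requests can be sketched as follows. The helper names replace_resource and delete_resource are hypothetical, and the use of -u assumes the platform expects the credentials given in class.

```shell
# replace_resource FILE URI USER:PASSWORD
# PUT: replace the RDF representation of URI with the content of FILE.
replace_resource() {
  curl -i -X PUT \
    -u "$3" \
    -H "Content-Type: text/turtle" \
    --data-binary "@$1" \
    "$2"
}

# delete_resource URI USER:PASSWORD
# DELETE: remove the resource; the request carries no body.
delete_resource() {
  curl -i -X DELETE -u "$2" "$1"
}
```

In both cases, the URI argument is the one returned in the Location header when the resource was created.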
Remove the resource you have created and navigate back to the resource container. What effect did the deletion have on the container? Write down the answer in the text file.
Containers can contain other containers, as would folders in any file system. To create a container, a Link header should be added to the POST request to tell the server the desired type for the created resource. For the simplest type of containers, your cURL command should include -H "Link: <http://www.w3.org/ns/ldp#BasicContainer>; rel=\"type\"" (note that quotes within the header field must be escaped). When creating a container, the body of your request can be empty but you should still provide a Content-Type header. Create your own container with the URI https://semweb-emse.solidcommunity.net/public/sandbox/<your name>. Then, navigate to the created resource and inspect its RDF representation. You should see that the server automatically added some triples. Write these triples in the text file.
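A possible cURL command for creating a container is sketched below. It makes a few assumptions: create_container is a hypothetical helper name, the container name is suggested through a Slug header, and an explicit empty body is sent with --data-binary "". Writing the Link header in single quotes avoids having to escape the inner double quotes.

```shell
# create_container NAME USER:PASSWORD
# POST an empty body to the sandbox, asking the server to create a
# basic container. create_container is a hypothetical helper name.
create_container() {
  curl -i -X POST \
    -u "$2" \
    -H "Content-Type: text/turtle" \
    -H "Slug: $1" \
    -H 'Link: <http://www.w3.org/ns/ldp#BasicContainer>; rel="type"' \
    --data-binary "" \
    https://semweb-emse.solidcommunity.net/public/sandbox/
}
```

As with any POST to a container, the Location header of the response indicates the URI the server actually assigned.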
In the following exercises, you will create resources inside the container you have just created instead of the parent container (/public/sandbox).
Linked Data allows anyone to publish data linking to your own data, as explained in the LDP primer. Your data should include an entity for the SemWeb lecture and another entity for yourself. Can your classmates unambiguously identify these two entities with URIs? (If not, reconsider the way you identified entities in your data).
When discovering information about you, other Web agents might not want to have information about the SemWeb lectures. It is common practice to describe only one entity per document. Split data.ttl into several Turtle files: for each entity described in data.ttl, there should be a file <entity>.ttl that includes triples whose subject is the entity. Make sure the entities you describe are under the namespace <#> (you may change the prefix, though).
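As an illustration, one of the resulting files could look like the sketch below. Everything in it is hypothetical: the file name me.ttl, the entity :me, the person's name and the choice of the FOAF vocabulary are placeholders to adapt to your own data. Because the prefix is bound to the relative IRI <#>, the entity's full URI is derived from the URL at which the file is published.

```shell
# Write a one-entity Turtle file (hypothetical content, adapt to your data).
cat > me.ttl <<'EOF'
@prefix : <#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# The only subject described in this document is :me
:me a foaf:Person ;
    foaf:name "Alice Example" .
EOF
```

Published at .../sandbox/<your name>/me, the entity :me would then be identified by .../sandbox/<your name>/me#me.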
Publish each of the files you've created on the platform. For each file <entity>.ttl, you should create the resource https://semweb-emse.solidcommunity.net/public/sandbox/<your name>/<entity>. Make sure you're not using the same container as your classmates!
Now, the URIs that identify your entities should all be dereferenceable and you adhere to the third Linked Data principle: if someone sends a GET request (with RDF Browser, Postman or cURL) to https://semweb-emse.solidcommunity.net/public/sandbox/<your name>/<entity>#something, the platform should respond with triples about https://semweb-emse.solidcommunity.net/public/sandbox/<your name>/<entity>#something.
profile:me; adapt instructions to the identifier you chose). You'll now use the LDP feature that allows you to edit your data. Download a Turtle representation of profile:me and edit the resulting file by adding a triple stating that you know LDP: profile:me foaf:topic_interest db:Linked_Data_Platform. Re-upload the edited Turtle file using a PUT request to profile:me (remember that this short form in fact corresponds to a full URI: use the full URI in your request). What response status do you get? Write down the answer in the text file.
ETag response header. Send a GET request to profile:me to learn that resource's ETag (<some ETag>). You can now re-send your PUT request and indicate to the platform that you wish to update the resource only if the ETag you provide matches the ETag the platform has. To that end, add the header If-Match: <some ETag> to your PUT request. What response status do you now get?
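The GET-then-PUT sequence with If-Match can be sketched in shell as follows. This is one possible way to script it, not the required one: extract_etag and conditional_put are hypothetical helper names, and the -u option assumes the credentials given in class.

```shell
# Print the value of the first ETag header found in HTTP response
# headers read on standard input (hypothetical helper).
extract_etag() {
  tr -d '\r' | grep -i '^etag:' | head -n 1 | sed 's/^[^:]*:[[:space:]]*//'
}

# conditional_put FILE URI USER:PASSWORD
# GET the resource to learn its ETag, then PUT the file only if the
# ETag still matches on the server (hypothetical helper).
conditional_put() {
  # -D - writes the response headers to stdout; -o /dev/null drops the body.
  etag=$(curl -s -u "$3" -H "Accept: text/turtle" -D - -o /dev/null "$2" | extract_etag)
  if [ -z "$etag" ]; then
    echo "no ETag returned by the server" >&2
    return 1
  fi
  curl -i -X PUT \
    -u "$3" \
    -H "Content-Type: text/turtle" \
    -H "If-Match: $etag" \
    --data-binary "@$1" \
    "$2"
}
```

If the resource changed between the GET and the PUT, the ETag no longer matches and the server is expected to refuse the update.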
semweb). You can now link your resources and theirs by adding RDF triples. Add foaf:knows triples to your favourite classmates in your description. Make sure the URIs they chose for their resources are dereferenceable.
If your browser is Firefox, use the RDF Browser extension in developer mode (accessible in the extension's settings, by clicking on …

Thanks to Victor Charpenay for the part on Linked Data Platforms.