Publishing Web data

Publishing RDF on the Web

Use case: I want to publish my personal profile in RDF, with my name, affiliation, interests, education, professional relationships, etc.
Simple conceptual model but...
- what IRI should I use (for myself, my company, etc.)?
- what properties?
- where do I put the data?
- how do I make the data easily usable?
- ...
See also: Best Practices for Publishing Linked Data – W3C Note 9 January 2014


Linked Data Principles

Use URIs as names for things
Use HTTP URIs so that people can look up those names
When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
Include links to other URIs, so that they can discover more things

See: Linked Data. Tim Berners-Lee’s design issues. July 2006 (revised June 2009)

See also: Tim Berners-Lee’s TED talk. Feb. 2009


Linked Data Principles

Use IRIs as names for things
Use HTTP IRIs so that people can look up those names
When someone looks up an IRI, provide useful information, using the standards (RDF*, SPARQL)
Include links to other IRIs, so that they can discover more things


Data should go FAIR!

To maximise the (re)usability of data, they should be:
- Findable
- Accessible
- Interoperable
- Reusable
See: The FAIR Guiding Principles for data management and stewardship
There is a worldwide initiative, supported by many governments and public agencies, to push towards FAIR data: GO FAIR
Linked Data and Semantic Web technologies provide a way to implement FAIR principles


Dereferencing

Dereferencing: the operation of using an IRI as a URL to retrieve whatever document is accessible at that URL
Corresponds to issuing a GET request in HTTP, with the URL stripped of any fragment identifier
An IRI is dereferenceable if it can be used in an HTTP GET request to access a document
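
As an illustration of the definition above, here is a minimal Python sketch that builds (without sending) the GET request used to dereference an IRI. The function name `dereference_request` and the default `Accept` value are assumptions made for this example; the IRI is the one used on the hash IRI slide.

```python
from urllib.parse import urldefrag
from urllib.request import Request


def dereference_request(iri: str, accept: str = "text/turtle") -> Request:
    """Build, but do not send, the GET request that dereferences `iri`."""
    # The fragment identifier never travels over HTTP, so drop it first.
    url, _fragment = urldefrag(iri)
    return Request(url, headers={"Accept": accept}, method="GET")


req = dereference_request("http://danbri.org/foaf#danbri")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; it is left out here so that the sketch runs without network access.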


What do HTTP URIs identify?

Rule of thumb: if a URL directly locates a document, then the URL must identify that document

How do we identify things that are not documents (physical objects, people, ideas, etc.)?
- Non-HTTP URIs? → breaks rule 2 of Linked Data
- HTTP URIs that do not locate documents (e.g., that return 404)? → breaks rule 3 of Linked Data


W3C Technical Architecture Group advice

If the server returns 200 OK to an IRI lookup, then the IRI must denote an information resource (≈ a Web document)
Otherwise, the IRI may denote anything
Advice: to identify non-information resources, use either “hash IRIs” or [303-redirected] “slash IRIs”

Warning: this is a controversial decision of the TAG; discussions on this issue have occasionally resurfaced on mailing lists since 2002!
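
One possible reading of this advice, written as a small Python function. The name `may_denote` and the returned labels are made up for this sketch; it only encodes the two rules stated on this slide plus the hash IRI case.

```python
from urllib.parse import urldefrag


def may_denote(iri: str, status: int) -> str:
    """Say what `iri` may denote, given the HTTP status returned by its lookup."""
    _url, fragment = urldefrag(iri)
    if fragment:
        # Hash IRI: the server only ever sees the fragment-less URL,
        # so the status code says nothing about the hash IRI itself.
        return "anything"
    if status == 200:
        return "information resource"
    # 303 See Other, 404, etc.: no constraint on what the IRI denotes.
    return "anything"


print(may_denote("http://dbpedia.org/page/Semantic_Web", 200))
print(may_denote("http://dbpedia.org/resource/Semantic_Web", 303))
```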


Slash IRIs (1)

A slash IRI is an IRI with a ‘/’ followed by a local name:

                http://dbpedia.org/resource/Semantic_Web

issue a GET request:

                GET /resource/Semantic_Web HTTP/1.1
                Host: dbpedia.org
                Accept: text/html

server replies:

                HTTP/1.1 303 See Other
                Location: http://dbpedia.org/page/Semantic_Web

issue a new GET request:

                GET /page/Semantic_Web HTTP/1.1
                Host: dbpedia.org
                Accept: text/html

server replies:
                HTTP/1.1 200 OK


Slash IRIs (2)

issue a GET request:

                GET /resource/Semantic_Web HTTP/1.1
                Host: dbpedia.org
                Accept: application/rdf+xml

server replies:

                HTTP/1.1 303 See Other
                Location: http://dbpedia.org/data/Semantic_Web

issue a new GET request:

                GET /data/Semantic_Web HTTP/1.1
                Host: dbpedia.org
                Accept: application/rdf+xml

server replies:
                HTTP/1.1 200 OK
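
The two exchanges above can be mimicked by a small server-side function. This is a sketch of the 303 pattern under simplifying assumptions, not DBpedia's actual implementation: the `DOCUMENTS` table and the `respond` function are hypothetical, and the `Accept` header is treated as a single media type (real headers may list several, with q-values).

```python
# Hypothetical mapping from a media type to the path prefix of the matching document.
DOCUMENTS = {
    "text/html": "/page/",
    "application/rdf+xml": "/data/",
}


def respond(path: str, accept: str, host: str = "dbpedia.org"):
    """Return (status, headers) for a GET on `path` with the given Accept value."""
    prefix = "/resource/"
    if path.startswith(prefix):
        local_name = path[len(prefix):]
        # The thing itself is not a document: redirect to a document about it,
        # falling back to HTML when the requested media type is unknown.
        target = DOCUMENTS.get(accept, DOCUMENTS["text/html"])
        return 303, {"Location": f"http://{host}{target}{local_name}"}
    for media_type, doc_prefix in DOCUMENTS.items():
        if path.startswith(doc_prefix):
            # Information resources are served directly.
            return 200, {"Content-Type": media_type}
    return 404, {}


print(respond("/resource/Semantic_Web", "application/rdf+xml"))
print(respond("/data/Semantic_Web", "application/rdf+xml"))
```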


Hash IRIs

A hash IRI is an IRI with a fragment identifier:

                http://danbri.org/foaf#danbri

An HTTP GET request always drops the fragment, so the server never sees a hash IRI and cannot return 200 OK for it
- so a hash IRI can be used for non-information resources
Advantages/disadvantages of hash vs. slash IRIs

See also: Cool URIs for the Semantic Web – W3C Interest Group Note 3 December 2008


Means of publishing RDF

Put RDF files online (in RDF/XML, Turtle, etc.)
Publish RDF along with web pages (RDFa & JSON-LD)
- Some CMS generate RDF automatically (e.g., Drupal 7+)
- You’ll see more about JSON-LD later
Generate RDF from other existing formats
- Triplifiers
- Mapping languages:
  - For relational DBs: W3C R2RML and Direct Mapping
  - For other formats: XSLT, RML, SPARQL-Generate
Keep RDF inside database, but provide access via queries (SPARQL endpoints)


Existing online RDF datasets

The Linked Open Data Cloud (LOD Cloud)
List of SPARQL endpoints and availability (SPARQL Endpoint Status)


Finding existing vocabularies

Reuse well-known vocabularies (Dublin Core, schema.org, FOAF, SIOC, GoodRelations, SKOS, voiD, etc.)
Try an ontology / vocabulary search engine or repository:
- Search engines: ~~FalconS~~ 💀, ~~SWSE~~ 💀, ~~Sindice~~ (integrated in proprietary software), OU’s ~~Watson~~ 💀, ~~Swoogle~~ 💀, ~~vocab.cc~~ 💀
- Repositories: Linked Open Vocabulary, Ontology Design Patterns, prefix.cc, BioPortal (specialised in bio-medical ontologies), AgroPortal (specialised in agriculture-related ontologies), ~~SchemaWeb~~ 💀, ~~Schemapedia~~ 💀, ~~Cupboard~~ 💀, ~~Knoodl~~, ~~DERI vocabularies~~ 💀, ~~OWL Seek~~ 💀, ~~SchemaCache~~ 💀
Ask mailing lists, forums (semantic-web@w3.org, stackoverflow.com, Answers knowledge graph)


Build your own vocabulary

Editors:
- Protégé, WebProtégé, NeOn TK, SWOOP, Neologism, TopBraid Composer (commercial software), PoolParty (commercial product), OWLGrEd, Fluent Editor, Semantic Turkey, VocBench, ~~Vitro~~ 💀, ~~Knoodl~~ 💀, ~~Ontofly~~ 💀, ~~Altova OWL editor~~ 💀, ~~IBM integrated development TK~~ 💀, ~~Anzo for Excel~~ 💀, ~~Euler GUI~~
Learn, evaluate:
- Protégé tutorial, …bits and pieces here and there
- RDF validator, OWL validator, Linked Data validator
- Best practices for publishing RDF vocabularies
Link to other ontologies... more at http://www.w3.org/wiki/Ontology_Dowsing


Case 1: Build linked data from text

Describe in RDF the following situation:

Marco is a student at Université Jean Monnet, studying in the Master 2 programme Web Intelligence. There, he follows the course Semantic Web, taught by Antoine. Marco is Italian but lives in Saint-Étienne, place Jean Jaurès, with his friends and flatmates Enrico and José. Marco is interested in Web technologies, theater and sci-fi literature. Enrico is interested in marijuana, reggae and is an activist for worldwide peace. Antoine Zimmermann is an associate professor at École des mines, with colleagues Victor Charpenay, Maxime Lefrançois, etc. École des mines is a higher education establishment depending on the Ministry of industry.


Case 2: Build linked data from existing data

Translate the following tables to RDF:

TeamID	Name	Country	Coach
FRA	XV de France	France	Laporte
NZL	All Blacks	New Zealand	Henry
ENG	XV of the Rose	England	Ashton
…	…	…	…

PlayerID	Name	TeamID	Position
1	Vincent Clerc	FRA	wing
2	Lionel Beauxis	FRA	flyhalf
3	Joe Rokocoko	NZL	wing
…	…	…	…
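
One possible approach to this exercise, sketched in Python: each row becomes a resource, each cell becomes a triple, and a foreign key becomes a link to another row. This is loosely in the spirit of the W3C Direct Mapping and is not an implementation of it. The namespace `http://example.org/rugby/` and all function names are assumptions made for this example; only rows shown in the tables are used.

```python
from urllib.parse import quote

BASE = "http://example.org/rugby/"  # hypothetical namespace, not from the slides
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

teams = [
    {"TeamID": "FRA", "Name": "XV de France", "Country": "France", "Coach": "Laporte"},
    {"TeamID": "NZL", "Name": "All Blacks", "Country": "New Zealand", "Coach": "Henry"},
]
players = [
    {"PlayerID": "1", "Name": "Vincent Clerc", "TeamID": "FRA", "Position": "wing"},
    {"PlayerID": "3", "Name": "Joe Rokocoko", "TeamID": "NZL", "Position": "wing"},
]


def class_iri(table):
    return f"<{BASE}{quote(table)}>"


def row_iri(table, key_value):
    return f"<{BASE}{quote(table)}/{quote(key_value)}>"


def property_iri(table, column):
    return f"<{BASE}{quote(table)}#{quote(column)}>"


def literal(value):
    # Escape backslashes and double quotes for N-Triples string literals.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def rows_to_ntriples(rows, table, key, foreign_keys=None):
    """Yield N-Triples lines: one type triple per row, one triple per non-key cell."""
    foreign_keys = foreign_keys or {}
    for row in rows:
        subject = row_iri(table, row[key])
        yield f"{subject} {RDF_TYPE} {class_iri(table)} ."
        for column, value in row.items():
            if column == key:
                continue  # the key is already encoded in the subject IRI
            if column in foreign_keys:
                obj = row_iri(foreign_keys[column], value)  # link to the other row
            else:
                obj = literal(value)
            yield f"{subject} {property_iri(table, column)} {obj} ."


team_lines = list(rows_to_ntriples(teams, "Team", "TeamID"))
player_lines = list(rows_to_ntriples(players, "Player", "PlayerID", {"TeamID": "Team"}))
print("\n".join(team_lines + player_lines))
```

The design choice worth noticing is that `TeamID` in the player table produces an IRI rather than a literal, which is what turns two separate tables into one connected graph.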


Case 3: UML to RDF vocabulary

Usually, these translations are appropriate:

UML classes → RDF classes
UML attributes → RDF properties with literals as range
UML links → RDF properties
generalization → rdfs:subClassOf
Visibility and methods are normally not represented in RDF (it’s not a programming language)
Cardinalities cannot be represented in RDFS; they can be in OWL (cf. future courses), but be careful!
Note: in RDF, properties are not attached to classes. They are first class citizens.
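
The translation rules above can be sketched as a small generator that turns a toy UML model into RDFS, written out as Turtle. The model (`Person`, `Student`, `Course`, `follows`), the namespace `http://example.org/vocab#` and the function `uml_to_turtle` are all hypothetical and only serve this illustration.

```python
# Hypothetical UML model: class -> (parent class or None, {attribute: XSD datatype}).
uml_classes = {
    "Person": (None, {"name": "string"}),
    "Student": ("Person", {"studentNumber": "integer"}),
    "Course": (None, {"title": "string"}),
}
# Hypothetical UML links: name -> (source class, target class).
uml_links = {"follows": ("Student", "Course")}


def uml_to_turtle(classes, links, ns="http://example.org/vocab#"):
    """Return a Turtle document with one RDFS declaration per UML element."""
    lines = [
        f"@prefix : <{ns}> .",
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
    ]
    for name, (parent, attributes) in classes.items():
        lines.append(f":{name} a rdfs:Class .")
        if parent:
            # UML generalization
            lines.append(f":{name} rdfs:subClassOf :{parent} .")
        for attribute, datatype in attributes.items():
            # Properties are declared on their own; the class only appears as a domain.
            lines.append(
                f":{attribute} a rdf:Property ; rdfs:domain :{name} ; rdfs:range xsd:{datatype} ."
            )
    for link, (source, target) in links.items():
        lines.append(f":{link} a rdf:Property ; rdfs:domain :{source} ; rdfs:range :{target} .")
    return "\n".join(lines)


ttl = uml_to_turtle(uml_classes, uml_links)
print(ttl)
```

Keep in mind that `rdfs:domain` and `rdfs:range` are used for inference, not for validation: they do not prevent a property from being used on other resources.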


RDF files and RDF APIs

RDF files (RDF/XML, Turtle, N-Triples, etc.) can be read into memory with RDF APIs
The in-memory model of an RDF graph can be manipulated with API methods
- Java APIs: Apache Jena (part of documentation in French), RDF4J
- .NET C#: dotNetRdf
- Python: RDFLib
- PHP: EasyRDF
- JavaScript: RDF.js
- Many more


Storing and managing RDF

RDF files (RDF/XML, Turtle, N-Triples, etc.) can be read into memory with RDF APIs
Some triple stores scale up to trillions of RDF triples, given enough hardware: GraphDB, AllegroGraph, Virtuoso, Stardog, Amazon Neptune…
Small capacity triple stores (good for quick development of simple Web apps): Jena Fuseki, Sesame (now RDF4J), and others


Linked Data Platform

@base <http://example.com/ldp/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
<subfolder>  a  ldp:Container;
    dct:created
        "2021-10-01T13:30:30+02:00"^^xsd:dateTime;
    ldp:contains
      <this>,
      <that>,
      <it> .


Using a Linked Data Platform

Interaction via HTTP requests
GET requests access data
POST requests create resources with an RDF graph and place them in a container
PUT requests update specific resources
some metadata is added in passing...
See demo!
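
The three kinds of request listed above can be sketched with Python's standard library. The requests are only built, not sent, so no server is needed. The URLs reuse the `http://example.com/ldp/` example from the previous slide; the function names, the `Slug` value and the ETag value are assumptions made for this illustration, and whether a given server honours `Slug` or requires `If-Match` depends on that server.

```python
from urllib.request import Request

CONTAINER = "http://example.com/ldp/subfolder"  # container from the example above
RESOURCE = "http://example.com/ldp/this"        # one of its members

TURTLE = b"""@prefix dct: <http://purl.org/dc/terms/> .
<> dct:title "A new resource" .
"""


def read_request(resource: str) -> Request:
    """GET: access the RDF description of a resource or container."""
    return Request(resource, headers={"Accept": "text/turtle"}, method="GET")


def create_request(container: str, body: bytes, slug: str) -> Request:
    """POST: ask the container to create a new member from an RDF graph."""
    # Slug is only a naming hint; the server chooses the final URL.
    headers = {"Content-Type": "text/turtle", "Slug": slug}
    return Request(container, data=body, headers=headers, method="POST")


def update_request(resource: str, body: bytes, etag: str) -> Request:
    """PUT: replace the state of an existing resource."""
    # If-Match guards against overwriting a version the client has not seen.
    headers = {"Content-Type": "text/turtle", "If-Match": etag}
    return Request(resource, data=body, headers=headers, method="PUT")


for request in (
    read_request(CONTAINER),
    create_request(CONTAINER, TURTLE, slug="notes"),
    update_request(RESOURCE, TURTLE, etag='"abc123"'),
):
    print(request.get_method(), request.full_url)
```

Sending any of these with `urllib.request.urlopen` against a real LDP server would return the response, including the metadata the server adds in passing.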


Publishing Web data

Master DSC & CPS²

Antoine Zimmermann
