Let's explore the Nobel Prize dataset

An overview of the official Nobel Prize Linked Data dataset with some example SPARQL queries.

By Anton Vasetenkov

Since 1901, the Nobel Prizes and the Prizes in Economic Sciences have been awarded 597 times. 950 people and organisations have received the award in the following categories: Physics, Chemistry, Physiology or Medicine, Literature, Peace, and Economic Sciences.

The official Nobel Prize Linked Data dataset is an authoritative source of information about Nobel Prizes and laureates. Importantly, a Nobel Prize is often shared between multiple people, and the same person or organisation can receive multiple Nobel Prizes. RDF, which models data as a graph of subject-predicate-object statements, is well suited to representing such many-to-many relationships.

The RDF vocabulary for expressing Nobel Prizes as Linked Data

The Nobel Prize dataset reuses classes and properties from existing vocabularies and also defines some custom classes and properties in the http://data.nobelprize.org/terms/ (nobel) namespace. For example, the nobel:Laureate class represents a person or organisation that receives a Nobel Prize and is a subclass of foaf:Agent:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix nobel: <http://data.nobelprize.org/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

nobel:Laureate a owl:Class .
nobel:Laureate rdfs:subClassOf foaf:Agent .
nobel:Laureate rdfs:label "Laureate" .

The same data in JSON-LD:

{
    "@context": {
        "foaf": "http://xmlns.com/foaf/0.1/",
        "nobel": "http://data.nobelprize.org/terms/",
        "owl": "http://www.w3.org/2002/07/owl#",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
    },
    "@id": "nobel:Laureate",
    "@type": "owl:Class",
    "rdfs:label": "Laureate",
    "rdfs:subClassOf": {
        "@id": "foaf:Agent"
    }
}

The full ontology can be downloaded here.

The Nobel Prize and laureate data

In the Nobel Prize Linked Data dataset, the URIs of all instances begin with http://data.nobelprize.org/resource/. For example, the fact that Wilhelm Röntgen was awarded the 1901 Nobel Prize in Physics is represented as follows:

@prefix nobel: <http://data.nobelprize.org/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://data.nobelprize.org/resource/nobelprize/Physics/1901> nobel:laureate <http://data.nobelprize.org/resource/laureate/1> .
<http://data.nobelprize.org/resource/laureate/1> nobel:nobelPrize <http://data.nobelprize.org/resource/nobelprize/Physics/1901> .
<http://data.nobelprize.org/resource/nobelprize/Physics/1901> rdfs:label "Physics 1901" .
<http://data.nobelprize.org/resource/laureate/1> rdfs:label "Wilhelm Conrad Röntgen" .

The same data in JSON-LD:

{
    "@context": {
        "nobel": "http://data.nobelprize.org/terms/",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
    },
    "@id": "http://data.nobelprize.org/resource/nobelprize/Physics/1901",
    "rdfs:label": "Physics 1901",
    "nobel:laureate": {
        "@id": "http://data.nobelprize.org/resource/laureate/1",
        "rdfs:label": "Wilhelm Conrad Röntgen",
        "nobel:nobelPrize": {
            "@id": "http://data.nobelprize.org/resource/nobelprize/Physics/1901"
        }
    }
}
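To make the correspondence between the JSON-LD document and the four Turtle statements concrete, here is a minimal Python sketch that walks the nested JSON-LD above and prints one triple per statement. It is not a real JSON-LD processor: the `expand` and `triples` helpers are hypothetical names introduced for illustration, and they only handle the prefix-style context and nested `@id` nodes used in this example.

```python
import json

# The JSON-LD document from above, embedded as a string.
DOC = json.loads("""
{
    "@context": {
        "nobel": "http://data.nobelprize.org/terms/",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
    },
    "@id": "http://data.nobelprize.org/resource/nobelprize/Physics/1901",
    "rdfs:label": "Physics 1901",
    "nobel:laureate": {
        "@id": "http://data.nobelprize.org/resource/laureate/1",
        "rdfs:label": "Wilhelm Conrad Röntgen",
        "nobel:nobelPrize": {
            "@id": "http://data.nobelprize.org/resource/nobelprize/Physics/1901"
        }
    }
}
""")


def expand(term, context):
    """Turn a compact IRI such as 'rdfs:label' into a full IRI."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


def triples(node, context):
    """Yield (subject, predicate, object) for a node and its nested nodes."""
    subject = node["@id"]
    for key, value in node.items():
        if key.startswith("@"):
            continue  # skip JSON-LD keywords such as @id and @context
        predicate = expand(key, context)
        if isinstance(value, dict):
            # A nested object is another resource: link to it, then descend.
            yield (subject, predicate, value["@id"])
            yield from triples(value, context)
        else:
            yield (subject, predicate, value)


result = list(triples(DOC, DOC["@context"]))
for s, p, o in result:
    print(s, p, o)
```

The sketch yields four triples, one for each Turtle statement. For anything beyond a toy example, a proper RDF library is the better choice.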

The Nobel Prize data is available via the D2R Server running at http://data.nobelprize.org/. A SNORQL server is provided at http://data.nobelprize.org/snorql/ so that we can issue SPARQL queries directly in the browser.
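Queries can also be sent from a script instead of the browser. The sketch below only builds the request URL, following the SPARQL Protocol convention of passing the query text in a `query` parameter. The endpoint path `/sparql` is an assumption (it is a common default for D2R Server and is not stated in this article), and `build_request_url` and `extract_query` are hypothetical helper names.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the SPARQL endpoint lives under /sparql on the same host.
ENDPOINT = "http://data.nobelprize.org/sparql"

QUERY = """
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?laureate ?label
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?label .
}
LIMIT 5
"""


def build_request_url(endpoint, query):
    """Encode a query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


def extract_query(url):
    """Decode the query text back out of a request URL (round-trip check)."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL could then be fetched with `urllib.request.urlopen`, typically with an `Accept: application/sparql-results+json` header to ask for JSON results, provided the endpoint is reachable at the assumed address.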

Querying the data

You can run some interesting queries directly in NobelPrize.org's SNORQL browser. Some example queries are described below.

All Nobel Laureates who were born in New Zealand

This query will list all Nobel Laureates who were born in New Zealand:

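# Prefix declarations, so that the query is self-contained.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?laureate ?laureateLabel
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
    ?laureate dbpedia-owl:birthPlace <http://data.nobelprize.org/resource/country/New_Zealand> .
}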

Query result:

laureate | laureateLabel
<http://data.nobelprize.org/resource/laureate/374> | "Maurice Hugh Frederick Wilkins"
<http://data.nobelprize.org/resource/laureate/167> | "Ernest Rutherford"
<http://data.nobelprize.org/resource/laureate/730> | "Alan G. MacDiarmid"
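When a query like this is sent to a SPARQL endpoint from a script and JSON output is requested, the answer normally arrives in the SPARQL 1.1 Query Results JSON Format. The sketch below shows how such a response could be turned into plain rows. The `SAMPLE` document is written by hand from the result table above (it is not a captured server response), and `rows` is a hypothetical helper name.

```python
import json

# Hand-written sample in the SPARQL 1.1 Query Results JSON Format,
# mirroring the result table above.
SAMPLE = json.loads("""
{
  "head": {"vars": ["laureate", "laureateLabel"]},
  "results": {"bindings": [
    {"laureate": {"type": "uri", "value": "http://data.nobelprize.org/resource/laureate/374"},
     "laureateLabel": {"type": "literal", "value": "Maurice Hugh Frederick Wilkins"}},
    {"laureate": {"type": "uri", "value": "http://data.nobelprize.org/resource/laureate/167"},
     "laureateLabel": {"type": "literal", "value": "Ernest Rutherford"}},
    {"laureate": {"type": "uri", "value": "http://data.nobelprize.org/resource/laureate/730"},
     "laureateLabel": {"type": "literal", "value": "Alan G. MacDiarmid"}}
  ]}
}
""")


def rows(results):
    """Yield one tuple per solution, in the order given by head.vars."""
    variables = results["head"]["vars"]
    for binding in results["results"]["bindings"]:
        # A variable may be unbound in a solution; represent that as None.
        yield tuple(
            binding[name]["value"] if name in binding else None
            for name in variables
        )


table = list(rows(SAMPLE))
for laureate, label in table:
    print(laureate, label)
```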

Chinese names of Nobel laureates who were born in New Zealand

The Nobel Prize dataset not only uses existing established vocabularies but is also linked to other RDF datasets through owl:sameAs. For example, Nobel Prize Linked Data contains the following statement about Wilhelm Röntgen:

<http://data.nobelprize.org/resource/laureate/1> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/Wilhelm_Röntgen> .

Using SPARQL federation, a single query can combine the Nobel Prize data with data from other linked data sources that expose a SPARQL endpoint. This makes it possible to issue a wide range of queries that go beyond the Nobel Prize dataset.

The Nobel Prize dataset only specifies the names of the laureates in English, but since it links each laureate to DBpedia, we can get the Chinese names of all Nobel laureates who were born in New Zealand using a single query:

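PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?laureate ?laureateLabel ?dbpediaLaureateLabel
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
    ?laureate dbpedia-owl:birthPlace <http://data.nobelprize.org/resource/country/New_Zealand> .
    ?laureate owl:sameAs ?dbpediaLaureate .
    # This part is evaluated by the remote DBpedia endpoint.
    SERVICE <http://dbpedia.org/sparql> {
        ?dbpediaLaureate rdfs:label ?dbpediaLaureateLabel .
        FILTER (lang(?dbpediaLaureateLabel) = "zh")
    }
}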

Query result:

laureate | laureateLabel | dbpediaLaureateLabel
<http://data.nobelprize.org/resource/laureate/374> | "Maurice Hugh Frederick Wilkins" | "莫里斯·威爾金斯"
<http://data.nobelprize.org/resource/laureate/167> | "Ernest Rutherford" | "欧内斯特·卢瑟福"
<http://data.nobelprize.org/resource/laureate/730> | "Alan G. MacDiarmid" | "艾伦·麦克德尔米德"
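Conceptually, the federated query joins two sets of bindings on the owl:sameAs link: the local ones from the Nobel Prize dataset and the remote ones returned by the SERVICE block. The in-memory Python sketch below imitates that join without contacting any endpoint. The DBpedia URIs and the English DBpedia labels in it are placeholders assumed for illustration (the actual owl:sameAs targets of these laureates are not listed in this article), and `federated_join` is a hypothetical helper name.

```python
NOBEL = "http://data.nobelprize.org/resource/laureate/"

# Local bindings: (laureate, English label, owl:sameAs target).
# The sameAs targets are assumed placeholders, not taken from the dataset.
LOCAL = [
    (NOBEL + "374", "Maurice Hugh Frederick Wilkins",
     "http://dbpedia.org/resource/Maurice_Wilkins"),
    (NOBEL + "167", "Ernest Rutherford",
     "http://dbpedia.org/resource/Ernest_Rutherford"),
]

# Remote bindings as the SERVICE block might return them:
# resource -> list of (label, language tag). English labels are made up.
REMOTE = {
    "http://dbpedia.org/resource/Maurice_Wilkins": [
        ("Maurice Wilkins", "en"),
        ("莫里斯·威爾金斯", "zh"),
    ],
    "http://dbpedia.org/resource/Ernest_Rutherford": [
        ("Ernest Rutherford", "en"),
        ("欧内斯特·卢瑟福", "zh"),
    ],
}


def federated_join(local, remote, lang):
    """Join local rows with remote labels on sameAs, keeping one language."""
    for laureate, label, same_as in local:
        for remote_label, remote_lang in remote.get(same_as, []):
            if remote_lang == lang:  # plays the role of FILTER (lang(...) = "zh")
                yield (laureate, label, remote_label)


joined = list(federated_join(LOCAL, REMOTE, "zh"))
for row in joined:
    print(row)
```

A laureate without a matching remote label simply drops out of the result, which is also how the join behaves in the SPARQL query.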

People and organisations that received more than one Nobel Prize

To find out which Nobel laureates were honoured with the award multiple times, this query can be used:

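PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?laureate ?laureateLabel (COUNT(?prize) AS ?prizeCount)
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
    ?prize a nobel:NobelPrize .
    ?laureate nobel:nobelPrize ?prize .
}
GROUP BY ?laureate ?laureateLabel
# Repeating the aggregate here keeps the query portable across SPARQL engines.
HAVING (COUNT(?prize) > 1)
ORDER BY DESC(?prizeCount)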

Query result:

laureate | laureateLabel | prizeCount
<http://data.nobelprize.org/resource/laureate/482> | "Comité international de la Croix Rouge (International Committee of the Red Cross) " | 3
<http://data.nobelprize.org/resource/laureate/217> | "Linus Carl Pauling" | 2
<http://data.nobelprize.org/resource/laureate/222> | "Frederick Sanger" | 2
<http://data.nobelprize.org/resource/laureate/515> | "Office of the United Nations High Commissioner for Refugees (UNHCR) " | 2
<http://data.nobelprize.org/resource/laureate/6> | "Marie Curie, née Sklodowska" | 2
<http://data.nobelprize.org/resource/laureate/66> | "John Bardeen" | 2
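The GROUP BY / HAVING / ORDER BY combination in this query can be mirrored in a few lines of Python, which may help if the grouping logic is unfamiliar. The sketch works on a small hand-picked list of (laureate, prize) pairs rather than on the full dataset. The prize URIs other than Physics 1901 are assumed to follow the URI pattern shown earlier, and `multiple_winners` is a hypothetical helper name.

```python
from collections import Counter

BASE = "http://data.nobelprize.org/resource/"

# Hand-picked (laureate, prize) pairs: Röntgen (1), Marie Curie (6), Bardeen (66).
# Prize URIs are assumed to follow the nobelprize/<Category>/<Year> pattern.
AWARDS = [
    (BASE + "laureate/1", BASE + "nobelprize/Physics/1901"),
    (BASE + "laureate/6", BASE + "nobelprize/Physics/1903"),
    (BASE + "laureate/6", BASE + "nobelprize/Chemistry/1911"),
    (BASE + "laureate/66", BASE + "nobelprize/Physics/1956"),
    (BASE + "laureate/66", BASE + "nobelprize/Physics/1972"),
]


def multiple_winners(awards):
    """Return (laureate, prize count) for laureates with more than one prize."""
    counts = Counter(laureate for laureate, _prize in awards)      # GROUP BY + COUNT
    repeated = [(who, n) for who, n in counts.items() if n > 1]    # HAVING
    return sorted(repeated, key=lambda item: item[1], reverse=True)  # ORDER BY DESC


winners = multiple_winners(AWARDS)
for laureate, prize_count in winners:
    print(laureate, prize_count)
```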

Youngest Nobel laureates

The youngest recipients of the award in all categories can be retrieved using the following query:

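PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# The age is approximated as the award year minus the year of birth.
SELECT ?laureate ?laureateLabel (?laureateAwardYear - year(?laureateBirthday) AS ?laureateAgeWhenAwarded)
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
    ?laureate foaf:birthday ?laureateBirthday .
    ?laureateAward a nobel:LaureateAward .
    ?laureate nobel:laureateAward ?laureateAward .
    ?laureateAward nobel:year ?laureateAwardYear .
}
ORDER BY ASC(?laureateAgeWhenAwarded)
LIMIT 5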

Query result:

laureate | laureateLabel | laureateAgeWhenAwarded
<http://data.nobelprize.org/resource/laureate/914> | "Malala Yousafzai" | 17
<http://data.nobelprize.org/resource/laureate/21> | "William Lawrence Bragg" | 25
<http://data.nobelprize.org/resource/laureate/38> | "Werner Karl Heisenberg" | 31
<http://data.nobelprize.org/resource/laureate/40> | "Paul Adrien Maurice Dirac" | 31
<http://data.nobelprize.org/resource/laureate/43> | "Carl David Anderson" | 31
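Note that the query computes the age as the award year minus the year of birth, which can differ by one from the laureate's exact age on a given day. The short Python sketch below contrasts the two calculations. The birth dates are well-known values that are not quoted in this article, the 10 December reference date (the day of the award ceremony) is an assumption chosen for the comparison, and both function names are hypothetical.

```python
from datetime import date


def age_by_year_difference(birthday, award_year):
    """Age as computed by the query: award year minus year of birth."""
    return award_year - birthday.year


def age_on(birthday, day):
    """Exact age in whole years on a given day."""
    had_birthday = (day.month, day.day) >= (birthday.month, birthday.day)
    return day.year - birthday.year - (0 if had_birthday else 1)


malala = date(1997, 7, 12)  # Malala Yousafzai, awarded in 2014
bragg = date(1890, 3, 31)   # William Lawrence Bragg, awarded in 1915

print(age_by_year_difference(malala, 2014))  # 17
print(age_by_year_difference(bragg, 1915))   # 25

# Hypothetical laureate born on 31 December: the two calculations disagree.
late_birthday = date(1990, 12, 31)
print(age_by_year_difference(late_birthday, 2020))  # 30
print(age_on(late_birthday, date(2020, 12, 10)))    # 29
```

For ranking the youngest laureates the year difference is usually good enough, but it should not be read as an exact age.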
