1
Publishing and Using Cultural Heritage Linked Data
on the Semantic Web
Documentation Congress 2016, Vitoria-Gasteiz, Spain
Eero Hyvönen, Prof., Director
Aalto University and University of Helsinki
Heldig – Helsinki Centre for Digital Humanities, http://heldig.fi
Semantic Computing Research Group (SeCo), http://www.seco.tkk.fi/
2
Contents
• Background: History 2002-2016
• Vision: Semantic Web of Cultural Heritage
• Challenges: Content Complexity & Production
• Solution: Linked Data Publishing Model ”Sampo”
• Realization: Three Sampo Applications
4
History behind this talk
Semantic Portals for Cultural Heritage
– 2004 MuseumFinland – Finnish Museums on the Semantic Web
  » http://www.museosuomi.fi
– 2008 CultureSampo – Finnish Culture on the Semantic Web 2.0
  » http://www.kulttuurisampo.fi
– 2011 BookSampo – Fiction Literature on the Semantic Web
  » http://www.kirjasampo.fi
– 2012 TravelSampo – Mobile Contextualized Services of Cultural Tourism
  » http://www.travelsampo.fi
– 2015 WarSampo – Finnish World War II on the Semantic Web
  » http://www.sotasampo.fi
5
Ontology and Data Services
– 2009 National Ontology Library Service ONKI
  » http://onki.fi
– 2014 ONKI.fi -> Finto.fi of the National Library
  » http://finto.fi
– 2014 Linked Data Finland Data Service & Tools
  » http://ldf.fi
– 2016 Finnish Ontology Service for Historical Places and Maps
  » http://hipla.fi
Publications available at:
– http://www.seco.tkk.fi/publications/
6
• Over 30 researchers at SeCo, including Eetu Mäkelä, Tomi Kauppinen, Jouni Tuominen, Kim Viljanen, Tuukka Ruotsalo, Suvi Kettula, Kaisa Hypen, Erkki Heino, Petri Leskinen, Minna Tamper, Esko Ikkala, Mikko Koho, …
• Some 50 organizations involved
Joint Work During 2002-2016
Antikvaria-ryhmä
7
Helsinki Centre for Digital Humanities
A new platform for collaboration 2016-
http://heldig.fi
Heldig
8
The Vision: Semantic Web of Cultural Heritage
9
Wouldn’t it be nice if
cultural organizations could easily publish their contents together on the web just by pushing a button,
the contents would be automatically linked with other publishers' contents and semantically enriched,
researchers and citizens could contribute their own contents and knowledge,
…
10
---
contents could be accessed easily from different thematic perspectives and contexts,
intelligent search and browsing systems could find answers to questions in addition to data records,
the aggregated contents and services could be reused in external applications easily,
language barriers could be overcome?
11
Yes! This is the vision of our work
12
Challenges: Content Complexity & Production
13
Problem 1: Cultural Content Complexity - Heterogeneous and Interlinked
Encyclopedia
Artifacts
Maps
Videos
Buildings
Fine arts
Biographies
Narratives
Literature
Cultural sites
Music
14
Problem 2: Cultural Content Production System - Distributed and Independent
Museums
Libraries
Archives
Land survey
Linked Data
Web 2.0 sites
Media
Citizens
15
Solution: Linked Cultural Heritage Data
”Sampo” Model for Semantic CH Portals
Ontology Infrastructure
Semantic Metadata
Content Providers
Land survey
Museums
Archives
Linked Data
Citizens
Libraries
Web 2.0 sites
Media
17
How Does This Idea Work in Practice?
18
Biographical Registries Collect Data about Persons
Biography Center

henkilö (person)   nimi (name)             ammatti (profession)   syntymapaikka (birth place)
H1                 Akseli Gallen-Kallela   taiteilija (artist)    Lemu
H2                 Gustaf Mannerheim       marsalkka (marshal)    Askainen
...

The same rows as an RDF graph:
H1   type         Person
H1   name         ”Akseli Gallen-Kallela”
H1   profession   Artist
H1   birthPlace   Lemu
H2   type         Person
H2   name         ”Gustaf Mannerheim”
H2   profession   Marshall
H2   birthPlace   Askainen
...
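
The step from registry rows to graph statements can be sketched in a few lines of Python. This is only an illustration: the two rows come from the slide, but the mapping dictionaries and the function name row_to_triples are assumptions made for this example, not part of any SeCo tool.

```python
# Rows copied from the registry table on this slide (Finnish column names).
rows = [
    {"henkilö": "H1", "nimi": "Akseli Gallen-Kallela",
     "ammatti": "taiteilija", "syntymapaikka": "Lemu"},
    {"henkilö": "H2", "nimi": "Gustaf Mannerheim",
     "ammatti": "marsalkka", "syntymapaikka": "Askainen"},
]

# Assumed mappings from local column names and profession labels
# to the shared property and class names used in the graph.
COLUMN_TO_PROPERTY = {
    "nimi": "name",
    "ammatti": "profession",
    "syntymapaikka": "birthPlace",
}
PROFESSION_TO_CONCEPT = {"taiteilija": "Artist", "marsalkka": "Marshall"}


def row_to_triples(row):
    """Turn one registry row into (subject, predicate, object) triples."""
    subject = row["henkilö"]
    triples = [(subject, "type", "Person")]
    for column, prop in COLUMN_TO_PROPERTY.items():
        value = row[column]
        if prop == "profession":
            # Replace the local literal with the shared ontology concept.
            value = PROFESSION_TO_CONCEPT[value]
        triples.append((subject, prop, value))
    return triples


person_triples = [t for row in rows for t in row_to_triples(row)]
for triple in person_triples:
    print(triple)
```

Each row yields one type statement plus one statement per mapped column, so the local table vocabulary disappears and only shared property names remain.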
19
Art Museum Catalogs Paintings
Art Museum Collection

teos (work)   nimi (name)              tekijä (creator)        aika (time)   aihe (subject)
T1            Mannerheimin muotokuva   Akseli Gallen-Kallela   1929          Gustaf Mannerheim
T2            Aino-triptyykki          Akseli Gallen-Kallela   1891          Aino, Kalevala
...

The first row as an RDF graph:
T1   type      Painting
T1   creator   (resource with name ”Akseli Gallen-Kallela”)
T1   time      1929
T1   subject   (resource with name ”Gustaf Mannerheim”)
...
20
Land Survey Organizations Know Places
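Land Survey

kunta (municipality)   lääni (province)
Askainen               Varsinais-Suomen lääni
Helsinki               Uudenmaan lääni
Lemu                   Varsinais-Suomen lääni
Turku                  Varsinais-Suomen lääni
...

The same data as an RDF graph:
Askainen                 part-of   Varsinais-Suomen lääni
Lemu                     part-of   Varsinais-Suomen lääni
Turku                    part-of   Varsinais-Suomen lääni
Varsinais-Suomen lääni   part-of   Finland
[The figure also attaches type links from the places to the classes County and Province.]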
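
A part-of hierarchy like this one is what lets an application answer "is this place inside that region?" without the answer being stored explicitly. The following Python sketch follows the part-of links upwards; the dictionary holds only the links visible on the slide, and the function names are made up for this illustration.

```python
# part-of links as shown on the slide. No link from Uudenmaan lääni
# to Finland is drawn there, so none is added here.
part_of = {
    "Askainen": "Varsinais-Suomen lääni",
    "Lemu": "Varsinais-Suomen lääni",
    "Turku": "Varsinais-Suomen lääni",
    "Helsinki": "Uudenmaan lääni",
    "Varsinais-Suomen lääni": "Finland",
}


def containing_places(place):
    """Return every region that contains `place`, nearest first."""
    result = []
    while place in part_of:
        place = part_of[place]
        result.append(place)
    return result


def is_within(place, region):
    """True if `region` is reachable from `place` via part-of links."""
    return region in containing_places(place)


print(containing_places("Lemu"))
print(is_within("Turku", "Finland"))
```

Because the lookup walks the chain, Lemu is found to be in Finland even though the table only says which province it belongs to.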
21
Ontologies are Developed by Semantic Web Researchers
KOKO ontology (FinnONTO)
[Figure: subclass hierarchy. Classes shown: Concept, Endurant, Perdurant, Abstract, PhysicalObject, Place, TimePeriod, Profession, Person, Artist, Marshall, Painting, County, Province. The arrows between them are subClassOf links.]
22
RDF Connects and Harmonizes Linked Data into a Giant Global Graph (GGG)
[Figure: the person, painting, place, and ontology graphs from the previous slides merged into one RDF graph. H1 and H2 keep their type, name, profession, and birthPlace links; the places keep their part-of links; the classes keep their subClassOf links. The painting T1 is drawn with its Finnish labels: maalaus (painting), tekijä (creator), aihe (subject), aika (time), tyyppi (type), yläluokka (superclass).]
Portal Triplestore
Serendipity: 1+1 > 2
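
A small Python sketch of the "1+1 > 2" effect: once the three datasets share identifiers, a question that no single publisher can answer becomes a simple graph lookup. The triples restate the slide examples; it is an assumption of this sketch that harmonization has already replaced the creator and subject names of T1 with the identifiers H1 and H2. The helper names are invented for the example.

```python
# Three independently produced datasets, restated from the slides.
biography = {
    ("H1", "name", "Akseli Gallen-Kallela"),
    ("H1", "birthPlace", "Lemu"),
    ("H2", "name", "Gustaf Mannerheim"),
    ("H2", "birthPlace", "Askainen"),
}
museum = {
    ("T1", "type", "Painting"),
    ("T1", "creator", "H1"),   # assumed: name already resolved to H1
    ("T1", "subject", "H2"),   # assumed: name already resolved to H2
    ("T1", "time", "1929"),
}
land_survey = {
    ("Lemu", "part-of", "Varsinais-Suomen lääni"),
    ("Askainen", "part-of", "Varsinais-Suomen lääni"),
    ("Turku", "part-of", "Varsinais-Suomen lääni"),
    ("Varsinais-Suomen lääni", "part-of", "Finland"),
}

# Merging RDF-style data is just a set union of triples.
graph = biography | museum | land_survey


def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def paintings_by_creators_born_in(triples, region):
    """Paintings whose creator was born in a place that is part of `region`."""
    hits = set()
    for s, p, o in triples:
        if p == "creator" and (s, "type", "Painting") in triples:
            for place in objects(triples, o, "birthPlace"):
                if region in objects(triples, place, "part-of"):
                    hits.add(s)
    return hits


print(paintings_by_creators_born_in(graph, "Varsinais-Suomen lääni"))
print(paintings_by_creators_born_in(museum, "Varsinais-Suomen lääni"))
```

The museum data alone returns nothing, because birth places and provinces live in the other two datasets; only the merged graph connects the painting to the province.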
23
Why is This Useful?
Limitations of Non-semantic Data
• NBA-H26069-467
    :object ”cup and plate” ;
    :material ”porcelain” ;
    :creationPlace ”Germany” ;
    :creator ”Meissen” .
• This metadata cannot answer the following queries/questions:
  – Find all vessels?
  – Find all ceramic products?
  – Find artifacts manufactured in Europe?
  – Does the city of Meissen manufacture ceramics?
24
Semantic Web Solution:
Understanding the Ontological Context
NBA-H26069-467
    :object ”cup and plate” ; :object_concept object:cup ; :object_concept object:plate ;
    :material ”porcelain” ; :material_concept material:porcelain ;
    :creationPlace ”Germany” ; :creationPlace_concept place:Germany ;
    :creator ”Meissen” ; :creator_concept actor:Meissen .
[Figure: the record linked into four ontologies. Object ontology: object:cup and object:plate (object_concept), with rdfs:subClassOf links up to object:vessel. Material ontology: material:porcelain (material_concept), linked to material:ceramic. Place ontology: place:Germany (creationPlace_concept) with loc:partOf to place:Europe, and place:Meissen. Actor ontology: actor:Meissen.]
Find all vessels?
Find all ceramic products?
Find artifacts manufactured in Europe?
Does the city of Meissen manufacture ceramics?
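
How the ontological context answers the first three questions can be sketched in Python. The record and concept identifiers come from the slide; the exact subClassOf and partOf links are read off the figure as best as they can be recovered and should be treated as assumptions, as should the helper names.

```python
# Hierarchy links as recovered from the figure (assumed where the
# figure is ambiguous, e.g. that plate is also placed under vessel).
SUBCLASS_OF = {
    "object:cup": "object:vessel",
    "object:plate": "object:vessel",
    "material:porcelain": "material:ceramic",
}
PART_OF = {"place:Germany": "place:Europe"}

# The annotated record: literals plus concept references.
record = {
    "object": ["cup and plate"],
    "object_concept": ["object:cup", "object:plate"],
    "material": ["porcelain"],
    "material_concept": ["material:porcelain"],
    "creationPlace": ["Germany"],
    "creationPlace_concept": ["place:Germany"],
}


def ancestors(term, relation):
    """Follow `relation` upwards from `term` and collect what is reached."""
    found = []
    while term in relation:
        term = relation[term]
        found.append(term)
    return found


def matches(rec, prop, target, relation):
    """True if any value of `prop` is `target` or lies below it."""
    return any(
        value == target or target in ancestors(value, relation)
        for value in rec.get(prop, [])
    )


# Plain string matching on the literal fails for "vessel" ...
print("vessel" in " ".join(record["object"]))
# ... while the concept annotations answer the three questions.
print(matches(record, "object_concept", "object:vessel", SUBCLASS_OF))
print(matches(record, "material_concept", "material:ceramic", SUBCLASS_OF))
print(matches(record, "creationPlace_concept", "place:Europe", PART_OF))
```

The fourth question (whether the city of Meissen manufactures ceramics) additionally needs a link between actor:Meissen and place:Meissen, which the slide shows only as two separate nodes, so it is left out of this sketch.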
25
In Principle a Piece of Cake but …
30
The Components of a Semantic Portal
Cultural Heritage Portal System
Content Production System
– Model
– Standards & Best Practices
– Ontology & Data Services
– Annotation and other tools
Content Infrastructure
– W3C etc. standards
– Ontology Infrastructure
– Metadata schemas
– Domain ontologies
– Linked Datasets
Portal
– Humans: user interface
Data Service
– Machines: AJAX widgets, REST, Web Services, SPARQL
31
Content Production System Content Infrastructure
Visible Portal Application
32
Developer's View to Linked Data:
Rich Internet Applications (RIA)
WWW Standard Model
Client Side (Browser)
Application 1
Application 2
Application N
Server Side
Linked Data Service
SPARQL Endpoint
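
From the developer's side, talking to a SPARQL endpoint is an ordinary HTTP request whose query parameter carries the query text. The Python sketch below builds such a request and decodes a result in the standard SPARQL JSON results format. The endpoint URL, the ex: vocabulary, and the sample response are placeholders invented for this example; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint, not a real service

QUERY = """
PREFIX ex: <http://example.org/schema/>
SELECT ?painting ?name WHERE {
  ?painting a ex:Painting ;
            ex:name ?name .
}
LIMIT 10
"""


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(result_json):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    data = json.loads(result_json)
    variables = data["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in data["results"]["bindings"]
    ]


# Invented sample response, shaped like the SPARQL JSON results format.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["painting", "name"]},
    "results": {"bindings": [{
        "painting": {"type": "uri", "value": "http://example.org/T1"},
        "name": {"type": "literal", "value": "Mannerheimin muotokuva"},
    }]},
})

request = build_request(ENDPOINT, QUERY)
rows = bindings_to_rows(SAMPLE_RESPONSE)
print(request.full_url)
print(rows)
```

Against a real service, the request object would be passed to urllib.request.urlopen and the response body handed to bindings_to_rows; a browser-side application does the same with an AJAX call.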
Three Case Studies Using the Sampo Model
• CultureSampo
• BookSampo
• WarSampo
34
CultureSampo
Finnish Culture on the Semantic Web 2.0
(Hyvönen et al., Museums & Web 2009) (Mäkelä et al., SWJ 2012)
36
KOKO – Linked Open Ontology Cloud
YSO
AFO
MAO
TAO
VALO
KOKO
...
Your ontology?
Aligning ontologies
ONKI Ontology Service
(Hyvönen et al., ESWC 2009)
37
End-users’ view
YSO
AFO
MAO
TAO
VALO
KOKO
KOKO Ontology
38
Sampo Component 2/3
Content Creation Process
39
CultureSampo Content Providers (28+): museums, libraries, archives, research organizations, media companies + citizens

International Content Providers
1 Geonames
2 Google (Maps)
3 Iconclass (vocab.)
4 Panoramio
5 J. Paul Getty Foundation (vocab.)
6 Wikipedia

Finnish Content Providers
1 Agricola – Suomen historiaverkko
2 Espoon kaupunginmuseo
3 Helsingin kaupunginkirjasto
4 Hiihtomuseo
5 Jyväskylän yliopisto, musiikin laitos
6 Kansallisbiografia
7 Kansallismuseo
8 Kuopion kulttuurihistoriallinen museo
9 Laatokan-Karjalan museo
10 Lahden kaupunginmuseo
11 Museovirasto
12 Pohjois-Karjalan museo
13 Radio- ja TV-museo
14 Seurasaaren ulkomuseo
15 Suomalaisen Kirjallisuuden Seura SKS
16 Suomen maatalousmuseo Sarka
17 Suomen merimuseo
18 Taideteollisen korkeakoulun kirjasto
19 Valtion taidemuseo
20 Veljekset Karhumäki Oy
21 Viipurin historiallinen museo
22 Yleisradio Oy
Thanks for cooperation!
40
Metadata Schemas

     Metadata schema       Content type
 1   artifact              artifacts
 2   art                   paintings, sculpture, drawings, abstract art
 3   literature            novels, short stories, comics
 4   WWW page              WWW pages
 5   poetry                3 subtypes of poetry
 6   fictive object        places and persons in Kalevala
 7   folk music            5 subtypes of folk music
 8   photograph            photographs
 9   aerial photograph     aerial photos
10   actors                persons, organizations
11   biography             biographies
12   historical event      historical events
13   skill                 cultural process descriptions
14   video                 documented processes
15   built objects         buildings etc. in nature
16   archeological sites   archeological sites
41
Data Alignment Principles Used
Dublin Core like metadata schemas
– Element subproperty-of hierarchies
– Dumb-down principle
Harmonized element values
– Taken from large shared domain ontologies
– Objects, Actors, Places, Actions, ...
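
The subproperty hierarchies and the dumb-down idea can be illustrated with a short Python sketch: a consumer that only understands a general element can still use a more specific one by walking up the subproperty-of chain. The element names below (painter, manufacturer, creator, agent) are hypothetical and are not taken from the CultureSampo schemas.

```python
# Hypothetical subproperty-of hierarchy: specific element -> broader element.
SUBPROPERTY_OF = {
    "painter": "creator",
    "manufacturer": "creator",
    "creator": "agent",
}


def dumb_down(prop, understood):
    """Return the nearest element in `understood` at or above `prop`, else None."""
    while prop not in understood and prop in SUBPROPERTY_OF:
        prop = SUBPROPERTY_OF[prop]
    return prop if prop in understood else None


def harmonize(record, understood):
    """Rewrite a record so that it uses only elements the consumer understands."""
    result = {}
    for prop, value in record.items():
        general = dumb_down(prop, understood)
        if general is not None:
            result.setdefault(general, []).append(value)
    return result


painting = {"painter": "Akseli Gallen-Kallela"}
artifact = {"manufacturer": "Meissen"}
print(harmonize(painting, {"creator"}))
print(harmonize(artifact, {"creator"}))
```

After harmonization both records expose a creator element, so a single search over creator finds paintings and artifacts alike even though each schema used its own, more precise element.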
42
Events and Narratives as Semantic Glue
Events and narratives bring cultural heritage alive!
– Historical events
  » Finnish history ontology
– Events and processes of intangible cultural heritage
  » Farming, arts & craft, …
– Events in stories
  » Semantic Kalevala (Finnish national epic)
Preserving Intangible Cultural Heritage:
Cataloging Boot Making Process
Onni Wirlander from Espoo makes boots
44
Process Graph of Making Leather Boots
(Kettula, Hyvönen, CIDOC 2012)
45
Skill Documentation:
A Semantic Video Viewer
Semantic recommendations
Semantic process description
Dynamic information about the video scene
46
Narrative Semantic Web - Case Semantic Kalevala
National epic of Finland
Compiled by Elias Lönnrot from a vast collection of folk poems
Publication
– 1835 ”Old Kalevala”
– 1849 ”New Kalevala”
  » 50 poems, 22 795 lines
– Translated into some 60 languages since 1841
Semantic Kalevala
– First translation into a ”machine” language (RDF)!
48
Semantic Kalevala online:
The computer ”understands” the national epic Kalevala
Semantically annotated 50 poems of the national epic
- Events and narratives
Translation into modern Finnish
Links to related art etc.
49
CultureSampo RDF Knowledge Base (March 17, 2009)
Metadata
– 134,000 cultural collection items (artifacts, books, videos etc.)
– 285,000 other resources (places, persons etc.)
– 204 property types in metadata
Ontologies
– KOKO ontologies (ca. 37,000 concepts)
– Additional international vocabularies
  » AAT, ULAN, Iconclass
– 253 property types in ontologies
Size
– 11.4 million triples
  » 2.7 million triples
  » 8.7 million additional reasoned triples
50
Sampo Component 3/3
The Semantic Portal
51
Portal Users
Humans
– Semantic search
– 9 thematic application perspectives
Machines (applications)
– Via APIs
52
Semantic Search with Result Categorization
Nine Thematic Perspectives into Cultural Heritage
Three languages
Nine perspectives
Lately commented items
Lately viewed items
54
Objects on the Map (by semantic relation)
55
Historical Places and Areas on Maps
Semantic recommendations
Historical Helsinki 1640-1945
56
Historical Maps on Google Maps
Wikipedia articles
Panoramio photos
Links to historical places
57
Relational Search:
”How is A. Gallen-Kallela related to Napoleon I?”
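
One simple way to think about relational search is as path finding in the knowledge graph. The Python sketch below uses breadth-first search over the small example graph from the earlier slides, since the data behind the Napoleon question is not part of this deck. It is a generic technique chosen for illustration, not a description of how CultureSampo implements the feature, and the function name is invented.

```python
from collections import deque

# Triples restated from the earlier example slides; T1's creator and
# subject are assumed to be resolved to the person identifiers H1 and H2.
TRIPLES = [
    ("T1", "creator", "H1"),
    ("T1", "subject", "H2"),
    ("H1", "birthPlace", "Lemu"),
    ("H2", "birthPlace", "Askainen"),
    ("Lemu", "part-of", "Varsinais-Suomen lääni"),
    ("Askainen", "part-of", "Varsinais-Suomen lääni"),
]


def find_connection(triples, start, goal):
    """Shortest chain of links between two resources, ignoring link direction.

    The result alternates nodes and link labels; a label prefixed with
    "^" means the link was followed against its direction.
    """
    neighbours = {}
    for s, p, o in triples:
        neighbours.setdefault(s, []).append((p, o))
        neighbours.setdefault(o, []).append(("^" + p, s))

    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, nxt in neighbours.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [label, nxt]))
    return None


print(find_connection(TRIPLES, "H1", "H2"))
```

Here the shortest connection between the two persons runs through the painting: H1 is the creator of T1, whose subject is H2. A longer path through their birth places and the shared province also exists but is not returned, because breadth-first search stops at the first, shortest hit.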
58
Finnish History Ontology
External web page embedded in CultureSampo
Timeline of historical Events
59
Intangible Cultural Heritage
Semantic recommendations
Semantic process description
Dynamic information about the video scene
60
Biography Perspective
Semantically annotated
biographies
Links to related works, persons,
places etc.
61
Semantic Kalevala:
The computer ”understands” the national epic Kalevala
Semantically annotated 50 poems of the national epic
- Events and narratives
Translation into modern Finnish
Links to related art etc.
62
CultureSampo in Short
1. Highly cross-domain: 26 content types and 16 metadata schemas
2. Sophisticated semantic annotation models including events and processes
3. Semantic search and recommendation techniques
4. Versatile selection of semantic visualizations (map views, timelines, graphs, process visualization, semantic video viewing)
5. Based on a large, nationwide, collaboratively maintained infrastructure of ontologies and ontology services
6. Includes models of and tools for collaborative semantic content creation
7. Services are available for machines, too.
63
BookSampo
Fiction Literature on the Semantic Web
(Mäkelä et al., ISWC 2011; SWJ 2013)
64
BookSampo – Finnish Fiction Literature on the Semantic Web
Why
– Helping library customers in finding literature and related content
How
– Semantic annotation of all Finnish fiction literature (for adults)
Who
– Finnish Public Libraries and FinnONTO project
When
– In use since 2011 (50 000+ monthly users)
  » http://www.booksampo.fi
(Mäkelä et al., IFLA 2012, SWJ 2012)
Key Idea: Data Services Supporting Linked Data in Applications
BookSampo LD
CultureSampo LD
API
API
YOUR NEXT APPLICATION
API
LDF.fi SPARQL Endpoint
68
WarSampo – Linked Death
Finnish WW2 on the Semantic Web
(Hyvönen et al., ESWC 2016)
69
”We learn from history, that we learn nothing from history”
Georg Wilhelm Friedrich Hegel
?
70
WarSampo: two components
Linked Data Service
– Data Service for Linked Open Data: http://ldf.fi
Applications (7)
– Applications based on the service: http://sotasampo.fi
72
WarSampo Linked Open Data Cloud
7.6 million triples in the knowledge graph
73
Conceptual Data Model Core: Extending CIDOC CRM
74
Semantic Portal http://sotasampo.fi/en
7 Perspectives to War
1. Events
2. Persons
3. Army Units
4. Places
5. Deaths
6. Memoirs
7. Photos
In-use semantic portal 2015
Nearly 20 000 end-users during first 3 days
More info: [Hyvönen et al., ESWC 2016; Koho et al., WHiSe 2016]
75
Use Cases
Providing massive amounts of heterogeneous WW2 data
Reassembling the lives of the dead
Supporting Digital Humanities Research
77
Using the Data in Digital Humanities
- Data analysis & visualizations
Casualties of the 33rd infantry regiment
78
Conclusions: Semantic Web Makes a Difference
End-user’s perspective
– Global view to heterogeneous, distributed contents
– Automatic content aggregation
– Semantic search
– Semantic browsing and recommendations
– Other intelligent services (knowledge discovery, personalization, visualization, …)
Content publisher’s perspective
– Distributed content creation
– Enriching each other’s contents semantically
– Automated link maintenance
– Shared content publication channel
– Reusing aggregated content in other applications
79
But the Lunch is not Free
More collaboration is needed -> complicates work
Integration of semantic portals with legacy systems
Manual annotations are costly and may not scale up
Automatic annotation lowers data quality
80
”Intellectuals solve problems - geniuses prevent them”
Albert Einstein
Key Lesson Learned: create high-quality semantic data when cataloging
81
http://seco.cs.aalto.fi/publications
https://www.amazon.com/Publishing-Cultural-Heritage-Synthesis-Technology/dp/1608459977