Flamenco on the Web


Sergio Oramas

Overview
• Structured vs. unstructured data
• Flamenco on the Web
• FlaBase: A Flamenco Music Knowledge Base

Data Sources: Structured vs. Unstructured

Structured Data Sources
• Knowledge bases (DBpedia, Freebase, Wikidata, etc.)
• Databases
• Web APIs
• Linked Data
• Markups (schema.org)

Knowledge Bases: a Pragmatic Definition
A knowledge base (KB) is a comprehensive, semantically organized, machine-readable collection of universally relevant or domain-specific entities, classes, and facts (attributes, relations)
• plus spatial and temporal dimensions
• plus commonsense properties and rules
• plus contexts of entities and facts (textual & visual witnesses, descriptors, statistics)
• plus …

Today's Knowledge Bases
• More than 60 billion subject-predicate-object triples from more than 1,000 sources, plus Web tables
• Knowledge bases named on the slide: ReadTheWeb, BabelNet, SUMO, ConceptNet 5, WikiTaxonomy/
• Cloud diagram from http://lod-cloud.net/
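
To make the subject-predicate-object idea concrete, here is a minimal sketch of a triple collection with wildcard pattern matching in plain Python. The identifiers and predicate names (`camaron_de_la_isla`, `birthPlace`, `locatedIn`, and so on) are invented for illustration and are not taken from DBpedia or FlaBase; the facts themselves come from the Camarón de la Isla biography quoted in this deck.

```python
from typing import Iterator, List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers and predicates, for illustration only.
TRIPLES: List[Triple] = [
    Triple("camaron_de_la_isla", "type", "Artist"),
    Triple("camaron_de_la_isla", "birthPlace", "san_fernando"),
    Triple("camaron_de_la_isla", "birthYear", "1950"),
    Triple("san_fernando", "type", "Location"),
    Triple("san_fernando", "locatedIn", "cadiz"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield the triples that fit the pattern; None acts as a wildcard."""
    for triple in triples:
        if subject is not None and triple.subject != subject:
            continue
        if predicate is not None and triple.predicate != predicate:
            continue
        if obj is not None and triple.obj != obj:
            continue
        yield triple


if __name__ == "__main__":
    for triple in match(TRIPLES, subject="camaron_de_la_isla"):
        print(triple)
```

Real knowledge bases store the same kind of statements as RDF, with URIs instead of short strings.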

Music Knowledge Bases
• Structured information about music is incomplete
  – Only popular bands and Western music
  – Only editorial and some biographical information

Knowledge Extraction
• A huge amount of music information remains implicit in unstructured texts
  – Artist biographies
  – Album reviews
  – Band pages
  – Users' posts

Unstructured Data Sources
• Web pages
• Social networks
• Blogs
• Forums

Semantic Web
• The Semantic Web aims at converting the current web, dominated by unstructured and semi-structured documents, into a web of linked data.
• Achievements
  – Common framework for data representation and interconnection (RDF, ontologies)
  – Semantic technologies to annotate texts (Entity Linking)
  – Language for complex queries (SPARQL)
  – Linked Open Data Cloud

Wikipedia and DBpedia
• Wikipedia: digital encyclopedia, unstructured, keyword search
• DBpedia: knowledge base, structured, query search

DBpedia
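
Because DBpedia is structured, a question such as "cantaores born in Sevilla in the 1950s" can be posed as a SPARQL query instead of a keyword search. The sketch below only builds the query text and does not contact any endpoint. It assumes that flamenco singers can be approximated by `dbo:genre dbr:Flamenco` and that birth data is recorded with `dbo:birthPlace` and `dbo:birthDate`; coverage of these properties in DBpedia varies, so the modelling is an illustration rather than a tested recipe. The function name `build_birth_query` is hypothetical.

```python
PREFIXES = """\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
"""


def build_birth_query(place: str, first_year: int, last_year: int) -> str:
    """Build a SPARQL query for flamenco artists born in `place` between two years.

    `place` is a DBpedia resource name such as "Seville" (assumed modelling).
    """
    return PREFIXES + f"""
SELECT DISTINCT ?artist WHERE {{
  ?artist dbo:genre dbr:Flamenco ;
          dbo:birthPlace dbr:{place} ;
          dbo:birthDate ?date .
  FILTER (?date >= "{first_year}-01-01"^^xsd:date &&
          ?date <= "{last_year}-12-31"^^xsd:date)
}}
"""


if __name__ == "__main__":
    print(build_birth_query("Seville", 1950, 1959))
```

The resulting string could be sent to a SPARQL endpoint with any HTTP client; that step is left out here to keep the example self-contained.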
• DBpedia example queries
  – Cantaores born in Sevilla in the 1950s
  – Guitarists that collaborated with cantaores from Huelva
• DBpedia graph applications
  – Entity relevance
  – Entity similarity
  – Entity recommendation

Google Knowledge Graph

Applications
• Complex queries and Q&A
• Music recommendation and discovery
• Music browsing and search
• Artist similarity
• Music dissemination
• Information visualization

Flamenco on the Web
• Information is spread across different websites
• Most of the information is available only in Spanish
• There is no trustworthy and complete repository
• MusicBrainz and Wikipedia have little information about flamenco

Why a Knowledge Base of Flamenco Music?
• Collect dispersed information
• Create a central, trustworthy and multilingual repository about flamenco
• Applications
  – Culture dissemination
  – Research dataset
  – Discovery of new knowledge from the data

Examples of Culture-specific Knowledge Bases
• Dunya
• Linked Jazz

Flamenco Data Sources: Structured
• DBpedia
• MusicBrainz
• Freebase
• Proprietary databases

Flamenco Data Sources: Unstructured
• Flamenco websites
  – Biographical
  – Discographical
  – Institutional / educational
  – Peñas (flamenco associations)
  – Journals / magazines
  – Forums / social networks
  – Blogs

Some Flamenco Links
• centroandaluzflamenco/Directorio/listadolinks.php
• www.andalucia.org/en/flamenco/ (English)
• theflamencoworld.com/ (English)
• www.elartedevivirelflamenco.com/
• ares.cnice.mec.es/flamenco
• discografiasflamencas.blogspot.com.es/
• flun.cica.es/index.php/grabaciones/base-datos-grabaciones
• www.deflamenco.com/
• losfardos.blogspot.com.es/

FlaBase: A Flamenco Music Knowledge Base

Methodology

Data Acquisition
• Wikipedia/DBpedia: 438 resources in Spanish + 281 in English
• Andalucia.org: 422 artist biographies (English and Spanish), 76 palos
• El arte de vivir el flamenco: 749 artist biographies (Spanish)
• MusicBrainz: 814 releases and 9,942 songs
• CICA database: 2,099 releases and 4,136 songs

Entity Resolution
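
The same artist can appear under slightly different names across the sources listed above, so records have to be matched before they are merged. Below is a minimal sketch of pair-wise name matching. The deck does not say which similarity measure or threshold was used; `difflib.SequenceMatcher` and the 0.9 cut-off are stand-ins chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import List, Tuple

THRESHOLD = 0.9  # assumed cut-off, not a value reported in the slides


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two names."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def same_entity(a: str, b: str, threshold: float = THRESHOLD) -> bool:
    """Decide whether two names are close enough to refer to one entity."""
    return similarity(a, b) >= threshold


def candidate_pairs(names: List[str], threshold: float = THRESHOLD) -> List[Tuple[str, str]]:
    """Compare every pair of names and keep the pairs judged equivalent."""
    return [(a, b) for a, b in combinations(names, 2) if same_entity(a, b, threshold)]


if __name__ == "__main__":
    names = [
        "Niña de la Puebla",
        "La Niña de la Puebla",
        "Paco de Lucía",
        "Pepe de Lucía",
    ]
    for a, b in combinations(names, 2):
        print(f"{a!r} vs {b!r}: {similarity(a, b):.2f}")
    print(candidate_pairs(names))
```

With this threshold the two spellings of Niña de la Puebla are merged while Paco de Lucía and Pepe de Lucía stay separate. A lower threshold would start to confuse them, which is why the choice needs evaluation against annotated pairs.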
• Pair-wise string similarity
• F-measure
• E.g.: Niña de la Puebla = La Niña de la Puebla; Paco de Lucía ≠ Pepe de Lucía

Knowledge Extraction: Entity Linking
• Entity linking (EL) is the task of associating a textual fragment with the most suitable entry in a reference knowledge base.
• Two tasks
  – Named Entity Recognition
  – Named Entity Disambiguation
• Example text (Camarón de la Isla): "...This is how the end of the relation with Paco de Lucía come to pass..."
• State-of-the-art systems for EL: Babelfy, Tagme, DBpedia Spotlight

Knowledge Extraction: Entity Linking
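
One simple way to link mentions when the target knowledge base is small and domain-specific is a dictionary (gazetteer) lookup. The sketch below is only that: it is not one of the systems named above, and nothing in the deck says it matches the approaches that were evaluated. The gazetteer entries, entity identifiers and the function name `link_entities` are hypothetical, and disambiguation is reduced to a one-to-one lookup.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-gazetteer: surface form -> (entity id, entity type).
GAZETTEER: Dict[str, Tuple[str, str]] = {
    "camarón de la isla": ("camaron_de_la_isla", "artist"),
    "paco de lucía": ("paco_de_lucia", "artist"),
    "san fernando": ("san_fernando", "location"),
    "cádiz": ("cadiz", "location"),
    "bulerías": ("bulerias", "palo"),
}

Mention = Tuple[int, int, str, str, str]  # start, end, surface text, id, type


def link_entities(text: str, gazetteer: Dict[str, Tuple[str, str]] = GAZETTEER) -> List[Mention]:
    """Find non-overlapping gazetteer matches, preferring longer surface forms."""
    mentions: List[Mention] = []
    for surface in sorted(gazetteer, key=len, reverse=True):
        # Match whole words only, ignoring case.
        pattern = r"(?<!\w)" + re.escape(surface) + r"(?!\w)"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            overlaps = any(m.start() < end and m.end() > start for start, end, *_ in mentions)
            if not overlaps:
                mentions.append((m.start(), m.end(), m.group(0), *gazetteer[surface]))
    return sorted(mentions)


if __name__ == "__main__":
    sentence = "This is how the end of the relation with Paco de Lucía come to pass"
    for mention in link_entities(sentence):
        print(mention)
```

A production linker would also need to handle name variants and ambiguous surface forms, which is where the disambiguation step comes in.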
• Creation of a domain-specific entity linking tool that identifies FlaBase entities in Spanish texts
• Identification of artists, palos and locations
• Evaluation of 3 different approaches on 49 manually annotated entities in 3 artist biographies

Knowledge Extraction: Event Extraction
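
Once places can be recognized in a biography, a birth event can be read off with a trigger-word heuristic: locate the word nació (was born) and take the place name and the year that lie closest to it. The sketch below illustrates that idea under simplifying assumptions: the place list is a tiny hypothetical gazetteer standing in for the entity linking tool, distance is measured in characters, and years are limited to four-digit numbers from 1800 to 2099.

```python
import re
from typing import Dict, List, Optional, Tuple

PLACES = ["San Fernando", "Cádiz", "Sevilla", "Huelva"]  # hypothetical gazetteer
YEAR = re.compile(r"\b(?:1[89]\d{2}|20\d{2})\b")
TRIGGER = re.compile(r"\bnació\b", re.IGNORECASE)


def nearest(position: int, candidates: List[Tuple[int, object]]):
    """Return the value whose start offset is closest to `position`, or None."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: abs(c[0] - position))[1]


def extract_birth(text: str, places: List[str] = PLACES) -> Optional[Dict[str, object]]:
    """Return the place and year nearest to the first 'nació', or None if absent."""
    trigger = TRIGGER.search(text)
    if trigger is None:
        return None
    place_hits = [
        (m.start(), place)
        for place in places
        for m in re.finditer(re.escape(place), text)
    ]
    year_hits = [(m.start(), int(m.group())) for m in YEAR.finditer(text)]
    return {
        "place": nearest(trigger.start(), place_hits),
        "year": nearest(trigger.start(), year_hits),
    }


if __name__ == "__main__":
    biography = (
        "CAMARÓN DE LA ISLA, nació en San Fernando (Cádiz) en el número 29 "
        "de la calle del Carmen, el día 5 de Diciembre de 1950"
    )
    print(extract_birth(biography))
```

Taking the single nearest candidate keeps precision high when the birth sentence is well formed, but the function returns nothing for biographies that phrase the birth differently, which limits recall.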
• Extraction of year and place of birth from every biography
  – Use of our entity linking tool
  – Nearest place entity and year to the word nació (was born)
  – Evaluation over 442 manually annotated birth places: precision 0.92, recall 0.65
• Example: "CAMARÓN DE LA ISLA, nació en San Fernando (Cádiz) en el número 29 de la calle del Carmen, el día 5 de Diciembre de 1950"
• Populate the knowledge base with structured knowledge

Application: Artist Relevance
• Creation of a graph of entity mentions using our entity linking tool
• Graph of entity mentions = graph of hyperlinks
• Calculate entity relevance using PageRank and HITS

Artist Relevance
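
If each biography is a node and each linked mention of another artist is an edge, relevance can be scored the way web pages are. The sketch below is a plain power-iteration PageRank over such a mention graph. The damping factor 0.85 and the iteration count are conventional defaults rather than settings reported in the deck, the toy mention graph is invented for illustration, and HITS is not covered.

```python
from typing import Dict, List


def pagerank(
    graph: Dict[str, List[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Compute PageRank by power iteration.

    `graph` maps each entity to the list of entities mentioned in its text.
    """
    nodes = set(graph) | {target for targets in graph.values() for target in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Entities that mention nobody spread their score evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for source, targets in graph.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Toy mention graph with made-up edges, for illustration only.
MENTIONS: Dict[str, List[str]] = {
    "Camarón de la Isla": ["Paco de Lucía"],
    "Pepe de Lucía": ["Paco de Lucía"],
    "Paco de Lucía": ["Camarón de la Isla"],
    "Niña de la Puebla": [],
}

if __name__ == "__main__":
    scores = pagerank(MENTIONS)
    for artist in sorted(scores, key=scores.get, reverse=True):
        print(f"{artist}: {scores[artist]:.3f}")
```

In this toy graph the artist mentioned by the most biographies ends up with the highest score, which is the intuition behind using mentions as a proxy for relevance.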
• Flamenco expert evaluation
• PageRank top-5 artists and precision values

FlaBase
• Data gathered
  – 1,102 artists (text biography)
  – 74 palos (flamenco genres)
  – 2,860 albums
  – 13,311 tracks
  – 771 Andalusian locations
• Knowledge extracted
  – Place of birth
  – Date of birth
  – Entity mentions in text

FlaBase
• Number of artists by year of birth

FlaBase: A Flamenco Music Knowledge Base
http://mtg.upf.edu/download/datasets/flabase
Oramas S., Gómez F., Gómez E., Mora J. (2015). FlaBase: Towards the creation of a Flamenco Music Knowledge Base. 16th International Society for Music Information Retrieval Conference (ISMIR 2015).
