Multimedia in Folksonomies
Trend Detection in Folksonomies
Andreas Hotho, Robert Jäschke, Christoph Schmitz, Gerd Stumme
University of Kassel, Germany

Idea
• Creating proper semantic metadata is expensive
• Collaborative annotation of multimedia resources works in practice
• Compute popularity of resources over time: FolkRank
• Track popularity change
• Display winners and losers

Multimedia in Folksonomies: Web Bookmarks
[Screenshot; callouts mark User, Resource, Tags]

Multimedia in Folksonomies: Audio Streams
[Screenshot; callouts mark Tags, Users, Resource]

Multimedia in Folksonomies: Photographs
[Screenshot; callouts mark User, Resource, Tags]

Multimedia in Folksonomies: Video
[Screenshot; callouts mark Resource, User, Tags]

Multimedia in Folksonomies
• Many more resource types:
  – News: digg.com
  – Blog entries: technorati.com
  – Geo coordinates: tagzania.com
  – Catalog items: amazon.com
  – Personal goals: 43things.com
  – Intranet contents: IBM, Microsoft

Folksonomy Model
• A folksonomy F = (U, T, R, Y) consists of
  – Users U
  – Tags T
  – Resources R
  – Tag assignments Y ⊆ U × T × R
• Can be seen as
  – ternary relation
  – tripartite hypergraph G = ((U ∪ T ∪ R), Y)
[Figure: hypergraph with nodes User 1, User 2, User 3, Tag 2, Tag 3, Res 1, Res 2, Res 3]

Folksonomy Model
• Tagging = adding tag assignments to Y:

  U         T            R
  schmitz   2006         http://www.samt2006.org
  schmitz   conference   http://www.samt2006.org
  schmitz   event        http://www.samt2006.org
  schmitz   samt         http://www.samt2006.org

Ranking in Folksonomies: FolkRank
• PageRank in the web: pages are important if many important pages link to them
• User preferences can be taken into account
• Folksonomy: spread authority along hyperedges. Resources are important if they are tagged by important users with important tags
• Symmetrically for users and tags

Ranking in Folksonomies: FolkRank
Problems of folksonomy-adapted PageRank
● dominated by graph structure
● undirected: weight flows back
➔ Differential approach
● compute rank with and without preferences
● FolkRank = difference between those rankings
● normalized to [0,1]
Details: Hotho et al. Information Retrieval in Folksonomies: Search and Ranking. Proc. ESWC 2006, Budva, Montenegro, June 2006.

Evaluation Dataset
● Data from the del.icio.us folksonomy site
● Monthly snapshots by timestamp
● June 2004 to July 2005
● The July 2005 snapshot consists of
  ● |U| = 70,581 users
  ● |T| = 434,187 tags
  ● |R| = 3,978,927 resources
  ● |Y| = 16,236,429 tag assignments

Baseline (w/o empty tag): Top Tags
[Plot: FolkRank (0.05 to 0.4) over Month (2 to 14) for the tags art, blog, css, design, java, linux, music, news, politics, programming, software, web]

Baseline (w/o empty tag): Politics
[Plot: FolkRank (0.05 to 0.4) over Month (2 to 14) for the tag politics; annotation: US Election November 2004]

Trends with respect to tag “politics”
[Plot: FolkRank (0.001 to 0.011) over Month (2 to 14) for the tags activism, bushco, bush, election, humor, iraq, media, usa, war]

Trends with respect to tag “politics”: Losers
[Plot: same axes, tags activism, bushco, humor, war]

Trends with respect to tag “politics”: US Election
[Plot: same axes, tags bush, election; annotation: US Election November 2004]

Trends with respect to tag “politics”: Unchanged
[Plot: same axes, tags iraq, media, usa]

Trend Detection: Popularity Change
• Track movements in FolkRank weighting
• Determine winners and losers
• Popularity change between times t0 and t1:

  \[ pc_{t_0,t_1}(x) := \frac{f_1(x) + c}{f_0(x) + c} \cdot \frac{n_1}{n_0} \]

• f0, f1: normalized FolkRank at t0, t1
• c: damping constant (fixes the case fi = 0)
• n0, n1: number of elements at times t0, t1
• Non-existing elements x at time ti: fi(x) := 0

Popularity Change
[Figure: FolkRank values f0 and f1 over time on a scale from low (0) through 0.01 and 0.1 to high (1); arrows labeled large popularity gain, small popularity gain, large popularity gain, popularity loss]

Pop. Change, Semantic Web, May/June 2005
(PC: popularity change, f: normalized FolkRank, r: rank; "-" means the resource did not exist in May)

  PC      f (May)  r (May)  f (June)  r (June)  Resource
  5.594   0.001    28598    0.389     4         http://shirky.com/writings/ontology_overrated.html
  3.279   0.286    21       1.000     1         http://simile.mit.edu/piggy-bank/
  2.581   0.000    46028    0.124     39        http://mezzoblue.com/downloads/markupguide/
  2.252   0.044    377      0.182     24        http://simile.mit.edu/piggy-bank/index.html
  2.029   -        -        0.076     79        http://www.betaversion.org/~stefano/linotype/news/89/
  1.873   -        -        0.063     104       http://www.alvit.de/web-dev/
  1.814   -        -        0.058     130       http://www.w3.org/2005/Talks/05-steven-xtech/
  1.802   -        -        0.057     134       http://thecommunityengine.com/home/archives/2005/05/xfolk_entry_04.html
  1.755   -        -        0.053     150       http://hardware.slashdot.org/article.pl?sid=05/05/27/1253220&from=rss
  1.754   -        -        0.053     151       http://www.neuroticweb.com/recursos/del.icio.us-graphs/
  1.391   0.028    713      0.055     139       http://www.semanticwebsearch.com/
  1.299   0.037    494      0.054     143       http://www.w3.org/2004/Talks/17Dec-sparql/
  1.201   0.081    142      0.089     61        http://www.dlib.org/dlib/april05/hammond/04hammond.html

Callouts on the same table:
• Rant: Ontologies are Overrated (http://shirky.com/writings/ontology_overrated.html)
• Piggy Bank: SemWeb Tool @ ESWC 2005 (http://simile.mit.edu/piggy-bank/)
• Article: Social Bookmarking Tools (http://www.dlib.org/dlib/april05/hammond/04hammond.html)

Conclusion
• Trend detection on triadic folksonomy data
  – works on all kinds of content
  – users contribute metadata
• FolkRank
  – Spreading weights across hyperedges
  – Rank by user preferences
• Trend Detection
  – Track development of FolkRank over time
  – Popularity change measure
Try our system: www.bibsonomy.org

Backup Slides

FolkRank: Details
• PageRank in the web: pages are important if many important pages link to them
• Authority values in a folksonomy are propagated along the hypergraph structure of the folksonomy. Resources are important if they are tagged by important users with important tags
• Similar for users and tags

FolkRank: Details
• The set V of nodes is the disjoint union of the sets of tags, users and resources: V = U ∪ T ∪ R
• All co-occurrences of tags and users, users and resources, tags and resources become edges between the respective nodes: E = {{u,t}, {t,r}, {u,r} | (u,t,r) ∈ Y}
• Each edge {u,t} is weighted with |{r ∈ R: (u,t,r) ∈ Y}|
• Similar for {t,r} and {u,r} edges

FolkRank: Details
Original version of PageRank: fixed point R of the weight spreading
  R ← c (A R + P)
• A: degree-weighted adjacency matrix reflecting the graph
• P = α ⋅ 1, with damping factor α
We spread weight as follows:
  R ← c (α R + β A R + γ P)

FolkRank: Details
Problem with the original PageRank version:
• Graph is undirected → weight flows in one direction and directly “swashes back”
Idea to solve this: apply a differential approach
• Let R_AP be the fixed point with γ = 0.
• Let R_pref be the fixed point with γ > 0.
• R := R_pref − R_AP is the final weight vector.
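The backup slides on FolkRank can be turned into a small sketch. This is not the authors' implementation, and several points are assumptions the slides leave open: each node passes its weight to its neighbours in proportion to edge weight (one reading of the "degree-weighted" matrix A), c rescales R so that its entries sum to 1, and the values of α, β, γ and the iteration count are arbitrary. Apart from the two schmitz rows, the tag assignments are made-up toy data.

```python
from collections import defaultdict

def build_graph(assignments):
    """Weighted undirected graph from tag assignments (u, t, r).

    The edge {u, t} gets weight |{r : (u, t, r) in Y}|, and likewise
    for {t, r} and {u, r}, as described on the slide.
    """
    weights = defaultdict(float)
    for u, t, r in set(assignments):
        nodes = (("user", u), ("tag", t), ("res", r))
        for a, b in ((0, 1), (1, 2), (0, 2)):
            weights[(nodes[a], nodes[b])] += 1.0
            weights[(nodes[b], nodes[a])] += 1.0
    neighbors = defaultdict(dict)
    for (v, w), weight in weights.items():
        neighbors[v][w] = weight
    return dict(neighbors)

def spread(graph, pref, alpha, beta, gamma, iterations=100):
    """Iterate R <- c (alpha R + beta A R + gamma P); c keeps sum(R) = 1 (assumption)."""
    nodes = list(graph)
    rank = {v: 1.0 / len(nodes) for v in nodes}
    degree = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iterations):
        new = {v: alpha * rank[v] + gamma * pref.get(v, 0.0) for v in nodes}
        for v in nodes:
            for w, weight in graph[v].items():
                # v hands a share of its weight to neighbour w
                new[w] += beta * rank[v] * weight / degree[v]
        total = sum(new.values())
        rank = {v: x / total for v, x in new.items()}
    return rank

def folkrank(graph, pref, alpha=0.2, beta=0.5, gamma=0.3):
    """Differential approach: R := R_pref - R_AP (parameter values are hypothetical)."""
    r_ap = spread(graph, pref, alpha, beta, 0.0)      # gamma = 0
    r_pref = spread(graph, pref, alpha, beta, gamma)  # gamma > 0
    return {v: r_pref[v] - r_ap[v] for v in graph}

# Toy folksonomy; only the schmitz rows come from the slides.
Y = [
    ("schmitz", "samt", "http://www.samt2006.org"),
    ("schmitz", "conference", "http://www.samt2006.org"),
    ("user_b", "conference", "http://example.org/a"),
    ("user_b", "folksonomy", "http://example.org/b"),
    ("user_c", "folksonomy", "http://example.org/b"),
]
graph = build_graph(Y)
pref = {("tag", "folksonomy"): 1.0}  # preference vector P, sums to 1
fr = folkrank(graph, pref)
for node, value in sorted(fr.items(), key=lambda kv: -kv[1])[:3]:
    print(node, round(value, 4))
```

Both fixed points sum to 1, so the differences sum to zero: nodes near the preferred tag end up positive and the rest negative. The final normalization to [0,1] mentioned on the ranking slide is left out here.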
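The popularity change measure from the Trend Detection slide can be sketched the same way, reading it as pc(x) = (f1(x) + c) / (f0(x) + c) ⋅ n1 / n0. The slides do not state the value of c or the element counts n0 and n1. The numbers below (c = 0.1, n1/n0 = 1.149) are back-calculated guesses that roughly reproduce the PC column of the May/June 2005 table, so treat them as assumptions rather than the authors' settings. The short resource names are labels for three rows of that table.

```python
def popularity_change(f0, f1, n0, n1, c):
    """pc(x) = (f1(x) + c) / (f0(x) + c) * n1 / n0 for every element x.

    f0, f1: dicts of normalized FolkRank values at times t0 and t1.
    Elements missing at one of the two times count as FolkRank 0.
    """
    elements = set(f0) | set(f1)
    return {
        x: (f1.get(x, 0.0) + c) / (f0.get(x, 0.0) + c) * n1 / n0
        for x in elements
    }

# f values taken from three rows of the May/June 2005 table.
f_may = {"shirky/ontology_overrated": 0.001, "simile/piggy-bank": 0.286}
f_june = {
    "shirky/ontology_overrated": 0.389,
    "simile/piggy-bank": 1.000,
    "betaversion/linotype": 0.076,  # did not exist in May
}

# ASSUMPTIONS: c and the element counts are not given in the slides.
pc = popularity_change(f_may, f_june, n0=1000, n1=1149, c=0.1)
winners = sorted(pc, key=pc.get, reverse=True)
for name in winners:
    print(f"{pc[name]:.3f}  {name}")
```

Because c is added to both numerator and denominator, a resource that jumps from almost nothing to a moderate FolkRank (the Shirky essay) gets a large but bounded score, and resources that are new in the second month can still be ranked without dividing by zero.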