Multimedia in Folksonomies

Trend Detection in Folksonomies
Andreas Hotho
Robert Jäschke
Christoph Schmitz
Gerd Stumme
University of Kassel, Germany
Idea
• Creating proper semantic metadata is expensive
• Collaborative annotation of multimedia resources works in practice
• Compute popularity of resources over time: FolkRank
• Track popularity change
• Display winners and losers
Multimedia in Folksonomies: Web Bookmarks
[Figure: example with User, Resource, and Tags marked]
Multimedia in Folksonomies: Audio Streams
[Figure: example with Tags, Users, and Resource marked]
Multimedia in Folksonomies: Photographs
[Figure: example with User, Resource, and Tags marked]
Multimedia in Folksonomies: Video
[Figure: example with Resource, User, and Tags marked]
Multimedia in Folksonomies
• Many more resource types:
– News: digg.com
– Blog entries: technorati.com
– Geo coordinates: tagzania.com
– Catalog items: amazon.com
– Personal goals: 43things.com
– Intranet contents: IBM, Microsoft
Folksonomy Model
• A folksonomy F = (U, T, R, Y) consists of
– Users U
– Tags T
– Resources R
– Tag assignments Y ⊆ U × T × R
• Can be seen as
– ternary relation
– tripartite hypergraph G = ((U ∪ T ∪ R), Y)
[Figure: example hypergraph linking User 1, User 2, User 3, Tag 2, Tag 3, Res 1, Res 2, Res 3]
Folksonomy Model
• Tagging = adding tag assignments to Y:
U | T | R
schmitz | 2006 | http://www.samt2006.org
schmitz | conference | http://www.samt2006.org
schmitz | event | http://www.samt2006.org
schmitz | samt | http://www.samt2006.org
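The model and the tagging example can be sketched in a few lines of Python. This is only an illustration, assuming tag assignments are held in memory as (user, tag, resource) triples; the names `TagAssignment` and `folksonomy` are hypothetical and do not come from the slides.

```python
from typing import NamedTuple


class TagAssignment(NamedTuple):
    """One element of Y: a (user, tag, resource) triple."""
    user: str
    tag: str
    resource: str


def folksonomy(assignments):
    """Derive F = (U, T, R, Y) from an iterable of (user, tag, resource) triples."""
    Y = {TagAssignment(*a) for a in assignments}  # a set: Y is a relation
    U = {y.user for y in Y}
    T = {y.tag for y in Y}
    R = {y.resource for y in Y}
    return U, T, R, Y


# The four tag assignments from the tagging example.
U, T, R, Y = folksonomy([
    ("schmitz", "2006", "http://www.samt2006.org"),
    ("schmitz", "conference", "http://www.samt2006.org"),
    ("schmitz", "event", "http://www.samt2006.org"),
    ("schmitz", "samt", "http://www.samt2006.org"),
])
print(len(U), len(T), len(R), len(Y))
```

Storing Y as a set mirrors the definition as a ternary relation: tagging the same resource twice with the same tag by the same user does not add a new element.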
Ranking in Folksonomies: FolkRank
• PageRank on the web: pages are important if many important pages link to them
• User preferences can be taken into account
• Folksonomy: spread authority along hyperedges
➔ Resources are important if they are tagged by important users with important tags
• Symmetrically for users and tags
Ranking in Folksonomies: FolkRank
• Problems of folksonomy-adapted PageRank
– dominated by graph structure
– undirected: weight flows back
➔ Differential approach
– compute rank with and without preferences
– FolkRank = difference between those rankings
– normalized to [0,1]
Details: Hotho et al. Information Retrieval in Folksonomies: Search and Ranking. Proc. ESWC 2006, Budva, Montenegro, June 2006.
Evaluation Dataset
• Data from the del.icio.us folksonomy site
• Monthly snapshots by timestamp
• June 2004 to July 2005
• July 2005 consists of
– |U| = 70,581 users
– |T| = 434,187 tags
– |R| = 3,978,927 resources
– |Y| = 16,236,429 tag assignments
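Splitting a timestamped dataset into monthly snapshots could look roughly like the sketch below. It assumes each tag assignment carries a `datetime` timestamp and that a snapshot contains every assignment made up to the end of that month; the slides do not say whether the snapshots are cumulative, so that part is an assumption, as are the function name `monthly_snapshots` and the sample data.

```python
from collections import defaultdict
from datetime import datetime


def monthly_snapshots(assignments):
    """Map (year, month) to the set of (user, tag, resource) triples seen so far.

    assignments: iterable of (user, tag, resource, timestamp) tuples.
    Assumption: snapshots are cumulative. Months without any new
    assignment get no entry of their own.
    """
    by_month = defaultdict(set)
    for user, tag, resource, ts in assignments:
        by_month[(ts.year, ts.month)].add((user, tag, resource))

    snapshots, seen = {}, set()
    for month in sorted(by_month):
        seen |= by_month[month]            # add this month's assignments
        snapshots[month] = frozenset(seen)  # freeze a copy for this month
    return snapshots


# Hypothetical sample data, not taken from the del.icio.us dataset.
data = [
    ("alice", "music", "http://example.org/a", datetime(2004, 6, 3)),
    ("bob", "news", "http://example.org/b", datetime(2004, 7, 9)),
    ("alice", "blog", "http://example.org/a", datetime(2004, 7, 20)),
]
snaps = monthly_snapshots(data)
for month in sorted(snaps):
    print(month, len(snaps[month]))
```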
Baseline (w/o empty tag): Top Tags
[Figure: FolkRank (y-axis, 0.05 to 0.4) by month (x-axis, 2 to 14) for the tags art, blog, css, design, java, linux, music, news, politics, programming, software, web]
Baseline (w/o empty tag): Politics
[Figure: FolkRank (y-axis, 0.05 to 0.4) by month (x-axis, 2 to 14) for the tag politics; annotation: US Election, November 2004]
Trends with respect to tag “politics”
[Figure: FolkRank (y-axis, 0.001 to 0.011) by month (x-axis, 2 to 14) for the tags activism, bushco, bush, election, humor, iraq, media, usa, war]
Trends with respect to tag “politics”: Losers
[Figure: FolkRank (y-axis, 0.001 to 0.011) by month (x-axis, 2 to 14) for the tags activism, bushco, humor, war]
Trends with respect to tag “politics”: US Election
[Figure: FolkRank (y-axis, 0.001 to 0.011) by month (x-axis, 2 to 14) for the tags bush, election; annotation: US Election, November 2004]
Trends with respect to tag “politics”: Unchanged
[Figure: FolkRank (y-axis, 0.001 to 0.011) by month (x-axis, 2 to 14) for the tags iraq, media, usa]
Trend Detection: Popularity Change
• Track movements in FolkRank weighting
• Determine winners and losers
• Popularity change between times t0 and t1:
pc_{t0,t1}(x) := (f1(x) + c) / (f0(x) + c) ⋅ n1 / n0
• f0, f1: normalized FolkRank at t0, t1
• c: damping constant, fix for fi = 0
• n0, n1: no. of elements at times t0, t1
• Non-existing elements x at time ti: fi(x) := 0
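The popularity change measure is easy to try out on toy numbers. The sketch below assumes the reading pc(x) = (f1(x) + c) / (f0(x) + c) ⋅ n1 / n0, with the FolkRank values given as dictionaries; the function names, the value of `c`, and all sample numbers are hypothetical and only serve to illustrate the arithmetic.

```python
def popularity_change(f0, f1, n0, n1, c):
    """Popularity change of every element present at t0 or t1.

    f0, f1: {element: normalized FolkRank} at times t0 and t1.
    n0, n1: number of elements at t0 and t1.
    c:      positive damping constant; keeps the ratio finite when f0(x) = 0.
    """
    if c <= 0:
        raise ValueError("c must be positive")
    elements = set(f0) | set(f1)
    return {
        # Elements missing at one of the two times count as FolkRank 0.
        x: (f1.get(x, 0.0) + c) / (f0.get(x, 0.0) + c) * n1 / n0
        for x in elements
    }


def winners_and_losers(pc, k=1):
    """Return the k largest and the k smallest popularity changes."""
    ranked = sorted(pc.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k], ranked[-k:][::-1]


# Hypothetical values: "a" keeps its weight, "b" disappears, "c" is new.
f0 = {"a": 0.5, "b": 0.1}
f1 = {"a": 0.5, "c": 0.2}
pc = popularity_change(f0, f1, n0=2, n1=2, c=0.1)
winners, losers = winners_and_losers(pc)
print(winners, losers)
```

An element that is new at t1 divides by c alone, so a smaller c rewards newcomers more strongly; that trade-off is why c is exposed as a parameter here rather than fixed.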
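Popularity Change
[Figure: FolkRank values f0 and f1 of elements at two points in time, on a scale from low: 0 to high: 1 with marks at 0.01 and 0.1; the transitions from f0 to f1 are labeled “large popularity gain”, “small popularity gain”, “large popularity gain”, and “popularity loss”]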
Pop. Change, Semantic Web, May/June 2005
Resource | PC | f (May) | r (May) | f (June) | r (June)
http://shirky.com/writings/ontology_overrated.html | 5.594 | 0.001 | 28598 | 0.389 | 4
http://simile.mit.edu/piggy-bank/ | 3.279 | 0.286 | 21 | 1.000 | 1
http://mezzoblue.com/downloads/markupguide/ | 2.581 | | | 0.124 | 39
http://simile.mit.edu/piggy-bank/index.html | 2.252 | 0.044 | 377 | 0.182 | 24
http://www.betaversion.org/~stefano/linotype/news/89/ | 2.029 | - | - | 0.076 | 79
http://www.alvit.de/web-dev/ | 1.873 | 0.000 | 46028 | 0.063 | 104
http://www.w3.org/2005/Talks/05-steven-xtech/ | 1.814 | - | - | 0.058 | 130
http://thecommunityengine.com/home/archives/2005/05/xfolk_entry_04.html | 1.802 | - | - | 0.057 | 134
http://hardware.slashdot.org/article.pl?sid=05/05/27/1253220&from=rss | 1.755 | - | - | 0.053 | 150
http://www.neuroticweb.com/recursos/del.icio.us-graphs/ | 1.754 | - | - | 0.053 | 151
http://www.semanticwebsearch.com/ | 1.391 | 0.028 | 713 | 0.055 | 139
http://www.w3.org/2004/Talks/17Dec-sparql/ | 1.299 | 0.037 | 494 | 0.054 | 143
http://www.dlib.org/dlib/april05/hammond/04hammond.html | 1.201 | 0.081 | 142 | 0.089 | 61
Pop. Change, Semantic Web, May/June 2005 (annotated)
• Rant: Ontologies are Overrated (http://shirky.com/writings/ontology_overrated.html)
• Piggy Bank: SemWeb Tool @ ESWC 2005 (http://simile.mit.edu/piggy-bank/)
• Article: Social Bookmarking Tools (http://www.dlib.org/dlib/april05/hammond/04hammond.html)
Conclusion
• Trend detection on triadic folksonomy data
– works on all kinds of content
– users contribute metadata
• FolkRank
– Spreading weights across hyperedges
– Rank by user preferences
• Trend Detection
– Track development of FolkRank over time
– Popularity change measure
Try our system: www.bibsonomy.org
Backup Slides
FolkRank: Details
• PageRank on the web: pages are important if many important pages link to them
• Authority values in a folksonomy are propagated along the hyperlink structure of the folksonomy
➔ Resources are important if they are tagged by important users with important tags
• Similar for users and tags
FolkRank: Details
• Set V of nodes consists of the disjoint union of the sets
of tags, users and resources:
V = U ∪ T ∪ R
• All co-occurrences of tags and users, users and resources, tags and resources become edges between the respective nodes:
E = {{u,t}, {t,r}, {u,r} | (u,t,r) ∈ Y}
• Each edge {u,t} is weighted with |{r ∈ R: (u,t,r) ∈ Y}|
• Similar for {t,r} and {u,r} edges
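The edge construction can be illustrated with a short Python sketch. It assumes Y is available as a collection of (user, tag, resource) triples; the function name `build_edges`, the typed node tuples, and the sample triples are hypothetical choices made for this example.

```python
from collections import Counter


def build_edges(Y):
    """Return {edge: weight} for the undirected graph derived from Y.

    Nodes are (kind, name) pairs so that the union of users, tags, and
    resources stays disjoint even if a user and a tag share a name.
    Each edge is a frozenset of two nodes. The weight of {u,t} is the
    number of resources r with (u,t,r) in Y, and likewise for the others.
    """
    weights = Counter()
    for u, t, r in set(Y):  # Y is a set: duplicates must not inflate weights
        user, tag, res = ("user", u), ("tag", t), ("res", r)
        weights[frozenset((user, tag))] += 1
        weights[frozenset((tag, res))] += 1
        weights[frozenset((user, res))] += 1
    return weights


# Hypothetical tag assignments.
Y = {
    ("u1", "t1", "r1"),
    ("u1", "t1", "r2"),
    ("u2", "t1", "r1"),
}
edges = build_edges(Y)
print(edges[frozenset((("user", "u1"), ("tag", "t1")))])
```

In this toy data, user u1 used tag t1 on two resources, so the {u1, t1} edge carries weight 2, while {u1, r1} carries weight 1.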
FolkRank: Details
Original version of PageRank:
Fixed point R of the weight spreading R ← c(AR + P)
• A: degree-weighted adjacency matrix reflecting the graph
• P = α ⋅ 1, with damping factor α
We spread weight as follows:
R ← c(αR + βAR + γP)
FolkRank: Details
Problem with the original PageRank version:
• Graph is undirected ➔ weight flows in one direction and directly “swashes back”
To solve this, apply a differential approach:
• Let R_AP be the fixed point with γ = 0.
• Let R_pref be the fixed point with γ > 0.
• R := R_pref - R_AP is the final weight vector.
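The differential approach can be sketched in plain Python as follows. This is not the authors' implementation: it assumes that A distributes each node's weight to its neighbours in proportion to edge weight, that c rescales R to sum to 1 after every step, and that a fixed number of iterations is enough to approximate the fixed point. The parameter values and the toy graph are hypothetical, and the final normalization to [0,1] is left out because the slides do not specify it.

```python
def spread(edges, pref, alpha, beta, gamma, iters=200):
    """Iterate R <- c(alpha*R + beta*A*R + gamma*P) and return R as a dict.

    edges: {(a, b): weight} for an undirected graph, each edge listed once.
    pref:  {node: preference weight}; nodes not listed count as 0.
    """
    degree = {}
    for (a, b), w in edges.items():
        degree[a] = degree.get(a, 0.0) + w
        degree[b] = degree.get(b, 0.0) + w

    rank = {n: 1.0 / len(degree) for n in degree}
    for _ in range(iters):
        new = {n: alpha * rank[n] + gamma * pref.get(n, 0.0) for n in degree}
        for (a, b), w in edges.items():
            # Assumption: a node passes its weight on in proportion to edge weight.
            new[a] += beta * rank[b] * w / degree[b]
            new[b] += beta * rank[a] * w / degree[a]
        total = sum(new.values())  # c = 1 / total
        rank = {n: v / total for n, v in new.items()}
    return rank


def folkrank(edges, pref, alpha=0.2, beta=0.5, gamma=0.3):
    """Difference between the ranking with and without preferences."""
    r_ap = spread(edges, pref, alpha, beta, 0.0)      # gamma = 0
    r_pref = spread(edges, pref, alpha, beta, gamma)  # gamma > 0
    return {n: r_pref[n] - r_ap[n] for n in r_ap}


# Toy graph: two triangles joined by one edge, preference on tag "t1".
edges = {
    ("u1", "t1"): 1, ("t1", "r1"): 1, ("u1", "r1"): 1,
    ("u2", "t2"): 1, ("t2", "r2"): 1, ("u2", "r2"): 1,
    ("r1", "u2"): 1,
}
scores = folkrank(edges, pref={"t1": 1.0})
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

Because both rankings sum to 1, the differences sum to 0: nodes close to the preferred tag gain weight at the expense of nodes that rank highly only because of the global graph structure.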
