Introduction –
Advanced Topics in Multimedia Analysis and
Indexing (aMMAI)
Winston H. Hsu
National Taiwan University, Taipei
February 18, 2009
Office: R512, CSIE Building
Communication and Multimedia Lab (通訊與多媒體實驗室)
http://www.csie.ntu.edu.tw/~winston
Outline
  Introduction
  Lecture style
  Assessments
  Logistic issues
aMMAI, Spring 2009 – Winston Hsu
The Multimedia Analysis and Search Problem
[Figure: a (multi-modal) query is run against a media repository (produced video corpora, personal media, online media shares, digital life records) and returns results. Example query: “Topic177 - Find shots of a daytime demonstration or protest with at least part of one building visible”]
  Explosive growth of image and video content on Web, broadcasts, personal media,…
–  5B images on Web
–  31M hours of TV per year
–  200K YouTube uploads / day (~73M video uploads / year)
  Growing deluge requires more effective solutions for searching & browsing digital media
–  Exciting opportunity!
Video (photo) search can (or needs to) be improved by
integrating content, context, and semantic
representations!!
Enormous (but noisy) multimedia data are publicly
available and promising for many applications!!
RECAP – Multimedia Analysis and Indexing (MMAI), Fall
2007/2008
  Preliminarily understanding the design and implementation of search,
classification, clustering for multimedia (including images, videos, and
music)
  Understanding basic statistical tools for high-dimensional and large-scale (multimedia) data analysis
  Evaluating the performance of multimedia systems
  Identifying current research problems in multimedia analysis and
retrieval
RECAP – Multimedia Analysis and Indexing (MMAI), Fall
2007/2008
[Figure: overview of MMAI. Legible labels: Data; Signal Processing, Machine Learning, Information Retrieval; Content-Based Retrieval, Summarization, Analysis/Filtering/Mining (etc.); Semantic and Event Indexing; Concept Detection; Applications; Advanced Researches (aMMAI).]
Course Goals for aMMAI
  Extending the breadth and depth of essential technical components
for MMAI in feature representations and learning
  Gaining practical experience through assignments and experiments
  Practicing paper critiques, summarization, and presentations
Expected Audience
  Students enthusiastic about the research topics in multimedia
analysis and indexing
–  Weekly participation required
–  Critiquing and summarizing two papers weekly
  Students with a preliminary understanding of related disciplines such as
MMAI, machine learning, pattern recognition, computer vision, etc.
Example – Selective among Rich Learning Methods
[Figure: map of learning methods. Axis labels: Context (context-insensitive, spatial-temporal, multi-modal, multi-concept) and Manual Supervision (unsupervised, unsupervised with side information, semi-supervised, supervised learning, active learning). Methods and tasks shown: temporal mining; window-based; discriminative (CRF, …); generative (HMM, …); ranking; multi-view; joint text-image; multi-instance; clustering; ZoneTag; multimodal fusion; image annotation; object recognition; multi-concept modeling; brain signal; speech; cross-domain; standard model (generative, discriminative); gaming; browsing; tagging.]
Course Style
  Overviews of image representation, feature extraction, and search
methods (1 week)
  Student participation
–  Collectively review, critique, and experiment with a set of selected papers
–  Each student will be assigned one (or two) paper(s) and will summarize the
technical content as well as related developments in the field (2-3 students/week)
  Providing image/video data sets, features, and associated metadata
(such as transcripts) for (3-4) assignments in this class.
Assessment
  10% Class participation (interactions)
–  Passive and active Q&A
  30% Paper critiques and summaries (weekly)
–  Graded by TA, lecturer, and students
  30% Oral presentations
–  presentation
–  slides
–  sample code/data sets (strongly recommended)
  30% Assignments
  Bonus (final reports or projects, optional)
Assessment (cont.)
  Paper critiques & summarization
–  Creating blogs or web pages
–  Posting paper readings before 12 pm on the lecture day, including
•  Novelties, contributions, assumptions
•  Questions and promising applications
•  Technical summaries
–  Lecturer & TA will grade the posts
–  See others’ paper interpretations as well
–  At the end, each student marks the top 5 students; overall critique scores are
rated by the “PageRank” algorithm.
–  Examples of paper critiques will be given on the course web page.
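The peer-rating step above can be read as a vote graph: each student points to the classmates whose critiques they rate highest, and PageRank turns those votes into scores. The following is a minimal sketch under assumptions not stated in the slides: the damping factor of 0.85, the function name `pagerank`, and the toy vote data are illustrative only, and the course's actual scoring procedure may differ.

```python
def pagerank(votes, damping=0.85, iterations=100):
    """Score the nodes of a vote graph with power-iteration PageRank.

    votes maps each student to the list of students they voted for.
    The damping value and iteration count are assumptions, not course policy.
    """
    students = sorted(set(votes) | {s for targets in votes.values() for s in targets})
    n = len(students)
    rank = {s: 1.0 / n for s in students}
    for _ in range(iterations):
        # Every node first receives the uniform "teleport" share.
        new_rank = {s: (1.0 - damping) / n for s in students}
        for voter in students:
            targets = votes.get(voter, [])
            if targets:
                # A voter splits their current score evenly among their votes.
                share = damping * rank[voter] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A student who cast no votes spreads their weight evenly.
                share = damping * rank[voter] / n
                for s in students:
                    new_rank[s] += share
        rank = new_rank
    return rank


# Hypothetical votes (fewer than five per student to keep the example small).
example_votes = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["carol", "alice"],
}
scores = pagerank(example_votes)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.3f}")
```

Compared with simply counting votes, this weighting makes a vote from a highly rated student count for more than a vote from a student nobody selected.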
(Tentative) Assignments
  3 or 4 practical implementations based on MATLAB or C (C++)
  Benchmark data provided
  Essential techniques for advanced research
  Cornerstones to promote research capabilities
  Topics
–  Feature reduction – PCA (Eigenface)
–  Probabilistic Latent Semantic Analysis (pLSA)/LDA
–  AdaBoost
–  Graph Cut – Segmentation
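For the first topic in the list, the core of PCA (and of Eigenfaces, where each row would be a flattened face image) is finding the direction of largest variance in mean-centred data. A minimal sketch, assuming plain Python lists, hypothetical helper names `first_principal_component` and `project`, and power iteration standing in for the full eigendecomposition that MATLAB or a linear-algebra library would provide:

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of largest variance via power iteration."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(row[j] for row in rows) / n for j in range(dim)]
    centered = [[row[j] - mean[j] for j in range(dim)] for row in rows]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Repeated multiplication by the covariance matrix converges to its
    # dominant eigenvector, provided the start vector is not orthogonal to it.
    vec = [1.0] * dim
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return mean, vec


def project(row, mean, direction):
    """Reduce one sample to a single coordinate along the component."""
    return sum((x - m) * d for x, m, d in zip(row, mean, direction))


# Toy 2-D data lying close to the line y = 2x (illustrative values only).
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
mean, direction = first_principal_component(data)
print("direction:", [round(d, 3) for d in direction])
print("1-D codes:", [round(project(r, mean, direction), 3) for r in data])
```

Keeping only the projection onto the leading component reduces each 2-D sample to one number; an Eigenface implementation would keep the top few components of much higher-dimensional image vectors.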
Topics Covering Techniques for MMAI
week #  date      planning
1       02/18/09  introduction
2       02/25/09  MMAI recap
3       03/04/09  Interest points and local descriptors
4       03/11/09  Dimension Reduction + manifold methods
5       03/18/09  shape and texton representations
6       03/25/09  Latent semantic analysis
7       04/01/09  Efficient indexing methods
8       04/08/09  Annotations for photos and persons
9       04/15/09  mid-term exam. (break)
10      04/22/09  boosting methods (Jiebo Luo's talk)
11      04/29/09  multiple instance and semi-supervised learning
12      05/06/09  Graphical models
13      05/13/09  Variational inference
14      05/20/09  Spectral clustering
15      05/27/09  Frequent itemset and association
16      06/03/09  Network analysis
17      06/10/09  Ranking Methods
18      06/17/09  final exam. (break)
Conventional Content-Based Image Retrieval
[before 1999]
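[Figure: the query image and the images in the Image Database both pass through feature extraction into a feature (vector) space (indexing); a distance metric in that space ranks the database images, which are returned as the retrieved images.]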
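The conventional pipeline in this figure (feature extraction, a feature vector space, and a distance metric that ranks database images against the query) can be illustrated in a few lines. A minimal sketch, assuming tiny grayscale images stored as nested lists, a normalized intensity histogram as the feature, and L1 distance as the metric; the names `histogram`, `l1_distance`, and `retrieve` and the toy images are hypothetical, not taken from the course material:

```python
def histogram(image, bins=4, max_value=256):
    """Normalized intensity histogram of a grayscale image (list of rows)."""
    counts = [0] * bins
    pixels = [p for row in image for p in row]
    for p in pixels:
        counts[min(p * bins // max_value, bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def retrieve(query, database, top_k=2):
    """Rank database images by feature distance to the query image."""
    q = histogram(query)
    ranked = sorted(
        database,
        key=lambda name: l1_distance(q, histogram(database[name])),
    )
    return ranked[:top_k]


# Toy 2x2 "images" with pixel intensities in 0..255 (illustrative only).
database = {
    "dark": [[10, 20], [30, 40]],
    "bright": [[220, 230], [240, 250]],
    "mixed": [[10, 240], [30, 220]],
}
query = [[12, 28], [35, 90]]
print(retrieve(query, database))
```

A practical system would extract the database features once and store them in an index instead of recomputing them per query, which is the role of the "(indexing)" label in the figure.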
Semantic and Content-Based Image Retrieval
[after 1999]
[Figure: the retrieval pipeline (query image, feature extraction, Image Database, feature (vector) space with indexing, distance metric, retrieved images), annotated with the following techniques:]
* Graphical Models, MRF, Variational Inference
* Ranking
* Annotations/Classifications
* Boosting
* Multiple instance and Semi-supervised Learning
* Efficient Indexing
* Network analysis
* Interest points and local descriptors
* shape representations and matching
* Dimension Reduction and manifold methods
* Language Models and Latent Semantic Analysis
* Frequent Itemset and association
* Spectral Clustering
How to Read a Paper
  Keshav, “How to Read a Paper,” CCR, 2007
  3 Phases
How to Write a Paper
  Henning Schulzrinne, “Writing Technical Articles,” http://www.cs.columbia.edu/~hgs/etc/writing-style.html
  And other rich information on his page
Presentation
  These technical papers are fun and useful but require much more
time than you imagine!!
  Using others’ materials is OK, but acknowledge the sources
  Slides, provided examples, and code will be collected for other
students’ reference
  Tips for presentations (see Henning’s page)
Logistic Issues
  Rule – “deliver quality work on time with integrity!!”
  TA
–  TBA
  Course information
–  Readings, homework, slides, etc.
–  http://www.csie.ntu.edu.tw/~winston/courses/ammai/
–  Mailing list:
https://cmlmail.csie.ntu.edu.tw/mailman/listinfo/ammai
or google “ammai, cmlab”
Next Week
  MMAI recap by TA
–  Video feature representations, shot segmentation
–  Image feature representations, content-based image retrieval
–  Basic mathematics tools
•  Probability 101, Entropy, Mutual Information, etc.
  Paper critique and summaries (due next week)
–  How to read technical papers
–  How to deliver research presentations
–  “Image Retrieval: Ideas, Influences, and Trends of the New Age,” Datta,
2008 (comprehensive and long)
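As a small warm-up for the basic mathematics tools listed in the recap above, entropy and mutual information can be computed directly from probability tables. A minimal sketch, assuming discrete variables, base-2 logarithms, and illustrative function names and toy distributions that are not part of the course material:

```python
import math


def entropy(probs):
    """Shannon entropy H(X) in bits; zero-probability outcomes contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """I(X;Y) in bits from a joint table joint[x][y] whose entries sum to 1."""
    px = [sum(row) for row in joint]          # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (column sums)
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )


fair_coin = [0.5, 0.5]
independent = [[0.25, 0.25], [0.25, 0.25]]   # X and Y are unrelated
identical = [[0.5, 0.0], [0.0, 0.5]]         # Y is a copy of X

print(entropy(fair_coin))
print(mutual_information(independent))
print(mutual_information(identical))
```

The two joint tables mark the extremes: independent variables share no information, while a variable shares all of its entropy with an exact copy of itself.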
:: backup slides::
Semantic Gap
[Figure: the semantic gap, shown as levels of description from the end goal down to what we have to work with.]
Semantic Richness (end goal): Photo with cheering crowds, taken on July 29, 2006, during the Hot Air Balloon Festival in New Jersey, USA
Content Descriptors: sky, hot air balloon, crowds; object segmentation, regions
Image-level Descriptors: pixel intensity, texture, color histogram, date, etc.
Raw Media (to work with)
