COMPUTER VISION

MEI/1
University of Beira Interior, Department of Informatics
Hugo Pedro Proença, 2014/2015
Class Time & Etc
Timetable (Mon. to Fri., 8:00 to 19:00)
-  Theoretical classes: Room 6.17
-  Practical classes: Room 6.19
-  Office: 4.11; Bloco 6
-  Office Hour: Wednesdays, 14:00 – 16:00
-  Course URL: http://www.di.ubi.pt/~hugomcp/visaoComp
- Marks, Announcements, Exercises, Data sets
Evaluation Criteria
-  Attendance (A)
   -  Passing the course requires attending at least 80% of the classes.
-  Practical Work (P)
   -  Practical work contributes 12 points to the final course grade.
   -  Passing the course requires a minimum grade of 6 points (6/20) in the practical work.
   -  P1 submission (5 points): 20 March 2015, 23:59, via email.
   -  P2 submission (5 points): 24 April 2015, 23:59, via email.
   -  Presentations (A): part of the practical classes is devoted to student presentations of selected scientific papers (minimum: 2 papers per student) (2 points).
-  Written Test (Frequência)
   -  Test (F1): Tuesday, 26 May 2015, 11:00-13:00, Room 6.17
-  Teaching/Learning Grade (C)
   -  C = P1*5/20 + P2*5/20 + A + F*8/20
-  Admission to Exam
   -  Students with a minimum grade of 6 points in the Teaching/Learning component are admitted to the exam.
-  Exams
   -  The practical work grade is always counted towards the final grade.
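As a worked example of the formula C = P1*5/20 + P2*5/20 + A + F*8/20, the sketch below computes the grade of a hypothetical student. It assumes that P1, P2 and F are marked on a 0-20 scale and that A is the presentation component, already expressed in points (0 to 2); the function names and the sample marks are illustrative only.

```python
def teaching_learning_grade(p1, p2, a, f):
    """Return C = P1*5/20 + P2*5/20 + A + F*8/20.

    Assumption: p1, p2 and f are marks on a 0-20 scale, while a
    (presentations) is already expressed in points, from 0 to 2.
    """
    return p1 * 5 / 20 + p2 * 5 / 20 + a + f * 8 / 20


def admitted_to_exam(c, minimum=6.0):
    """Admission to the exam requires at least 6 points in C."""
    return c >= minimum


# Hypothetical student: 14 and 16 in the two projects,
# 1.5 points for presentations and 12 in the written test.
c = teaching_learning_grade(p1=14, p2=16, a=1.5, f=12)
print(round(c, 2), admitted_to_exam(c))
```

With full marks in every component the four terms add up to 5 + 5 + 2 + 8 = 20 points.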
Course Projects
-  Part I: Head Detection
-  Part II: Object Tracking
-  3.5 + 3.5 values = 7
Requirements
-  To succeed in this course, it is strongly suggested that students have the following skills and knowledge:
-  Programming experience in a structured language:
   -  Functions
   -  Parameters
   -  Iterative Blocks
   -  Conditional Blocks
-  MATLAB will be mainly used during the practical classes.
-  Elementary notions about:
   -  Linear Algebra
   -  Probability and Statistics
   -  Geometry
   -  Artificial Intelligence
Course Summary
-  Introduction
   -  Computer Vision (CV). What is it?
   -  Goals of CV: Why are they so hard?
   -  Applications of CV
   -  Biological Perspective of Vision
-  Cameras and Images
   -  Optics
   -  Digital Images
   -  Sampling
   -  Calibration
Course Summary
-  Filtering
   -  Convolution, Correlation
   -  Spatial and Frequency Domains
   -  Fourier Transform
   -  Noise Sources
   -  Gaussian Noise
   -  Impulse Noise
   -  Median Filter
   -  Gaussian Filter
-  Image Representation
   -  Features
   -  Image Derivatives
   -  First Derivatives: Edges
   -  Sobel Detector
   -  Canny Detector
Course Summary
-  Shape Detection
   -  Hough Transform
-  Data Descriptors
   -  Color
   -  Texture
   -  Shape
   -  Motion
   -  Optical Flow
   -  Clustering
   -  Stereo Vision
-  Classifiers
   -  Probabilistic Models
   -  Detection, Segmentation and Recognition
Books, Textbooks
-  Main
   -  David A. Forsyth and Jean Ponce; Computer Vision: A Modern Approach, Prentice-Hall, 2002.
   -  Dana Ballard and Chris Brown; Computer Vision, Online.
   -  J. R. Parker; Algorithms for Image Processing and Computer Vision, Wiley, 1995.
-  Complementary
   -  Torras, C.; Computer Vision, Theory and Industrial Applications, New York, Springer, 1992.
   -  Davies, E.R.; Machine Vision: Theory, Algorithms, Practicalities, Third Edition, Morgan Kaufmann, 2005.
   -  List of on-line books: http://homepages.inf.ed.ac.uk/rbf/CVonline/books.htm#online
Links (cont.)
-  Compendium about Computer Vision: http://homepages.inf.ed.ac.uk/rbf/CVonline/
-  Dictionary of Computer Vision and Image Processing: http://homepages.inf.ed.ac.uk/rbf/CVDICT/
-  Vision research groups: http://www.cs.cmu.edu/~cil/v-groups.html
-  C / C++ library for image/video processing: http://cimg.sourceforge.net/
-  MATLAB help: http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.shtml
Computer Vision? What is It?
-  Trucco and Verri: “Computing properties of the 3-D world from one or more digital images”.
-  Stockman and Shapiro: “To make useful decisions about real physical objects and scenes based on sensed images”.
-  Ballard and Brown: “The construction of explicit, meaningful description of physical objects from images”.
-  Forsyth and Ponce: “Extracting descriptions of the world from pictures or sequences of pictures”.
-  English Dictionary: “The use of digital computer techniques to extract, characterize, and interpret information in visual images of a three-dimensional world”.
-  Wikipedia: “Computer vision is the science and technology of machines that see. As a scientific discipline, computer vision is concerned with the theory for building artificial systems that obtain information from images”.
Computer Vision? What is It?
-  It can be considered a full Artificial Intelligence problem.
-  As such, it is possible to regard it “simply” as a “signal-to-symbol” converter.
   -  As opposed to “Computer Graphics”, which can be regarded as a “symbol-to-signal” converter.
-  [Diagram: Perception, Computer Vision, Symbol(s)]
Computer Vision? What is It?
-  Intersects a broad range of disciplines:
   -  Optics
   -  Robotics
   -  Image Processing
   -  Pattern Recognition
Optics
-  Is related to the biological sensing process of vision (the way light is handled by the human brain).
-  Describes the behavior of light and its interaction with matter.
-  Three main types of light can be identified, taking the visible wavelengths as reference: infrared, visible and ultraviolet.
-  However, since light is radiation, similar phenomena occur in X-rays, microwaves, radio waves or any other type of radiation (interaction between charged particles).
Image Processing
-  A form of signal processing whose input is a two-dimensional signal.
-  Transforms the original data in order to make the later interpretation phases easier:
   -  Geometric transforms (scale, rotation, translation, affine and projective transforms)
   -  Color or intensity adjustment
   -  Data / region reconstruction
   -  Data registration
   -  Detection
   -  Segmentation
   -  Recognition
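The geometric transforms listed above (scale, rotation, translation) can all be written as one 3x3 matrix in homogeneous coordinates. The sketch below is a minimal pure-Python illustration of that idea applied to a single 2D point; the function names and the sample values are hypothetical, and a real image warp would also need interpolation of pixel values.

```python
import math


def affine_matrix(scale=1.0, angle_deg=0.0, tx=0.0, ty=0.0):
    """Build a 3x3 homogeneous matrix: uniform scale, rotation, then translation."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [
        [scale * c, -scale * s, tx],
        [scale * s, scale * c, ty],
        [0.0, 0.0, 1.0],
    ]


def apply_transform(matrix, point):
    """Apply a 3x3 homogeneous transform to a 2D point (x, y)."""
    x, y = point
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    # For affine matrices w is always 1; dividing by w also covers
    # projective matrices, whose last row is not (0, 0, 1).
    return (xh / w, yh / w)


# Scale by 2, rotate by 90 degrees, then translate by (3, 1).
m = affine_matrix(scale=2.0, angle_deg=90.0, tx=3.0, ty=1.0)
moved = apply_transform(m, (1.0, 0.0))
print(moved)  # approximately (3.0, 3.0), up to floating-point rounding
```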
Pattern Recognition
-  It is often considered the core of the vision system.
-  Pattern Recognition aims at classifying / labelling the input data.
-  There are, typically, 3 variants:
   -  Statistical
   -  Structural
   -  Neural
Robotics
-  Domain of knowledge that involves the planning and development of physical automata (robots).
-  It intersects the areas of electronic engineering, mechanics, computer science and cybernetics.
-  Even though a precise definition is hard to find, a robot is a machine that:
   -  Has sensory abilities.
   -  Has the ability to act on the environment and change its state.
Computer Vision: Applications
-  Biometric Recognition
   -  Iris, face, gait, …
-  Autonomous Driving, Navigation
   -  “Google car”
   -  Stanford University
-  Medical Diagnosis
-  Robotic Production / Inspection Systems
   -  NASA’s autonomous walker
   -  Subsea 7 inspector
-  Automatic Character Recognition (OCR)
   -  NOKIA multi-scanner
-  Surveillance / Security Systems
   -  Current state
-  Defense Systems (Ballistics)
   -  China Dongfeng 21D with in-flight autonomous updates
Computer Vision: Research
-  One of the most active domains of knowledge in the Computer Science area.
-  It is still at an early stage of development, as there are no autonomous, generic systems with vision abilities close to those of a human being.
-  We know that this type of vision problem can be solved, as humans have been solving it for thousands of years.
-  However:
   -  How to represent knowledge?
   -  What inference mechanisms should be created?
   -  What is intelligence?
Computer Vision: Research
-  International Conferences
   -  ICCV: International Conference on Computer Vision
   -  CVPR: Computer Vision and Pattern Recognition International Conference
   -  ICPR: International Conference on Pattern Recognition
-  International Journals
   -  Elsevier Image and Vision Computing
   -  Elsevier Computer Vision and Image Understanding
   -  IEEE Transactions on Image Processing
-  Hundreds of Research Groups
   -  Academic: MIT, Stanford, Cambridge, UCLA, ...
   -  Commercial: Microsoft, IBM, Honda, Sarnoff, Panasonic, ...
   -  http://www.cs.cmu.edu/~cil/v-groups.html
Computer Vision: Cohesive Perspective
3D World
Semantic Information
Data Acquisition
Data Processing
Computer Vision: Typical Stages
-  Pre-Processing
-  Detection
-  Segmentation
-  Normalization
-  Encoding
011010001010101010010101010
010101000101010101010101010
101010101010010101010101010
000100101010101010100010011
010100010101010010101010101
110101001010100101010010101
-  Matching: the encoded data above is compared with a second code:
11010101010101001010101010100
00010101010101010101010101010
01010101000101010101010101010
01010101000001011110001010101
01010000101010010101001010100
10101010010101010010101010100
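One common way to match two binary codes is the Hamming distance, that is, the number of positions at which the bits differ. The slides do not state which measure is used, so the sketch below is an assumption; the short codes and the labels in `gallery` are hypothetical and only illustrate the matching step.

```python
def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))


def best_match(probe, gallery):
    """Return the label of the stored code closest to the probe code."""
    return min(gallery, key=lambda label: hamming_distance(probe, gallery[label]))


# Hypothetical 10-bit codes; real codes are usually much longer.
probe = "0110100010"
gallery = {
    "deer": "0110100110",
    "car": "1001011101",
}

for label, code in gallery.items():
    print(label, hamming_distance(probe, code))
print("best match:", best_match(probe, gallery))
```

Dividing the distance by the code length gives a dissimilarity between 0 and 1, which makes codes of different lengths comparable across systems.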
-  Classification: “It’s a deer!”
Vision: Illusions
-  What is the relationship between the colors of regions “A” and “B”?
-  What kind of motion does this figure have?
-  Classic M.C. Escher illusions
Vision: Why is it so Hard?
-  Most vision problems are ill-posed, as opposed to well-posed problems.
-  Hadamard defined well-posed mathematical models as those with the following properties:
   -  There is a solution;
   -  The solution is unique;
   -  The solution depends continuously on the data.
-  Even without considering other factors, two things make most vision problems ill-posed:
   -  The representation of the 3D world by 2D data.
   -  The variation / noise associated with data acquisition.
-  As such, errors are expected (and should simply be minimized).
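The effect of noise on a badly conditioned problem can be seen in a very small example. The sketch below solves a nearly singular 2x2 linear system twice, once with clean data and once with one measurement changed by 0.0001; the matrix and the values are hypothetical and chosen only to make the instability visible.

```python
def solve_2x2(a, b):
    """Solve a*x = b for a 2x2 matrix a using Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    y = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return x, y


# The two rows are almost identical, so the system is nearly singular.
a = [[1.0, 1.0],
     [1.0, 1.0001]]

clean = solve_2x2(a, [2.0, 2.0001])  # close to (1, 1)
noisy = solve_2x2(a, [2.0, 2.0002])  # close to (0, 2)

print("clean data:", clean)
print("noisy data:", noisy)
```

A change of 0.0001 in the data moves the solution by about 1 in each coordinate, so the solution is extremely sensitive to the data, which is the practical symptom of an ill-posed problem.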
Vision: Why is it so Hard?
-  Representing 3D data in two dimensions introduces many ambiguities into the represented data.
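One such ambiguity can be sketched with an idealized pinhole camera, in which a 3D point (X, Y, Z) is projected to (f*X/Z, f*Y/Z). The model, the focal length and the sample points below are assumptions for illustration: every point along the same ray through the camera centre lands on the same pixel, so depth cannot be recovered from a single image.

```python
def project(point_3d, focal_length=1.0):
    """Perspective projection of a 3D point onto the image plane."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("the point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same ray through the camera centre.
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)

print(project(near))
print(project(far))
print("same image point:", project(near) == project(far))
```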
Vision: Why is it so Hard?
-  Ambiguities: