Smooth generic camera calibration for optical metrology – the concept
Alexey Pak
Fraunhofer Institute of Optronics, System Technologies and Image Processing (IOSB)
Vision and Fusion Lab (IES), Informatics Department, Karlsruhe Institute of Technology KIT
Alexey Pak, Fraunhofer IOSB / IES Chair, KIT Karlsruhe,
18.09.2015
Digital camera as a metrological instrument
• Modern cameras are fast, inexpensive, use non-coherent light
• Typical parameters: 3 colors @ 256 intensity levels (8-bit value resolution), O(10^3) lines / field of view of O(10) degrees (angular resolution of O(10^-4) rad)
• Shape measurement: laser triangulation, deflectometry, shape-from-X, …
• Modern methods aim at high precision (down to O(10) nm), use complex optical
schemes, require non-trivial data processing
Need metrological-quality camera model and adequate calibration tools!
[Figure: measurement setup – camera, illumination, studied object]
Types of camera calibration
acc. to Hanning ’11
Photometric calibration
• Goal: find relation between the intensity of light incident on a
sensor element and the output (digital) pixel value
• EMVA 1288 standard, calibrated once by manufacturer
• Not considered in this talk
[Figure: image formation (left) and mathematical camera model (right); source: EMVA 1288]
Extrinsic geometrical calibration
• Goal: establish the camera position and orientation with respect to the global system of coordinates (SoC)
• 6 parameters: SoC origin and three rotation angles
Intrinsic geometrical calibration
• Goal: characterize the imaging geometry in the camera’s own SoC
• No “standard” camera model, always model-dependent parameters
Note:
• Extrinsic and intrinsic calibration are not independent
• Both need to be performed after any camera/lens adjustment
Ideal geometrical calibration: universal, stable, non-degenerate, unbiased model; simple
calibration procedure; clear characterization of the resulting uncertainty.
Generic camera model (assume geometrical optics)
The imaging model in the camera's SoC may be specified in terms of two mappings:

Direct mapping (3D point to sensor point): an observed object point in real 3D space is projected to the point p = (u, v)^T in the 2D sensor space (within the sensor limits).

Inverse mapping (sensor point to set of 3D points): a sensor point p = (u, v)^T is mapped to a view ray {o, r} in real 3D space; all objects on the ray {o, r} project to the point π on the sensor.
Generic camera model (further details)
An image is produced by sampling the sensor space at discrete points p_i, i = 1, …, N.

• The sensor-space parameterization, grid geometry, and the region boundaries may be chosen arbitrarily (e.g., (u, v) ∈ [0, 1]^2)
• It makes sense to exploit the natural (physical) sensor continuity and layout

Images are always blurred (i.e. have finite sharpness):
• On the sensor, one observes some distribution of intensity around p = (u, v)^T; π is understood as its central point
• In 3D, the corresponding light field (beam) has some spatial distribution; the ray {o, r} is its central axis
Global and local ambiguities in camera models
The choice of the camera's SoC is arbitrary:
• A rotation by ΔR and a translation by Δt yield new intrinsic parameters that describe the same camera
• A 6-dimensional global symmetry

Ray specification via direction and origin is ambiguous: the ray {o, r} describes the same line as {o + α·r, β·r} for any α, β ∈ ℝ (β ≠ 0)
• The coefficients α, β may be arbitrary functions of u, v
• An ∞-dimensional local symmetry

Any calibration model needs a recipe to fix those freedoms!
What kind of data can be used for calibration?
e.g. Zhang ‘00
OpenCV calibration: de-facto standard in Computer Vision
• Produce images of a flat calibration pattern from several different camera poses
• Pattern contains a set of recognizable features at known (x, y, z) locations
[Figure: several (a priori unknown) camera poses and the corresponding camera images; easily performed with a printed pattern]
• Extract features: (u, v) positions corresponding to known (x, y, z) points
• Only a sparse set of features can be extracted
• Accuracy of feature extraction is typically unknown and algorithm-dependent
Is there anything better?
What kind of data can be used for calibration?
e.g. Sturm, Ramalingam ‘03
Phase-shift coding, active patterns, …:
• From several different camera poses, produce images of a sequence of calibration patterns displayed on e.g. an LCD screen
• The sequence of values encodes the (x, y, z) position of each point
• Flat screens: stable, accurate modulation in O(10^6) points over O(1 m^2), O(10^3) intensity levels

[Figure: several (a priori unknown) camera poses and the corresponding sequences of camera images; performed with inexpensive flat screens and stable camera mounts]

• Decoding: for each camera pixel (u, v), recover the corresponding (x, y, z)
• Dense field of decoded data; the uncertainty can be accurately quantified
• A proper choice of patterns makes the method robust against blur and distortions
• Already used in metrology (cf. deflectometry, pattern projection)
Pattern decoding and uncertainty quantification (1)
Cosine phase-shifted active patterns (i – pattern index, A – screen brightness, B – screen contrast, T – spatial period), i = 0, …, N−1.

Displayed gray values at some position x:

  g_i(x) = A + B·cos[φ(x) + ψ_i],  where  φ(x) = 2πx/T  and  ψ_i = 2πi/N

The camera at some of its pixels y observes (allowing a constant linear transform with unknown parameters C and D, local to pixel y):

  v_i(y) = C + D·g_i(x)

Decoding: recover x from the observed gray values:

  a = Σ_{i=0}^{N−1} v_i·sin ψ_i,  b = Σ_{i=0}^{N−1} v_i·cos ψ_i,  tan[φ(x)] = −b/a,

  x = (T/2π)·tan⁻¹[−b/a] + m·T,  m ∈ ℤ

x is recovered modulo the spatial period T.
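The decoding formulas above can be checked numerically. A minimal sketch (the function name `decode_phase` and the synthetic parameter values are illustrative, not from the talk):

```python
import numpy as np

def decode_phase(v, T):
    """Recover the screen position x (modulo T) from N phase-shifted samples.

    v holds the observed values v_i = C + D*(A + B*cos(2*pi*x/T + psi_i))
    with psi_i = 2*pi*i/N.  Returns (x mod T, effective contrast B*).
    """
    N = len(v)
    psi = 2 * np.pi * np.arange(N) / N
    a = np.sum(v * np.sin(psi))
    b = np.sum(v * np.cos(psi))
    # With psi_i = 2*pi*i/N one finds a = -(N/2)*D*B*sin(phi) and
    # b = (N/2)*D*B*cos(phi); the slide's tan(phi) = -b/a corresponds to an
    # equivalent convention with the opposite phase-shift sign.
    phi = np.arctan2(-a, b)
    B_eff = 2.0 * np.hypot(a, b) / N      # effective contrast B* = D*B
    return (phi * T / (2 * np.pi)) % T, B_eff

# Synthetic check: encode a known position, then decode it back.
T, N = 10.0, 8
x_true, A, B, C, D = 3.7, 128.0, 100.0, 5.0, 0.4
psi = 2 * np.pi * np.arange(N) / N
v = C + D * (A + B * np.cos(2 * np.pi * x_true / T + psi))
x_dec, B_eff = decode_phase(v, T)
```

Note that the constant offsets A, C and the gains B, D all drop out of the recovered phase; they only affect the effective contrast.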
Pattern decoding and uncertainty quantification (2)
Cosine phase-shifted active patterns (i – pattern index, A – screen brightness, B – screen contrast, T – spatial period): each decoding yields all possible values of x consistent with the recovered phase (one candidate per period).

Disambiguation of the recovered values: use multiple spatial frequencies
• Typical approach: merge the data by finding the closest decoded positions corresponding to different frequencies (cf. the ambiguity in multi-wavelength interferometry)

Can we do better?
Pattern decoding and uncertainty quantification (3)
Cosine phase-shifted active patterns (i – pattern index, A – screen brightness, B – screen contrast, T – spatial period).

Decoding: recover the position x from the observed gray values at camera pixel y:

  a = Σ_{i=0}^{N−1} v_i·sin ψ_i,  b = Σ_{i=0}^{N−1} v_i·cos ψ_i,  x = (T/2π)·tan⁻¹[−b/a] + m·T

May also determine:
• Effective contrast at camera pixel y:  B* = (2/N)·√(a² + b²). If the contrast is small, expect poor decoding.
• Error in the recovered position x:  δx = (T/2π)·√(2/N)·(δv/B*). We expect that the camera pixel values are obtained with some a priori known uncertainty δv (cf. EMVA 1288, photometric calibration; at least 1/256).
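The predicted uncertainty δx can be verified with a small Monte-Carlo experiment; this sketch (all parameter values illustrative) compares the empirical scatter of decoded positions against the formula above:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 10.0, 8
x_true, A, B, C, D = 3.7, 128.0, 100.0, 5.0, 0.4
dv = 1.0                                   # a priori pixel-value uncertainty

psi = 2 * np.pi * np.arange(N) / N
clean = C + D * (A + B * np.cos(2 * np.pi * x_true / T + psi))

xs = []
for _ in range(2000):
    v = clean + rng.normal(0.0, dv, N)     # noisy camera observations
    a = np.sum(v * np.sin(psi))
    b = np.sum(v * np.cos(psi))
    xs.append((np.arctan2(-a, b) * T / (2 * np.pi)) % T)
xs = np.asarray(xs)

B_eff = D * B                              # effective contrast B*
dx_pred = (T / (2 * np.pi)) * np.sqrt(2.0 / N) * dv / B_eff
dx_emp = xs.std()                          # empirical scatter of decoded x
</ul>```

The empirical standard deviation should reproduce the prediction δx = (T/2π)·√(2/N)·(δv/B*) to within sampling noise.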
Pattern decoding and uncertainty quantification (4)
Cosine phase-shifted active patterns:
• Pattern sequence modulated by spatial frequency f1 (period T1): model the probability distribution dp1/dx over x as a Gaussian mixture, one component of width δx1 per period
• Pattern sequence modulated by spatial frequency f2 (period T2): analogous Gaussian mixture dp2/dx with component width δx2

Merge the distributions by multiplying them (Bayes' posterior PDF):  dp/dx ∝ (dp1/dx)·(dp2/dx)
• Resulting x: position of the highest-weight peak
• Uncertainty of x: width of the highest peak
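The Gaussian-mixture merge above can be sketched for two pattern periods (the grid, periods, and widths are illustrative):

```python
import numpy as np

def mixture(x_grid, x_mod, T, sigma, x_max):
    """Gaussian mixture over the candidate positions x_mod + m*T in [0, x_max]."""
    p = np.zeros_like(x_grid)
    for m in range(int(np.ceil(x_max / T)) + 1):
        mu = x_mod + m * T
        p += np.exp(-0.5 * ((x_grid - mu) / sigma) ** 2)
    return p

x_max = 100.0
x_grid = np.linspace(0.0, x_max, 20001)

# Decoded positions (modulo T) for two pattern periods, here for true x = 37.3
x_true = 37.3
T1, T2 = 10.0, 7.0
p1 = mixture(x_grid, x_true % T1, T1, 0.05, x_max)   # dp1/dx
p2 = mixture(x_grid, x_true % T2, T2, 0.08, x_max)   # dp2/dx

post = p1 * p2                      # Bayes: multiply the two PDFs
x_est = x_grid[np.argmax(post)]     # highest-weight peak
```

Only the candidate shared by both mixtures survives the multiplication, which disambiguates x without any explicit nearest-candidate search.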
Pattern decoding and uncertainty quantification (5)
Result of Bayesian decoding: posterior x and δx. What else can we find?
May determine valid (i.e. encoded) points, remove masked or dirty pixels:
• If the final uncertainty δx is larger than some threshold, discard the point
May find the optimal coding uncertainty (= the highest useful frequency):
• Start with low-frequency patterns (large T), gradually increase pattern frequency
• Due to blurring, high-frequency patterns produce lower effective contrast B*
• Once the posterior δx after merging starts growing, stop and report x
May estimate anisotropic error (2D decoding):
• Use independent decoding for x- and y-directions on screen
• May also use many directions in x-y plane, combine into covariance matrix
[Figure: coding screen space S with m = (x, y)^T; camera sensor space with p = (u, v)^T]

May further extend the uncertainty quantification to 3D space.
Pattern decoding and uncertainty quantification (6)
Result of Bayesian decoding: posterior x and δx. What else can we find?
May estimate the Gaussian blurring kernel size (in the screen plane):
• Original pattern (no blurring):  g(x) = A + B·cos[2πx/T + ψ]
• Normalized blurring kernel:  b(x) = (1/(√(2π)·W))·exp[−x²/(2W²)]
• Convolution (blurring) is equivalent to a change of contrast:
  (g ⊗ b)(x) = A + B·cos[2πx/T + ψ]·exp[−2π²W²/T²]

Observe the contrast B* for several pattern periods T, fit the kernel size W.

Blurring can be calibrated independently; not considered in this talk.
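The contrast-attenuation relation gives a direct way to fit W: log B* is linear in 1/T², so an ordinary least-squares line fit recovers the kernel size. A sketch with illustrative values:

```python
import numpy as np

# Simulated effective contrasts B*(T) for several pattern periods T,
# attenuated by a Gaussian blur kernel of (here known) width W_true.
W_true, B0 = 0.8, 100.0
T = np.array([4.0, 6.0, 8.0, 12.0, 16.0])
B_star = B0 * np.exp(-2 * np.pi**2 * W_true**2 / T**2)

# log(B*) = log(B0) - 2*pi^2*W^2 * (1/T^2): linear in 1/T^2.
slope, intercept = np.polyfit(1.0 / T**2, np.log(B_star), 1)
W_fit = np.sqrt(-slope / (2 * np.pi**2))
```

In practice B* comes from the decoding step at each pixel, and the fit can be performed per pixel to map the blur across the sensor.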
Sample registration data
[Figure panels: sample camera image; decoded x-coordinate on screen; decoded y-coordinate on screen; estimated blur kernel; estimated inverse x-uncertainty; estimated inverse y-uncertainty]
Which camera models already exist? (1)
Basic pinhole camera: the simplest model
Hanning ’11
Beyerer, Puente Leon, Frese ‘13
Six parameters to calibrate:
“magnification”, “skewness”,
“central point”
[Figure: imaging plane z = 1 as a proxy of the sensor space; systematic depth-dependent error]
Calibration:
• May use several known points in 3D, few camera poses
• Least-squares regression (bundle adjustment)
PCM summary:
+ Simple model, widely useful in theoretical studies
+ Closed-form, differentiable direct and inverse transforms
+ Fast rendering available
- Not flexible enough to describe realistic cameras
- Cannot describe wide-angle cameras/lenses, catadioptric
devices, imaging systems with multiple projection centers,
etc.
[Figure: calibration point, real view ray, estimated ray, common projection center]
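For reference, the basic pinhole projection can be written in a few lines. This is a sketch with illustrative parameter values; the skewness s and central point (cx, cy) follow the usual intrinsic-matrix convention:

```python
import numpy as np

# Pinhole intrinsics: magnification (fx, fy), skewness s, central point (cx, cy)
fx, fy, s, cx, cy = 800.0, 820.0, 0.5, 320.0, 240.0
K = np.array([[fx, s,  cx],
              [0., fy, cy],
              [0., 0., 1.]])
R = np.eye(3)                       # extrinsics: camera aligned with global SoC
t = np.array([0.0, 0.0, 0.0])

def project(X):
    """Project a 3D point X through the common projection center to (u, v)."""
    Xc = R @ X + t                  # global SoC -> camera SoC
    x = Xc / Xc[2]                  # normalized coordinates on the plane z = 1
    u, v, _ = K @ x
    return u, v

u, v = project(np.array([0.1, -0.2, 2.0]))
```

Every view ray passes through the single projection center, which is exactly the restriction the generic models below remove.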
Which camera models already exist? (2)
Zhang ‘00
Pinhole model with polynomial corrections
Systematic depthdependent error
Distortions limited to the 2D
imaging plane coordinates
Extra 5-8 parameters
Imaging plane z = 1:
proxy of sensor space
Calibration data and procedure (OpenCV library):
• Static calibration pattern, sparse set of point-like features
• Efficient (semi-linear) regression to simultaneously find
intrinsic and extrinsic parameters for each pose
[Figure: calibration point, real view ray, estimated ray, common projection center]
Notes:
• Uncertainty of feature position extraction is assumed uniform
and isotropic in the entire 3D volume
• Errors in estimated distortion parameters lead to systematic
errors in the measurements
Which camera models already exist? (3)
Wei, Ma ‘91
Hanning ‘11
Two-plane camera model
May avoid the depth-dependent error
Hanning’s calibration algorithm:
• Uses static patterns with sparse point-like features
• Semi-linear regression to estimate distortion parameters
Advantages:
+ Flexible, multi-center projection possible
+ May use splines to parameterize distortions
[Figure: two planes z = +1 and z = −1; calibration point, estimated view ray, reference central ray. No common projection center for all rays!]
Drawbacks:
- Relies on global PCM: implicit systematic error
- Implicit regularization of ambiguities (e.g. frame choice)
- Uncertainty of feature position assumed to be uniform and
isotropic (Euclidean distance between points and view rays)
- Blurring effects ignored
- No available reference implementation for tests…
Which camera models already exist? (4)
Sturm, Ramalingam ‘03
Generic camera calibration for metrological applications
• Instead of the camera mappings P_direct and P_inverse, define one large table T_GCC,inverse
• Per each pixel, identify the 3D ray origin and direction (6 parameters)
• The sensor-space position π is completely ignored!
Calibration:
• Dense set of 3D points found for each pose from the displayed
sequence of active coding patterns (only point positions)
• Simultaneously find intrinsic and extrinsic parameters
• Per-pixel solution in closed form + optimization
Summary:
+ Very general, non-parametric, arbitrary distortions
- Accuracy of coding points assumed uniform and isotropic
- No continuity of sensor space; inter-pixel values undefined
- No fast method to project 3D points back to sensor (render images)
- Infinite sharpness assumed
- Very CPU-intensive (regression problem solved for each pixel)
Proposal: smooth generic camera calibration (1)
Generic camera calibration + smooth model parameterization
• Imaging with modern cameras is locally very smooth
• No global pinhole camera is assumed
• Fit camera projection mapping as a combination of smooth kernels ϕ chosen to
account for the local smoothness of the sensor (FEM-style solution)
• Minimize the global ray consistency metric defined wrt 3D registration errors
• For each decoded calibration point (with its respective 3D uncertainty ellipsoid), find the “nearest” point on the corresponding single view ray
• The single-ray consistency metric measures the mismatch between the ray and the decoded point; an efficient closed-form solution exists!

[Figure: single view ray; decoded calibration point with its 3D uncertainty ellipsoid near screen S]
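The single-ray metric's formulas did not survive the transcription; the following numpy sketch assumes it is a Mahalanobis-weighted distance between the decoded point and its nearest point on the view ray (function and variable names are illustrative):

```python
import numpy as np

def ray_consistency(o, r, m, Sigma):
    """Consistency of the view ray {o, r} with a decoded point m.

    Sigma is the 3x3 covariance of the decoded point (its 3D uncertainty
    ellipsoid).  Assumption: the metric is the Mahalanobis distance between
    m and the closed-form nearest point on the ray.
    """
    r = r / np.linalg.norm(r)
    x_star = o + r * np.dot(r, m - o)      # nearest ray point (closed form)
    d = x_star - m
    return float(d @ np.linalg.solve(Sigma, d))

o = np.array([0.0, 0.0, 0.0])              # ray origin
r = np.array([0.0, 0.0, 1.0])              # ray direction
m = np.array([0.3, 0.0, 5.0])              # decoded point, 0.3 off the ray
Sigma = np.diag([0.01, 0.01, 0.04])        # 3D uncertainty ellipsoid
delta = ray_consistency(o, r, m, Sigma)
```

Weighting by Σ⁻¹ is what lets anisotropic decoding uncertainties enter the fit instead of a plain Euclidean point-to-ray distance.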
Proposal: smooth generic camera calibration (2)
Generic camera calibration + smooth model parameterization
• Fit camera projection mapping as a combination of smooth kernels ϕ chosen to
account for the local smoothness of the sensor (FEM-style solution)
• Minimize the global ray consistency metric defined wrt 3D registration errors
• Single-ray consistency metric → global consistency metric Δ, accumulated over all rays and poses
• All components in Δ are known analytically and can be efficiently differentiated
• Can minimize Δ using e.g. the Levenberg-Marquardt algorithm

sGCC calibration:  C* = argmin_C Δ  – a non-linear least-squares problem over the common vector C of all intrinsic and extrinsic parameters
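The optimization step can be illustrated on a toy non-linear least-squares problem. This hand-rolled Levenberg-Marquardt loop (a sketch, not the talk's implementation) shows the damped normal-equations update; the real sGCC problem minimizes Δ over the parameter vector C in the same way:

```python
import numpy as np

# Toy problem: fit an exponential decay c0 * exp(-c1 * t) to measurements.
t = np.linspace(0.0, 4.0, 30)
y = 2.5 * np.exp(-1.3 * t)                 # synthetic "measurements"

def residuals(c):
    return c[0] * np.exp(-c[1] * t) - y

def jacobian(c):
    e = np.exp(-c[1] * t)
    return np.stack([e, -c[0] * t * e], axis=1)

c, lam = np.array([1.0, 0.5]), 1e-3        # initial guess, damping factor
for _ in range(50):
    r, J = residuals(c), jacobian(c)
    H = J.T @ J + lam * np.eye(2)          # damped normal equations
    step = np.linalg.solve(H, -J.T @ r)
    if np.sum(residuals(c + step) ** 2) < np.sum(r ** 2):
        c, lam = c + step, lam * 0.5       # accept step, reduce damping
    else:
        lam *= 10.0                        # reject step, increase damping
```

Because all components of Δ are analytically differentiable, the Jacobian in the real problem can likewise be supplied in closed form instead of by finite differences.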
Proposal: smooth generic camera calibration (3)
Generic camera calibration + smooth model parameterization
Regularization conditions: explicit recipe to fix calibration symmetries
• Unconstrained symmetries manifest themselves as extremely large eigenvalues of
covariance matrix that results from the optimization step
• May introduce different regularization conditions depending on the problem
• For example, for nearly-pinhole cameras with a narrow field of view one may use:

  o_z(u, v) = 0,  r_z(u, v) = 1

  (fixes the local re-scaling freedom for r and the freedom for o to move along r: o lies in the plane z = 0 and r points towards the plane z = 1)

  o_x(0, 0) = o_y(0, 0) = 0,  r_x(0, 0) = r_y(0, 0) = 0,  ∂r_y/∂u(0, 0) = ∂o_x/∂u(0, 0) = 0

  (six conditions on the six global degrees of freedom: the choice of the camera's SoC)

These constraints may be efficiently implemented as linear conditions on the optimization parameters.
(Preliminary) simulation results
• Mitsuba physically-accurate renderer, 7 camera poses, 40 patterns
• Finite-aperture perspective camera (generic camera rendering not available yet)
• Complete toolchain with decoding and Levenberg-Marquardt optimization
• Finite elements: uniform cubic B-splines on a 10×10 mesh, 20 iterations, 20 min

[Figure: final sGCC error metric (pose 5); final absolute re-projection error (Euclidean distance); resulting functions r_x, r_y, r_z]

Work in progress, new results will follow…
(Preliminary) simulation results
Simulation based on synthetic data:
• Ground truth: generic camera with non-trivial
smooth distortion functions (deviation from
pinhole camera of order O(10-3) units)
• Registration data simulated directly with
Gaussian noise (no rendering / decoding)
• 3 camera poses, 1024 x 1024 data points
• 40 iterations, about 20 minutes
Ground truth functions and poses
Resulting errors in calibration functions
• Error in ox, oy: O(10-4)
• Error in rx, ry: O(10-6)
• Perhaps regularization is not precise enough?
Evolution of error metric
Eigenvalues of cov. matrix
Alexey Pak, Fraunhofer IOSB / IES Chair, KIT Karlsruhe,
18.09.2015
24
Summary
• The camera calibration problem has a non-trivial mathematical structure
• Popular camera models and the respective calibration procedures have properties that are undesirable for metrological applications
We may better exploit information from active coding patterns:
• Screen pixel positions, their uncertainties, blurring kernel size
Smooth generic camera calibration:
• Universal, differentiable, unbiased camera model
• Explicit control over the smoothness degree of the model
• Optimization covariance and fit quality have physical interpretation!
• Explicit specification of the (problem-dependent) regularization recipe
• Efficient rendering of 3D scenes is possible (fast iterative projection)
• Blurring effects can be consistently quantified and modeled
Thank you for your attention!