COMP290-89-Final

Final Project: Calibrating a Hyperbolic Mirror Camera.

Inspired by "Calibration of a fish eye lens with field of view larger than 180", CVWW 2002
Hynek Bakstein and Tomas Pajdla

Pictures of the setup. Top row: the hypothesized hyperbolic camera and the calibration
structure, a cylinder with a known pattern. Bottom: the calibration photo used (original size:
1920x2560).

CODE

In this project, I designed a model to determine the parameters of a 360 degree view
camera attachment. I will begin by presenting the basic algorithm for optimization (due
to the Bakstein paper mentioned above). Then I will present my contribution, which
is a different non-linearity in the reprojection step. Results will then be presented.

The Setup.

The setup is as follows. In order to calibrate a camera with an image, we must know
the 3D structure of the scene imaged. So we build a case where we know points in
3-space and we know the corresponding imaged points on the calibration image. I built
a hollow cylinder with a measured grating pattern on the inside. Measuring the radius
of the cylinder, I then calculated the 3D coordinates of the points, relative to the point
along the cylinder axis where I thought the center of projection of the camera was. The
origin of the 3D coordinate space was chosen in order to have an easier initialization (a
good initialization is necessary for most objective functions).

A model with some number of parameters is then used to project the known 3D points
into the image. If the model and its parameters are correct, the projected points should
closely match the points we've identified in the image, up to inaccuracies in our
identified points (both in 2D and 3D). The Levenberg-Marquardt algorithm is used
to choose the optimal parameters for the model, minimizing the error associated with
a projection, where the error is simply the sum of the pixel distances between the
projected and calibrated points (the ones the user defines to be the images of the
3D points).
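
As a minimal sketch of the error just described (the function names are hypothetical and are
not taken from the project code), the per-point pixel distance, its sum, and the
sum-of-squares variant can be written as:

```python
import math

def reprojection_errors(projected, observed):
    """Pixel distance between each projected point and its hand-selected image point."""
    return [math.hypot(pu - ou, pv - ov)
            for (pu, pv), (ou, ov) in zip(projected, observed)]

def total_error(projected, observed):
    """Sum of the per-point pixel distances."""
    return sum(reprojection_errors(projected, observed))

def sum_of_squares(projected, observed):
    """Sum of squared pixel distances (the quantity a least-squares solver minimizes)."""
    return sum(e * e for e in reprojection_errors(projected, observed))
```

Both lists are assumed to hold (u, v) pixel pairs in the same order, one pair per
calibration point.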

The Model.

The actual projection model I used is defined here. We begin projection by
reorienting the world coordinates into the camera coordinate system, via a rotation
and translation,

X' = R*X + T
where R and T have 3 degrees of freedom each. The X' point is then non-linearly
projected via the hyperbola (described below) to a new point in 3D in the camera
coordinate system. This point has the relation to the camera shown here.

Computing the coordinates (x',y',z'), we can derive theta. We then use a stereographic
projection r = k*tan(theta/2) to get the radius of the point (distance to the image point
that is on the optical axis). The phi in the image plane is also computed from (x',y',z'),
and (u',v',1) in orthogonal image coordinates is then determined. Finally, this image
point is multiplied by an intrinsics matrix K, with skew assumed zero.

As can be seen above, u_0 and v_0 are the optical axis point, and beta is a scalar
measuring the squareness of the pixels. The final reprojected image points are then
given by K*(u',v',1).
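
The last two steps can be sketched as follows. This is an illustration under stated
assumptions, not the project code: the optical axis is taken to be the +z axis of the camera
frame, theta is measured from that axis, and K is assumed to have the form
[1 0 u_0; 0 beta v_0; 0 0 1]. The function name is hypothetical.

```python
import math

def project_to_image(xc, yc, zc, k, u0, v0, beta):
    """Stereographic projection of a camera-frame point, followed by K."""
    # Angle between the ray and the optical axis (assumed to be +z).
    theta = math.atan2(math.hypot(xc, yc), zc)
    # Angle of the point around the optical axis, reused in the image plane.
    phi = math.atan2(yc, xc)
    # Stereographic projection: distance from the optical-axis point.
    r = k * math.tan(theta / 2.0)
    u_prime = r * math.cos(phi)
    v_prime = r * math.sin(phi)
    # Assumed K = [1 0 u0; 0 beta v0; 0 0 1], zero skew.
    return (u_prime + u0, beta * v_prime + v0)
```

Under these assumptions a point on the optical axis lands on (u_0, v_0), and a ray at 90
degrees to the axis lands k pixels away from it, since tan(45 degrees) = 1.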

The hyperbolic projection in the middle of all of this is simplified by the assumption
that the shape of the mirror is rotationally symmetric about the optical axis. With this
assumption, the reflection off of the mirror can be computed in the plane containing the
optical axis and the point, for every point independently. Think of the coordinate X'
above, immediately after the rotation and translation. Consider the interaction in the
plane where the second coordinate runs along the optical axis and the first coordinate
is the point's distance to the optical axis, sqrt(x^2 + y^2).

In cylindrical coordinates, then, the point is (sqrt(x^2 + y^2), z) = (Q, z). The camera
is at (0,0), and the other focus of the hyperbola (the one the camera is not at) is at
(0,2c). With hyperbola parameters a and c, the viewed point must lie both on the line
between (0,2c) and the point in space and on the hyperbola:

pt = (0,2c) + t*(Q, z - 2c), t in [0,1]
dist((0,0),pt) - dist((0,2c),pt) = 2a

Solving for t gives perhaps two solutions. The one I've found to always be on the
correct hyperbola is

t = N / D, where
N = (c^2 - a^2) * (c*z - 2c^2 + a*sqrt(4c^2 + Q^2 - 4cz + z^2))
D = a^2*(4c^2 + Q^2 - 4cz + z^2) - c^2*(z - 2c)^2

Thus t is computed and the point in the plane that is actually viewed (namely,
(0,2c) + t*(Q, z - 2c)) is determined. Theta is computed and the projection
proceeds as specified above.
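
The closed form for t can be checked numerically. The sketch below (function names are
hypothetical) evaluates the expression and confirms that the resulting point satisfies the
hyperbola constraint; the sample values a = 1, c = 2, Q = 5, z = 1 are made up for
illustration and are not the calibrated parameters.

```python
import math

def hyperbola_t(Q, z, a, c):
    """Line parameter t of the viewed point, using the closed form from the text."""
    # Squared distance from the focus (0, 2c) to the point (Q, z).
    L2 = 4 * c * c + Q * Q - 4 * c * z + z * z
    num = (c * c - a * a) * (c * z - 2 * c * c + a * math.sqrt(L2))
    den = a * a * L2 - c * c * (z - 2 * c) ** 2
    return num / den

def mirror_point(Q, z, a, c):
    """Point (0, 2c) + t*(Q, z - 2c) in the plane through the optical axis."""
    t = hyperbola_t(Q, z, a, c)
    return (t * Q, 2 * c + t * (z - 2 * c))

# Illustrative values only (an assumption, not from the calibration).
a, c = 1.0, 2.0
px, pz = mirror_point(5.0, 1.0, a, c)
# dist to the camera at (0, 0) minus dist to the focus at (0, 2c) should be 2a.
difference = math.hypot(px, pz) - math.hypot(px, pz - 2 * c)
```

For these sample values the difference of distances comes out equal to 2a up to rounding,
which is the constraint the expression was derived from.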

Some quick parameter counting before we move to results. There are 6
parameters in the initial rigid transformation (3 rotations and 3 translations)
and 3 in the K matrix (u_0, v_0 and beta). The c and a variables determine the
spread of the hyperbola, and the k scalar used in stereographic projection
defines the number of pixels per tangent unit (whatever those are).
The total is then 12.

The Results.

There were 50 calibration points that I selected in the image. In the best case,
the parameters found were (the initial values are in the code):

x,y,z - rotations (in radians): 0.015273 -0.014428 -0.090647
x,y,z - translations (in .5 inch units): 0.167028 0.325886 -0.422814
u_0, v_0 (in pixels), beta: 961.58 1305.84 1.008926
c, a (in .5 inch units), k: 3.97223 4.32154 6500.14

Obviously, the rotation and translation are specific to this image and so are
probably not applicable to the mechanism itself, though with a complicated
objective space one never knows intuitively where the optimization will go
(some of the slack in a less-sensitive variable may be hidden in another,
more sensitive one).

Figure showing the errors on the image (left), and a graph of the pixel error versus point
number. Notice that the error does not show systematic dependence. The errors on
the left have been multiplied by 5, for better visibility. The median error is 9.4 pixels.

To get the c, a and k specified above, I reoptimized with respect to k, then c
and a. The non-linear optimization would not move the variables much as long
as they were within a reasonable range. This led me to believe the objective
function is bumpy. So I wrote the Optimize_* code (see top) to test a large
range of initial values and choose the one with the best sum of squared distances
of the reprojection error. The non-linear optimization started from the best
initial value then gave the final answer. This gave some improvement to the
errors. Also, running the steps again (k with the best new c and a) did not
improve things (more justification for a flat but bumpy objective space).
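
The idea of scanning initial values and refining only the best one can be sketched as
below. This is not the Optimize_* code: the objective is a made-up toy function with
ripples, and the local refinement is a simple coordinate pattern search standing in for
the non-linear optimizer. All names are hypothetical.

```python
import itertools
import math

def bumpy(p):
    """Toy objective: a bowl centered at (3.1, -0.9) with ripples along x."""
    x, y = p
    u = x - 3.1
    return u * u + (y + 0.9) ** 2 + 2.0 * (1.0 - math.cos(2.0 * math.pi * u))

def local_refine(f, start, step=0.5, tol=1e-6):
    """Coordinate pattern search; a stand-in for the non-linear optimizer."""
    x = list(start)
    fx = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5  # no neighbor is better, so look closer
    return x, fx

def multi_start(f, grids):
    """Evaluate f at every combination of initial values, then refine the best."""
    best_init = min(itertools.product(*grids), key=lambda p: f(list(p)))
    return local_refine(f, best_init)

x_grid = [0.5 * i for i in range(13)]   # 0.0 .. 6.0
y_grid = [float(j) for j in range(-3, 4)]
multi_x, multi_f = multi_start(bumpy, [x_grid, y_grid])
single_x, single_f = local_refine(bumpy, (0.0, 0.0))
```

On this toy function the single start from (0, 0) settles in a nearby ripple, while the
scan over initial values starts the refinement in the basin of the lowest minimum.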

I feel the majority of errors can be attributed to my selection of points.
First, the points in 3-space may not be very accurately known. They were
all selected to be on circular cuts through the cylinder, but there may be slight
variations that, in the high-resolution (and somewhat blurred) photo
(1920x2560) and after hyperbolic reflection (and with the optical axis perhaps
a bit off center), become a many-pixel deviation from circular in the image. So
I end up with a calibration where the declared 3D points would lead to concentric
circles, but their images (and certainly my point selection) weren't such circles.

Figure: two example errors. On the left is my hand selection, and on the right is the best fit
model. One could argue that my point selection is wrong by at least half the error in the best
reprojection. At such blur levels, I feel this is a better measure than pixel distances (of
course, anyone with bad pixel differences says that).

Another assumption made, that I feel was reasonable, was that the center of
projection of the camera was at a focus of the hyperbola (this is implicit in
having the camera at (0,0) and the focus of the other half-hyperbola at (0,2c)).

A final attempt at a slightly better error was to take the optimum found using
all the points, keep only those points that were inliers, and start over. Of course,
this should converge to the same solution, and it did, with slightly lower error.
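
How the inliers were chosen is not stated here. One possible rule, shown purely as an
assumption, keeps the points whose reprojection error is at most some multiple of the
median error; the function name, the factor of 2, and the sample errors are hypothetical.

```python
import statistics

def select_inliers(errors, factor=2.0):
    """Indices of points whose error is at most factor times the median error."""
    cutoff = factor * statistics.median(errors)
    return [i for i, e in enumerate(errors) if e <= cutoff]

# Made-up per-point pixel errors, for illustration only.
sample_errors = [8.0, 9.4, 10.0, 40.0, 9.0]
kept = select_inliers(sample_errors)
```

The calibration would then be rerun using only the 3D points and image points at the
kept indices.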

Ideally, there should be more calibration points. Then one could optimize over
the hand placement while purging outliers. Overall however, I feel the results are
decent and the model of the 360 camera setup is reasonable.