The assignment was:
Using "the one true way" implement a model of a 35mm camera with a 15mm
focal-length lens.
Use your camera to image points arranged at the corners and edge-centers
of a box about the size of a cereal box (20 points in all, about 12 inches
tall, 8 inches wide, 3 inches thick).
Antialias the images of the points. Make a short movie, animated GIF,
or sequence of still pictures showing your camera viewing the points from
a gradually changing position.
The code:
(These are actually text files, as I couldn't figure out how to make the
web browser display the MATLAB files.)
Pifw.m
Splat.m
gauss2d.m (Bill Baxter,'95)
make_movie.m
The vector function that I had the camera follow for the demo below is:
<40*cos(t), 40*sin(t), 4*t + 50*cos(t)^2 - 40>
as t varies from about 0 to 10, with the cereal box somewhere in the middle.
Additionally, the camera always faces the origin.
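As a rough sketch of that path (written in Python here rather than MATLAB, since the contents of make_movie.m are not shown; the names camera_position and view_direction are made up for illustration), the eye position and the unit direction toward the origin could be computed like this:

```python
import math

def camera_position(t):
    """Eye position on the path <40cos t, 40sin t, 4t + 50cos^2 t - 40>."""
    return (40 * math.cos(t),
            40 * math.sin(t),
            4 * t + 50 * math.cos(t) ** 2 - 40)

def view_direction(t):
    """Unit vector from the eye toward the origin (the camera faces the origin)."""
    x, y, z = camera_position(t)
    length = math.sqrt(x * x + y * y + z * z)  # never 0: x^2 + y^2 is always 1600
    return (-x / length, -y / length, -z / length)

# Sample the path at 5 evenly spaced parameter values from t = 0 to t = 10.
frames = [camera_position(10 * k / 4) for k in range(5)]
```

At t = 0 this puts the eye at <40, 0, 10>, and the eye stays 40 units from the z-axis for every t.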
Assn1demo2.avi is the demo.
Assn1demo3.avi is another. Enjoy!
(The AVIs play haltingly unless you save them to disk.)
My thoughts on the second part of the assignment.
Imagine the system that leads to some frame. The way I see it, as the eye
moves away from the object, the object's projection on the screen becomes
smaller. But that is only true if the image plane vectors a and b remain
constant. If we keep the focal length fixed and make the magnitudes of a
and b vary inversely with the distance from the eye to the origin, then as
we move along a line that passes through the origin (where the camera is
facing), the image would not change. So, if we're given 3D coordinates and
the (normalized) 2D coordinates they map to on the image plane, I think
that the projection matrix and center of projection can be determined only
up to a constant, namely the distance from the origin. The answer would be
of the form "It's along this line (positive and negative scalars times a
direction vector) that you could expect an image like that." And having a
sequence of images doesn't help, unless we are given the distance the eye
travels between two consecutive frames. If we had that, we could quantify
the parallax (does that even make sense as a phrase?) and triangulate the
center of projection used in each image. It may be a different story,
however, if the screen coordinates are provided before division by w (or t
or whatever).
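The scaling argument can be checked numerically. The sketch below is a toy setup, not the code from Pifw.m: the eye sits on the negative z-axis looking at the origin, a and b are axis-aligned with a common length `pixel`, the image origin is at the principal point, and every name is hypothetical. It suggests that the image is exactly unchanged for points in the plane through the origin perpendicular to the viewing direction, while points in front of or behind that plane still shift a little.

```python
def project(point, eye_dist, focal, pixel):
    """Image coordinates (i, j) of a 3D point for a toy pinhole camera.

    Assumed setup: eye at (0, 0, -eye_dist) looking along +z at the origin,
    a = (pixel, 0, 0), b = (0, pixel, 0), image plane at distance focal.
    """
    x, y, z = point
    w = (z + eye_dist) / focal          # depth in units of the focal length
    return (x / pixel / w, y / pixel / w)

def pixel_for(eye_dist, k=1.0):
    """Make |a| = |b| vary inversely with the eye-to-origin distance."""
    return k / eye_dist

in_plane = (4.0, 6.0, 0.0)    # lies in the plane z = 0 through the origin
off_plane = (4.0, 6.0, 1.5)   # 1.5 units behind that plane

# Same two points seen from eye distances 50 and 100 with rescaled a and b.
near = [project(p, 50.0, 15.0, pixel_for(50.0)) for p in (in_plane, off_plane)]
far = [project(p, 100.0, 15.0, pixel_for(100.0)) for p in (in_plane, off_plane)]
```

Comparing `near[0]` with `far[0]` and `near[1]` with `far[1]` shows which points keep their image coordinates as the eye backs away.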
But disregard eye coordinates and scale for a moment. Three linearly
independent vectors in 3-space define the cereal box. These three can
further be thought of as combinations of the orthonormal basis set. The
projective 3x3 matrix has as its column vectors the projective transforms
of this orthonormal basis set. So getting a 3x3 that accomplishes the
projective transform of the cereal box is trivial (or am I thinking only of
parallel projection?). Then the kernel of this transformation, meaning here
all points that get transformed to the origin of the image plane (<0,0,1>),
is the line (through the origin) on which the center of projection lies.
And that's a start at least...
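To make the column-vector picture concrete, here is a small sketch with made-up vectors a, b, and o (the numbers are not from the assignment, and o is assumed here to be the vector from the eye to the image-plane origin). It applies the inverse of the matrix whose columns are a, b, and o by way of cross products, and suggests that every multiple of o lands on the image origin after the divide by w.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def to_image(a, b, o, x):
    """Solve u*a + v*b + w*o = x for (u, v, w) by Cramer's rule."""
    det = dot(a, cross(b, o))           # nonzero when a, b, o are independent
    return (dot(cross(b, o), x) / det,
            dot(cross(o, a), x) / det,
            dot(cross(a, b), x) / det)

# Hypothetical, deliberately non-orthogonal example vectors.
a, b, o = (2.0, 0.0, 0.0), (1.0, 3.0, 0.0), (1.0, 1.0, 5.0)

on_line = tuple(3.0 * c for c in o)     # a point on the line spanned by o
u, v, w = to_image(a, b, o, on_line)    # u = v = 0, so (u/w, v/w) is the image origin
```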
Once we've determined the 3x3 and the line along which the COP lies, it
would seem that with a little more work we could determine the focal length
and the a and b vectors, which would in turn give us the exact COP,
finishing the problem. But our matrix represents a linear transformation
from 3-space to 2-space. It does not have an inverse (its inverse would
have as columns a, b, and o). Oh, that hurts! But think about orthogonal
projection: parallel lines remain parallel on the screen, and parallel
lines of equal length have equal length in the projected image too. So
once we have the 3x3, maybe we could plug in test cases (or pull them from
the cereal box data) to see just how skewed the images are, which could
tell us about focal length. And we could also determine at least the
directions of the a and b vectors that span the image plane. But barring
the parallax information mentioned above, I don't think we could pull the
exact COP and the a and b magnitudes out of the information. For his PhD
work, though, I think Wei-Chao Chen knows otherwise, muting my ignorant
pessimism.
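The "how skewed" test can be illustrated with the box's 3-inch thickness. In the toy sketch below (eye on the viewing axis, edges parallel to the image plane, pixel size taken as 1, all function names hypothetical), two parallel 8-inch edges project to equal lengths under orthographic projection but not under perspective. In this setup the focal length cancels out of the length ratio, so the ratio by itself says more about the eye distance than about the focal length.

```python
def perspective_length(edge_len, depth, focal):
    """Projected length of an edge parallel to the image plane at a given depth."""
    return focal * edge_len / depth

def orthographic_length(edge_len, depth):
    """Orthographic projection ignores depth entirely."""
    return edge_len

def front_back_ratio(eye_dist, thickness, focal):
    """Ratio of projected lengths of the front and back 8-inch edges."""
    front = perspective_length(8.0, eye_dist, focal)
    back = perspective_length(8.0, eye_dist + thickness, focal)
    return front / back  # equals (eye_dist + thickness) / eye_dist

ratio_close = front_back_ratio(40.0, 3.0, 15.0)   # eye 40 units from the front face
ratio_far = front_back_ratio(400.0, 3.0, 15.0)    # much farther: ratio approaches 1
```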