16 - Camera Extrinsics & Intrinsics

@October 27, 2022

Goal:

map from world coordinates → camera coordinates

project onto the projection plane

map from projection plane → pixel coordinates

RECALL: rotation

1. Map from World Coordinates → Camera

u, v, n are the axes of the camera coordinate system

⭐

wanna describe the displacement from the camera position $C_w$ to the point $X_w$ (both in world coordinates)

the vector = $X_w-C_w$

then apply a rotation matrix to take $X_w - C_w$ from world coordinates to camera coordinates,

i.e. write the vector in the orthonormal camera basis (u, v, n)

using homogeneous coordinates… ⬇️
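A worked form of step 1, as a sketch. Assumptions not spelled out in the notes: $R$ is the 3×3 rotation whose rows are the camera axes $u, v, n$ written in world coordinates, and $X_c$ is just a name for the point in camera coordinates.

$$
X_c = R\,(X_w - C_w), \qquad R = \begin{bmatrix} u^T \\ v^T \\ n^T \end{bmatrix}
$$

$$
\begin{bmatrix} X_c \\ 1 \end{bmatrix}
= \begin{bmatrix} R & 0 \\ 0^T & 1 \end{bmatrix}
\begin{bmatrix} I & -C_w \\ 0^T & 1 \end{bmatrix}
\begin{bmatrix} X_w \\ 1 \end{bmatrix}
= \begin{bmatrix} R & -R\,C_w \\ 0^T & 1 \end{bmatrix}
\begin{bmatrix} X_w \\ 1 \end{bmatrix}
$$

Read right to left: translate by $-C_w$ first, then rotate.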

2. Project onto Projection Plane

similar triangles, as in lecture 12

we get the 2D point $(x, y)$ in the projection plane
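The similar-triangles result can be written out as follows. This is a sketch assuming the projection plane sits at distance $f$ (the focal length, a symbol not used in the notes so far) along the optical axis, in front of the camera; the lecture may use a different sign convention.

$$
x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}
$$

In homogeneous coordinates the division by $Z$ is postponed:

$$
\begin{bmatrix} fX \\ fY \\ Z \end{bmatrix}
= \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
\qquad
(fX,\ fY,\ Z) \sim \left(f\tfrac{X}{Z},\ f\tfrac{Y}{Z},\ 1\right)
$$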

3. Map from projection plane coordinates → pixel coordinates

units of $(x,y)$  are the SAME as units of world & camera coordinates
- e.g. millimeters

→ transform them to pixel index coordinates

2 issues to deal with:

where is the center? precision problem
- Apply a translation

units are different (mm versus pixel indices)
- Apply a scale transformation

FOR intrinsic parameters, need to know about focal length

Issue 1: Units are different → scale transformation

wanna know how many pixels per millimeter

⭐

NOTE: pixel spacing in x can be different from pixel spacing in y

⬆️ the above scale transformation in 2D homogeneous coordinates:

✅
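One way to write this scale transformation, as a sketch. The symbols $m_x$ and $m_y$ (pixels per millimeter in x and in y) are names assumed here, not taken from the notes.

$$
\begin{bmatrix} m_x\,x \\ m_y\,y \\ 1 \end{bmatrix}
= \begin{bmatrix} m_x & 0 & 0 \\ 0 & m_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
$$

e.g. with a made-up $m_x = 100$ pixels/mm, a point at $x = 0.5$ mm lands 50 pixels from the origin.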

Issue 2: Find the pixel center → translation

$(p_x, p_y)$ denotes the pixel position of the origin of the projection plane coordinates, i.e. it corresponds to $(x, y) = (0,0)$
→ “principal point”
⭐
It might not correspond exactly to the center of the pixel grid.
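The translation by the principal point, and its composition with the scaling from Issue 1, can be sketched like this ($m_x, m_y$ are assumed names for pixels per millimeter in x and y):

$$
\begin{bmatrix} 1 & 0 & p_x \\ 0 & 1 & p_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} m_x & 0 & 0 \\ 0 & m_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} m_x & 0 & p_x \\ 0 & m_y & p_y \\ 0 & 0 & 1 \end{bmatrix}
$$

The scale is the right-hand factor, so it is applied first; $(p_x, p_y)$ is then added in pixel units.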

Let’s put STEP 2 & 3 together…

Recall:
step 2: project to the projection plane
step 3: map to pixel coordinates

“Camera Calibration Matrix `K`”

is the green part

Assume we are starting out with a point in camera coordinates,

and written in homogeneous coordinates $(X, Y, Z, 1)^T$

**3x3 part is invertible but the 3x4 matrix overall is not invertible since it includes projection.**

Why is only one part invertible?

→ because the 3x3 part can be undone: from a pixel you can get back the ray through it in camera coordinates (the point up to scale)

→ but projection throws away depth, so you cannot get back the 3D point itself
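Putting steps 2 and 3 together, one common way to write the result is below. This is a sketch under assumed notation: $f$ is the focal length and $m_x, m_y$ are pixels per millimeter in x and y (names not taken from the notes).

$$
\begin{bmatrix} m_x & 0 & p_x \\ 0 & m_y & p_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} f m_x & 0 & p_x & 0 \\ 0 & f m_y & p_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
= K\,[\,I \mid 0\,],
\qquad
K = \begin{bmatrix} f m_x & 0 & p_x \\ 0 & f m_y & p_y \\ 0 & 0 & 1 \end{bmatrix}
$$

$\det K = f^2 m_x m_y \neq 0$, so $K$ is invertible, while the 3x4 matrix $K\,[\,I \mid 0\,]$ has a column of zeros and cannot be.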

⭐

RECALL: in linear algebra the order matters, here we scale AND THEN translate

ASIDE: A more general model: adding a `shear`

this shear is usually very small; think of the pixel grid as slightly tilted (axes not exactly perpendicular)
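With a shear term the calibration matrix is usually written with one extra entry. A sketch, where $s$ is an assumed name for the shear parameter and $f$, $m_x$, $m_y$ are assumed names for the focal length and the pixels per millimeter in x and y:

$$
K = \begin{bmatrix} f m_x & s & p_x \\ 0 & f m_y & p_y \\ 0 & 0 & 1 \end{bmatrix}
$$

Setting $s = 0$ gives back the model without shear.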

Conclusion: Intrinsic & Extrinsic

a sequence of transformations

extrinsic:

starting from a 4D homogeneous world coordinate,
- translate to get rid of the camera position in world coordinates
- rotate to get to camera coordinates

intrinsic:

once in camera coordinates,
- multiply by camera calibration matrix K

... in the end we get a single 3x4 matrix P
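The whole chain can be sketched in a few lines of plain Python. Everything here is an illustration under assumptions: the helper names (`matmul`, `camera_matrix`, `project`) and all numeric values are made up, `R` is taken to have the camera axes u, v, n as rows, and shear is ignored.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def camera_matrix(K, R, C):
    """Build the 3x4 matrix P = K [R | -R C].

    K: 3x3 calibration matrix, R: 3x3 rotation (rows = camera axes u, v, n
    in world coordinates), C: camera position in world coordinates.
    """
    # Extrinsic part: translate by -C, then rotate, which gives [R | -R C].
    t = [-sum(r * c for r, c in zip(row, C)) for row in R]
    Rt = [row + [ti] for row, ti in zip(R, t)]
    # Intrinsic part: multiply by the calibration matrix.
    return matmul(K, Rt)


def project(P, X_world):
    """Map a 3D world point to pixel coordinates (divide by the third coordinate)."""
    Xh = [[X_world[0]], [X_world[1]], [X_world[2]], [1.0]]
    (wx,), (wy,), (w,) = matmul(P, Xh)
    return wx / w, wy / w


# Made-up example values (not from the lecture):
f, m_x, m_y = 50.0, 10.0, 10.0   # focal length in mm, pixels per mm
p_x, p_y = 320.0, 240.0          # principal point in pixels
K = [[f * m_x, 0.0, p_x], [0.0, f * m_y, p_y], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # camera axes aligned with world axes
C = [0.0, 0.0, -1000.0]          # camera 1000 mm behind the world origin

P = camera_matrix(K, R, C)
pixel = project(P, [100.0, 50.0, 0.0])
```

With these values the point is at $(100, 50, 1000)$ in camera coordinates, so $x = 50 \cdot 100 / 1000 = 5$ mm, which is 50 pixels right of the principal point: `pixel` comes out as `(370.0, 265.0)`.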

Exercises:

💡

NOTE: think of P as a map that

takes a point in the 3D world to a 2D point in camera pixel coordinates

Q1:

ANSWER: the camera position $(C_w, 1)^T$ (b/c it is the one point that has no image)
- in step 1 we subtract the camera position, so $X_w - C_w = [0\ 0\ 0]^T$
- then ANYTHING times $[0\ 0\ 0]^T$ is $[0\ 0\ 0]^T$
  - → so the camera position must give the null space of P!
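A quick numeric check of this answer, as a sketch. The camera below is hypothetical (made-up rotation, position, and calibration values), and `matmul` is a small helper written for this example.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Hypothetical camera: rotated 30 degrees about the world Y axis, placed at C.
theta = math.radians(30.0)
c, s = math.cos(theta), math.sin(theta)
R = [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]
C = [200.0, -50.0, 1000.0]
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]

# P = K [R | -R C]
t = [-sum(r * ci for r, ci in zip(row, C)) for row in R]
P = matmul(K, [row + [ti] for row, ti in zip(R, t)])

# Camera position in homogeneous coordinates, as a column vector.
C_h = [[C[0]], [C[1]], [C[2]], [1.0]]

# P times the camera position: every entry should be zero up to rounding error.
result = [row[0] for row in matmul(P, C_h)]
```

All three entries of `result` vanish (up to floating-point rounding), whatever `K` and `R` are, because the translation step already sends the camera position to the zero vector.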

Q2:
ANSWER: can think of “how to isolate the 1st column?”
we need $[X,Y,Z,W]^T=[1,0,0,0]^T$
what does this vector mean geometrically?
→ a point at infinity in the x direction, so the 1st column of P is the image of that point
💡
recall….

ANSWER: to isolate the 4th column, use $[0,0,0,1]^T$
→ the world origin, so the 4th column of P is the image of the world origin
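Written out, picking a column of P is just multiplying by a unit vector (a sketch, with $P_{i,j}$ denoting the entry in row $i$, column $j$):

$$
P \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
= \begin{bmatrix} P_{1,1} \\ P_{2,1} \\ P_{3,1} \end{bmatrix},
\qquad
P \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} P_{1,4} \\ P_{2,4} \\ P_{3,4} \end{bmatrix}
$$

The left input is the point at infinity along the world x axis; the right input is the world origin.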
ANSWER: think of the red equation
which plane is it?
Ans: the plane in the world that contains the camera center and maps to position $x=0$

ANSWER: camera coordinates
Ans: the plane in the world that contains the camera center and whose normal is the optical axis (the principal plane)
the 3-vector $(P_{3,1}, P_{3,2}, P_{3,3})$ must be in the direction of the optical axis of the camera, since it's normal to the principal plane
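A short derivation of why the third row behaves this way, as a sketch. It assumes $P = K\,[\,R \mid -R\,C_w\,]$, that the rows of $R$ are the camera axes $u, v, n$, and that $n$ is the axis along the optical axis.

$$
\begin{bmatrix} w x \\ w y \\ w \end{bmatrix} = P \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
\qquad
w = P_{3,1} X + P_{3,2} Y + P_{3,3} Z + P_{3,4}
$$

The third row of $K$ is $(0, 0, 1)$, so the third row of $P$ is the third row of $[\,R \mid -R\,C_w\,]$:

$$
(P_{3,1},\ P_{3,2},\ P_{3,3},\ P_{3,4}) = (n^T,\ -n^T C_w)
\quad\Rightarrow\quad
w = n^T (X_w - C_w)
$$

So $w = 0$ is exactly the plane through $C_w$ with normal $n$, and points on it have no finite image.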