Lei Mao


In my previous blog post “2D Line Mathematics Using Homogeneous Coordinates”, we discussed 2D line mathematics.

In this blog post, I would like to discuss 3D line mathematics from a similar perspective.

3D Point Representations

Inhomogeneous Coordinates

The inhomogeneous coordinates for a 3D point are just ordinary three-value Cartesian coordinates.

\[\mathbf{x} = (x, y, z)\]

Augmented Coordinates

The augmented coordinates for a 3D point are just the 3D inhomogeneous coordinates with an additional constant $1$.

\[\bar{\mathbf{x}} = (x, y, z, 1)\]

Homogeneous Coordinates

The homogeneous coordinates are just the augmented coordinates scaled by some value $\tilde{w}$.

\[\begin{align} \tilde{\mathbf{x}} &= \tilde{w} \bar{\mathbf{x}} \\ &= \tilde{w} (x, y, z, 1) \\ &= (\tilde{w}x, \tilde{w}y, \tilde{w}z, \tilde{w}) \\ &= (\tilde{x}, \tilde{y}, \tilde{z}, \tilde{w}) \\ \end{align}\]

where $\tilde{w} \in \mathbb{R}$.

When $\tilde{w} = 0$, $\tilde{\mathbf{x}}$ is called an ideal point and does not have corresponding inhomogeneous coordinates.
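The three representations can be converted into one another with a few lines of code. The following is a minimal sketch in plain Python; the function names `to_augmented`, `to_homogeneous`, and `from_homogeneous` are hypothetical helpers chosen for illustration and do not come from any library.

```python
def to_augmented(x):
    """Append the constant 1 to the inhomogeneous coordinates (x, y, z)."""
    return (*x, 1.0)


def to_homogeneous(x, w):
    """Scale the augmented coordinates by w to get (wx, wy, wz, w)."""
    return tuple(w * c for c in to_augmented(x))


def from_homogeneous(x_tilde):
    """Divide by the last coordinate to recover (x, y, z).

    Ideal points (last coordinate equal to 0) have no inhomogeneous
    coordinates, so they are rejected here.
    """
    *xyz, w = x_tilde
    if w == 0:
        raise ValueError("ideal point has no inhomogeneous coordinates")
    return tuple(c / w for c in xyz)


x = (1.0, 2.0, 3.0)
x_tilde = to_homogeneous(x, w=2.0)
print(x_tilde)                    # (2.0, 4.0, 6.0, 2.0)
print(from_homogeneous(x_tilde))  # (1.0, 2.0, 3.0)
```

Raising an exception for ideal points is only one possible design choice; other code may prefer to return a sentinel value instead.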

3D Line Representation

Given two points $\mathbf{p}$ and $\mathbf{q}$ on a 3D line, both represented using inhomogeneous coordinates, by vector algebra, we could represent any point $\mathbf{r}$ on the 3D line as a linear combination of the two points.

\[\begin{align} \mathbf{r} &= \mathbf{p} + \lambda (\mathbf{q} - \mathbf{p}) \\ &= (1 - \lambda)\mathbf{p} + \lambda \mathbf{q} \end{align}\]

where $\lambda \in \mathbb{R}$. Because any point on the 3D line can be represented using the expression above, the 3D line itself could also be represented using exactly the same expression.

Alternatively, given one point $\mathbf{p}$ on a 3D line and the direction vector $\mathbf{d}$ of the line, both represented using inhomogeneous coordinates, by vector algebra, we could represent any point $\mathbf{r}$ on the 3D line as

\[\begin{align} \mathbf{r} &= \mathbf{p} + \lambda \mathbf{d} \\ \end{align}\]

More concretely, $\mathbf{d}$ plays the role of $\mathbf{q} - \mathbf{p}$ in the previous two-point expression.
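As a quick sanity check, the two representations can be evaluated side by side. This is a minimal sketch in plain Python; the helper names `point_two_point` and `point_direction` are hypothetical, and the sample points are arbitrary.

```python
def point_two_point(p, q, lam):
    """r = (1 - lambda) * p + lambda * q."""
    return tuple((1 - lam) * pi + lam * qi for pi, qi in zip(p, q))


def point_direction(p, d, lam):
    """r = p + lambda * d."""
    return tuple(pi + lam * di for pi, di in zip(p, d))


p = (1.0, 2.0, 3.0)
q = (5.0, 4.0, 9.0)
d = tuple(qi - pi for pi, qi in zip(p, q))  # d = q - p = (4.0, 2.0, 6.0)

print(point_two_point(p, q, 0.5))  # (3.0, 3.0, 6.0)
print(point_direction(p, d, 0.5))  # (3.0, 3.0, 6.0)
```

With $\lambda = 0$ both functions return $\mathbf{p}$, and with $\lambda = 1$ both return $\mathbf{q}$.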

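To obtain the 3D line representation in homogeneous coordinates, we could derive it from the 3D line representation in inhomogeneous coordinates. Let $w_p$, $w_q$, and $w_r$ denote the last components of $\tilde{\mathbf{p}}$, $\tilde{\mathbf{q}}$, and $\tilde{\mathbf{r}}$, all assumed to be non-zero, and let $\tilde{\mathbf{p}}[:3]$ denote the first three components of $\tilde{\mathbf{p}}$.

\[\begin{align} (\mathbf{r}, 1) &= \frac{1}{w_r} \tilde{\mathbf{r}} \\ &= \Big((1 - \lambda)\mathbf{p} + \lambda \mathbf{q}, 1\Big) \\ &= \Big((1 - \lambda) \frac{1}{w_p} \tilde{\mathbf{p}} [:3] + \lambda \frac{1}{w_q} \tilde{\mathbf{q}} [:3], 1\Big) \\ &= (1 - \lambda) \frac{1}{w_p} \tilde{\mathbf{p}} + \lambda \frac{1}{w_q} \tilde{\mathbf{q}} \\ \end{align}\]

The last step holds because the fourth component of the right-hand side is $(1 - \lambda) + \lambda = 1$. Multiplying both sides by $w_r$, we have

\[\begin{align} \tilde{\mathbf{r}} &= (1 - \lambda) \frac{w_r}{w_p} \tilde{\mathbf{p}} + \lambda \frac{w_r}{w_q} \tilde{\mathbf{q}} \\ \end{align}\]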
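The homogeneous expression can be checked numerically by converting the result back to inhomogeneous coordinates. The sketch below is plain Python under the assumption that $w_p$, $w_q$, and $w_r$ are all non-zero; `line_point_homogeneous` and `dehomogenize` are hypothetical helper names.

```python
def dehomogenize(x_tilde):
    """Divide by the last coordinate (assumed non-zero) to recover (x, y, z)."""
    *xyz, w = x_tilde
    return tuple(c / w for c in xyz)


def line_point_homogeneous(p_tilde, q_tilde, lam, w_r=1.0):
    """r~ = (1 - lambda) * (w_r / w_p) * p~ + lambda * (w_r / w_q) * q~."""
    w_p, w_q = p_tilde[-1], q_tilde[-1]
    a = (1 - lam) * w_r / w_p
    b = lam * w_r / w_q
    return tuple(a * pc + b * qc for pc, qc in zip(p_tilde, q_tilde))


p_tilde = (2.0, 4.0, 6.0, 2.0)     # p = (1, 2, 3) scaled by w_p = 2
q_tilde = (20.0, 16.0, 36.0, 4.0)  # q = (5, 4, 9) scaled by w_q = 4

r_tilde = line_point_homogeneous(p_tilde, q_tilde, lam=0.5, w_r=3.0)
print(r_tilde)                # (9.0, 9.0, 18.0, 3.0)
print(dehomogenize(r_tilde))  # (3.0, 3.0, 6.0)
```

The recovered point $(3, 3, 6)$ is the midpoint of $(1, 2, 3)$ and $(5, 4, 9)$, as expected for $\lambda = 0.5$, and the choice of $w_r$ only changes the scale of $\tilde{\mathbf{r}}$.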

Degrees of Freedom

It is not difficult to see that in order to determine a 3D line using either of the representations above, we need 6 parameters. However, a 3D line only has 4 degrees of freedom (DOF), so 4 parameters are enough to determine a 3D line.

Any 3D line that does not pass through the origin is tangent to a unique sphere centered at the origin. The sphere is determined by its radius $r$. The point $\mathbf{m}$ where the 3D line touches the sphere is determined by two additional angles $(\theta, \phi)$, $\mathbf{m} = (r \cos \theta \cos \phi, r \sin \theta \cos \phi, r \sin \phi)$. Finally, because the plane that is tangent to the sphere at $\mathbf{m}$ is unique and contains the 3D line, the orientation of the 3D line within that plane is determined by one additional angle $\omega$. Taken together, a 3D line only has 4 degrees of freedom, and a 3D line is determined uniquely by 4 parameters.

However, it does not necessarily mean the 6-DOF 3D line representation is inferior to the 4-DOF 3D line representation. It really depends on the problem we are going to solve. In many scenarios, the 6-DOF 3D line representation is very convenient for problem solving.
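To make the 4-parameter description concrete, the sketch below turns $(r, \theta, \phi, \omega)$ into a point and a direction of the line. The text does not say from which reference direction $\omega$ is measured, so the tangent-plane basis used here (`e_theta`, `e_phi`) is an assumption made only for illustration, as is the function name `line_from_four_parameters`.

```python
import math


def line_from_four_parameters(r, theta, phi, omega):
    """Return (m, d): the tangent point and a unit direction of the line."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    # Tangent point on the sphere of radius r.
    m = (r * ct * cp, r * st * cp, r * sp)
    # Orthonormal basis of the tangent plane at m (assumed convention).
    e_theta = (-st, ct, 0.0)
    e_phi = (-ct * sp, -st * sp, cp)
    # The line direction is the basis rotated by omega inside the tangent plane.
    co, so = math.cos(omega), math.sin(omega)
    d = tuple(co * a + so * b for a, b in zip(e_theta, e_phi))
    return m, d


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


m, d = line_from_four_parameters(
    r=2.0, theta=math.pi / 3, phi=math.pi / 6, omega=math.pi / 4
)
print(math.isclose(math.sqrt(dot(m, m)), 2.0))  # True: m lies on the sphere
print(abs(dot(m, d)) < 1e-12)                   # True: d is tangent at m
```

The returned pair $(\mathbf{m}, \mathbf{d})$ can be plugged directly into the point-direction form $\mathbf{r} = \mathbf{m} + \lambda \mathbf{d}$.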

Another way to think about the 3D line DOF is to use the point-direction 6-parameter representation and the translational invariance. Because

\[\begin{align} \mathbf{r} &= (r_x, r_y, r_z) \\ &= \mathbf{p} + \lambda \mathbf{d} \\ &= (p_x, p_y, p_z) + \lambda (d_x, d_y, d_z) \\ \end{align}\]

Assuming that every component of $\mathbf{d}$ is non-zero, we have the following equation.

\[\lambda = \frac{r_x - p_x}{d_x} = \frac{r_y - p_y}{d_y} = \frac{r_z - p_z}{d_z}\]

Because of the translational invariance, the value of $(d_x, d_y, d_z)$ cannot change freely. If $d_x$ is determined, $d_y$ and $d_z$ are determined as well. For example, suppose $\mathbf{p} = (p_x, p_y, p_z) = (1, 2, 3)$, and we find a point $\mathbf{r}$ on the 3D line, $\mathbf{r} = (r_x, r_y, r_z) = (5, 4, 9)$. Then the equation becomes

\[\begin{align} \lambda &= \frac{r_x - p_x}{d_x} = \frac{r_y - p_y}{d_y} = \frac{r_z - p_z}{d_z} \\ &= \frac{5 - 1}{d_x} = \frac{4 - 2}{d_y} = \frac{9 - 3}{d_z} \\ &= \frac{4}{d_x} = \frac{2}{d_y} = \frac{6}{d_z} \\ \end{align}\]

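We are allowed to choose $d_x$ freely, say $d_x = 2$. But once $d_x$ is determined, we cannot choose $d_y$ and $d_z$ freely. In this case, when $d_x = 2$, we must have $d_y = 1$ and $d_z = 3$. In other words, for a given 3D line, $\mathbf{d}$ is only determined up to a non-zero scale factor, so one of its three parameters is redundant. By the same token, $\mathbf{p}$ can slide along the 3D line without changing the line, so one of its three parameters is redundant as well. Therefore, of the 6 parameters, 2 are redundant, and the DOF for a 3D line is $6 - 1 - 1 = 4$.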
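The numerical example above can be reproduced with a short script. This is a minimal sketch in plain Python that assumes $r_x \neq p_x$; `direction_from_dx` is a hypothetical helper name.

```python
def direction_from_dx(p, r, d_x):
    """Return the direction d through p and r whose first component is d_x.

    Every valid direction is a multiple of r - p, so fixing d_x fixes the
    scale factor and therefore d_y and d_z. Assumes r[0] != p[0].
    """
    scale = d_x / (r[0] - p[0])
    return tuple(scale * (ri - pi) for pi, ri in zip(p, r))


p = (1.0, 2.0, 3.0)
r = (5.0, 4.0, 9.0)

print(direction_from_dx(p, r, 2.0))  # (2.0, 1.0, 3.0)
print(direction_from_dx(p, r, 4.0))  # (4.0, 2.0, 6.0)
```

Both outputs describe the same line through $\mathbf{p}$ and $\mathbf{r}$; they differ only by a scale factor.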