Part 1: Ray tracing: Computing ray-circle intersections
Part 2: Ray tracing: Computing ray-sphere intersections
(Code for this post is available on GitHub.)
Whereas the previous post was about 2D ray-circle intersection, this one is about 3D ray-sphere intersection. We'll extend the circle standard equation to a sphere, and switch from scalars to vectors for defining both lines and spheres. Using vectors, we can generalize and perform intersection calculations for any dimension in a succinct manner.
Except for the added third dimension, the equation of a sphere centered at a position \(c\) is identical to that of the circle. Its equation still follows from the Pythagorean theorem: $$(x - c_x)^2 + (y - c_y)^2 + (z - c_z)^2 = r^2$$ With the sphere centered at origin \((0,0,0)\), the \(c\) terms disappear and the equation becomes $$x^2 + y^2 + z^2 = r^2$$
The \(y = ax + b\) definition of a line in 2D doesn't generalize to 3D. In 2D the equation expresses that a change in one dimension may cause a change in at most one other dimension. In 3D, however, a change in one dimension may cause a change in two other dimensions.
Instead, in 3D a line is expressed as a starting position \(p_0\) and an amount of movement \(t\) in some direction \(p_1\). Statements about position and direction are best expressed in terms of vectors and operations on vectors: $$\vec p = \vec{p_0} + t\vec{p_1}$$ Different values of \(t\), for time, result in a line fixed at origin and a position vector pointing at different points on the line (a position vector and a point is synonymous). Different values of \(t\) has the effect of scaling the position vector along the line, allowing us to get to any position on the line.
In ray tracing we're only interested in the half-line where \(t > 0\). That half-line represents a ray traveling from the camera into the scene, and is why we refer to it as a ray instead of a line. \(t = 0\) would be the position of the camera and \(t < 0\) a ray traveling out back of the camera.
As an example, we can determine the parametric equation of the line passing through \((1,-1,4)\), the position of the camera, in the direction of \((3,5,-1)\), a scene object. We do so by substituting the two vectors into the vector equation for a line: $$\begin{eqnarray*} \vec p = (1,-1,4) + t(3,5,-2)\\ \vec p = (1,-1,4) + (3t,5t,-2t)\\ (x,y,z) = (1+3t,-1+5t,4-2t) \end{eqnarray*}$$ For ray-sphere intersections, we're after values of \(t\) describing points of intersection. We then substitute those into the parametric equation of a line to get the \((x,y,z)\) coordinate of intersections.
As an example, suppose we have on standard form the parametric equation of a line \((10 + 2t, 5 + t, 2)\) and the equation of a sphere \(x^2 + y^2 + z^2 = 9\). As with 2D intersections, we're after values for \(x\), \(y\), and \(z\) that when substituted into both make their equality hold: $$\begin{eqnarray*} (10 + 2t)^2 + (5 + t)^2 + (2)^2 = 9 & & \textrm{\{substitute in line equation\}}\\ (10 + 2t)(10 + 2t) + (5 + t)(5 + t) + 4 = 9 & & \textrm{\{expand squares\}}\\ 100 + 20t + 20t + 4t^2 + 25 + 5t + 5t + t^2 + 4 = 9 & & \textrm{\{expand parenthesis\}}\\ 5t^2 + 50t + 129 = 9 & & \textrm{\{simplify and arrange in \(t^2\), \(t\), and constants order\}}\\ 5t^2 + 50t + 120 = 0 & & \textrm{\{move 9 to left side to arrive at the standard form of quadratic equation\}} \end{eqnarray*}$$ We went from one sphere equation with three unknowns, \(x\), \(y\), and \(z\) to a quadratic equation with one unknown \(t\). Solving for \(t\) means applying the same standard method as earlier: $$\begin{eqnarray*} t = \dfrac{-b \pm \sqrt{d}}{2a} \textrm{, where \(d = b^2 - 4ac\)}\\ d = 50 - 4(5)(120) = 100\\ t_1 = \dfrac{-50 + \sqrt{100}}{2(5)} = \dfrac{-50 + 10}{10} = \dfrac{-40}{10} = -4\\ t_2 = \dfrac{-50 - \sqrt{100}}{2(5)} = \dfrac{-50 - 10}{10} = \dfrac{-60}{10} = -6\\ \end{eqnarray*}$$ Substituting these \(t\) values into the parametric vector equation of the line, we arrive at the points of line-sphere intersection: $$\begin{eqnarray*} \vec{p_1} = (x_1,y_1,z_1) = (10 + 2(-4), 5 + (-4), 2) = (2,1,2)\\ \vec{p_2} = (x_2,y_2,z_2) = (10 + 2(-6), 5 + (-6), 2) = (-2,-1,2) \end{eqnarray*}$$
For additional details on calculating line-sphere intersection, refer to this video.
Above we established the standard vector equation for a line. For a sphere centered at \(c\) with radius \(r\), its corresponding standard vector equation is below: $$(\vec p - \vec c) \cdot (\vec p - \vec c) = r^2$$ \(\vec p\) represents every position vector on the sphere and \(\cdot\) is the dot product of two vectors. The term product in general refers to the result of multiplication, but unlike the single product definition for scalars, vectors have both dot product and cross product. We're only interested in the dot product, defined as the sum of the products of the corresponding entries, making the result a scalar, not another vector.
In the equations to follow, whether scalar product or dot product is meant can always be inferred based on whether both operands are scalars or both are vectors. So strictly speaking, we don't need the \(\cdot\) symbol, but it's convention.
At first sight, the standard vector equation of a sphere looks nothing like its scalar counterpart. But note the dot product's definition: $$\vec u \cdot \vec v = {u_x}{v_x} + {u_y}{v_y} + {u_z}{v_z} = r^2$$ If \(\vec{u} = \vec{v}\), as is the case for the vector sphere definition, expanding the dot product turns it into the scalar standard equation of a sphere. So while at first sight the dot product and sphere seem unrelated, defining the sphere in terms of the dot product is a handy shortcut: $$\vec u \cdot \vec u = {u_x}{u_x} + {u_y}{u_y} + {u_z}{u_z} = u_{x}^2 + u_{y}^2 + u_{z}^2 = x^2 + y^2 + z^2 = r^2$$ As with the scalar solution to computing intersections, where we substituted the line equation into the circle equation, we can replace \(\vec p\) in the sphere equation by the vector equation of the line (don't be confused by the use of \(\vec p\) in both definitions. Those are different \(\vec p\)s): $$(\vec{p_0} + \vec{p_1}t - \vec c) \cdot (\vec{p_0} + \vec{p_1}t - \vec c) - r^2 = 0$$ Through rewriting and simplification, the goal is to turn this complicated looking equation into a quadratic one and solve for \(t\). For an actual line and sphere, \(\vec{p_0}\), \(\vec{p_1}\), \(\vec c\), and \(r\) are known quantities, leaving \(t\) the only unknown:
$$\begin{eqnarray*} (\vec{p_0} \cdot \vec{p_0} + \vec{p_0} \cdot \vec{p_1}t - \vec{p_0} \cdot \vec c) + (\vec{p_1}t \cdot \vec{p_0} + \vec{p_1}t \cdot \vec{p_1}t - \vec{p_1}t \cdot \vec c) + (-\vec c \cdot \vec{p_0} - \vec c \cdot \vec{p_1}t + \vec c \cdot \vec c) - r^2 = 0 & & \textrm{\{expand dot product into 3 * 3 + 1 terms\}}\\ \vec{p_0} \cdot \vec{p_0} + \vec{p_0} \cdot \vec{p_1}t - \vec{p_0} \cdot \vec c + \vec{p_1}t \cdot \vec{p_0} + \vec{p_1}t \cdot \vec{p_1}t - \vec{p_1}t \cdot \vec c - \vec c \cdot \vec{p_0} - \vec c \cdot \vec{p_1}t + \vec c \cdot \vec c - r^2 = 0 & & \textrm{\{remove parenthesis included as a visual aid\}}\\ \vec{p_0} \cdot \vec{p_0} + \vec{p_0} \cdot \vec{p_1}t - \vec{p_0} \cdot \vec c + \vec{p_1}t \cdot \vec{p_0} + t^2(\vec{p_1} \cdot \vec{p_1}) - \vec{p_1}t \cdot \vec c - \vec c \cdot \vec{p_0} - \vec c \cdot \vec{p_1}t + \vec c \cdot \vec c - r^2 = 0 & & \textrm{\{factor out \(t\) in term where it appears twice\}}\\ t^2(\vec{p_1} \cdot \vec{p_1}) + t(\vec{p_0} \cdot \vec{p_1} + \vec{p_1} \cdot \vec{p_0} - \vec{p_1} \cdot \vec c - \vec c \cdot \vec{p_1}) + \vec{p_0} \cdot \vec{p_0} - \vec{p_0} \cdot \vec c - \vec c \cdot \vec{p_0} + \vec c \cdot \vec c - r^2 = 0 & & \textrm{\{factor out \(t\) in terms where it appears once\}}\\ t^2(\vec{p_1} \cdot \vec{p_1}) + t\vec{p_1}(\vec{p_0} + \vec{p_0} - \vec c - \vec c) + \vec{p_0} \cdot \vec{p_0} - \vec{p_0} \cdot \vec c - \vec c \cdot \vec{p_0} + \vec c \cdot \vec c - r^2 = 0 & & \textrm{\{factor out common \(\vec{p_1}\) in \(t(...)\) part\}}\\ t^2(\vec{p_1} \cdot \vec{p_1}) + t\vec{p_1}(2\vec{p_0} - 2\vec c) + \vec{p_0} \cdot \vec{p_0} - \vec{p_0} \cdot \vec c - \vec c \cdot \vec{p_0} + \vec c \cdot \vec c - r^2 = 0 & & \textrm{\{group similar element inside \(t(...)\) part\}}\\ t^2(\vec{p_1} \cdot \vec{p_1}) + 2t \vec{p_1} \cdot (\vec{p_0} - \vec c) + \vec{p_0} \cdot \vec{p_0} - \vec{p_0} \cdot \vec c - \vec c \cdot \vec{p_0} + \vec c \cdot \vec c - r^2 = 0 & & \textrm{\{apply distributive law inside \(t(...)\) part: \(ab - ac = a(b - c)\) and remove extra parenthesis\}}\\ t^2(\vec{p_1} \cdot \vec{p_1}) + 2t \vec{p_1} \cdot (\vec{p_0} - \vec c) + \vec{p_0}^2 + \vec{c}^2 - 2\vec{p_0} \cdot \vec{c} - r^2 = 0 & & \textrm{\{group similar elements in constants part\}}\\ t^2(\vec{p_1} \cdot \vec{p_1}) + 2t \vec{p_1} \cdot (\vec{p_0} - \vec c) + (\vec{p_0} + \vec c)^2 - r^2 = 0 & & \textrm{\{apply binomial square law to constants part: \(a^2 + b^2 - 2ab = (a - b)^2\)\}}\\ \end{eqnarray*}$$ Or in terms of the standard form of a quadratic equation: \(at^2 + bt + c\), we get \begin{eqnarray*} a = \vec{p_1} \cdot \vec{p_1}\\ b = 2\vec{p_1} \cdot (\vec{p_0} - \vec c)\\ c = (\vec{p_0} + \vec c)^2 - r^2 \end{eqnarray*} Because the dot product of two vectors results in a scalar, once we substitute in the known elements (\(\vec{p_0}\), \(\vec{p_1}\), \(\vec c\), and \(r\)) and simplify, we're left with a quadratic equation.
Continuing with the example from above in which we have \(\vec{p_0} = (10,5,2)\), \(\vec{p_1} = (2,1,0)\), \(\vec c = (0,0,0)\), and \(r^2 = 9\): $$\begin{eqnarray*} t^2((2,1,0) \cdot (2,1,0)) + 2t(2,1,0) \cdot ((10,5,2)-(0,0,0)) + ((10,5,2) + (0,0,0) \cdot ((10,5,2) - (0,0,0)) - 3^2 = 0\\ t^2(5) + 2t(2,1,0) \cdot (10,5,2) + ((10,5,2) \cdot (10,5,2)) - 9 = 0\\ t^2(5) + 2t(2,1,0) \cdot (10,5,2) + (129) - 9 = 0\\ t^2(5) + 2t(25) + 120 = 0\\ t^2(5) + t(50) + 120 = 0\\ 5t^2 + 50t + 120 = 0 \end{eqnarray*}$$ which shows that the scalar and vector editions yield identical results.
The simplification steps require memorization of rules. Or have a site like Symbollab.com assist in identifying a rule by name and with simplification.
While the quadratic equation with \(a\), \(b\), and \(c\) above is mathematically correct, it may be further optimized for computer execution. For one, the \(\vec{p_0} - \vec c\) expression is computed twice: once for \(b\) and once for \(c\). Extracting it into \(\mathrm{oc}\) for distance between origin and center yields: \begin{eqnarray*} \mathrm{\vec{oc}} = \vec{p_0} - \vec c\\ a = \vec{p_1} \cdot \vec{p_1}\\ b = 2\vec{p_1} \cdot \mathrm{\vec{oc}}\\ c = (\vec{\mathrm{oc}})^2 - r^2 \end{eqnarray*}
Another more elaborate optimization involves getting rid of coefficients during computation of roots, saving three multiplication operations for each root computation. We start by defining \(d\), not to be confused by the discriminant as $$\begin{eqnarray*} d = \vec{p_1} \cdot \mathrm{\vec{oc}} \end{eqnarray*}$$ From the definition of \(b\) it follows that \(b = 2d\). $$\begin{eqnarray*} \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a} & & \textrm{\{Substitute \(b\) with \(2d\)}\}\\ \dfrac{-2d \pm \sqrt{(2d)^2 - 4ac}}{2a} & & \textrm{\{Apply exponential rule: \((a \cdot b)^n = a^{n}b^n\)}\}\\ \dfrac{-2d \pm \sqrt{4d^2 - 4ac}}{2a} & & \textrm{\{Factor out common term 4}\}\\ \dfrac{-2d \pm \sqrt{4(d^2 - ac)}}{2a} & & \textrm{\{Apply radical rule: \(\sqrt[\uproot{3}n]{ab} = \sqrt[\uproot{3}n]{a} \sqrt[\uproot{3}n]{b}\), assuming \(a \geq 0\), \(b \geq 0\)}\}\\ \dfrac{-2d \pm 2\sqrt{d^2 - ac}}{2a} & & \textrm{\{Factor out common term \(2\)}\}\\ \dfrac{2(-d \pm \sqrt{d^2 - ac})}{2a} & & \textrm{\{Divide the numbers: \(\frac{2}{2} = 1\)}\}\\ \dfrac{-d \pm \sqrt{d^2 - ac}}{a} & & \textrm{}\\ \end{eqnarray*}$$ The result is a new formula for computing roots specific to \(a\), \(b\), \(c\) as derived for ray-sphere intersection. In summary, here're the final definitions for efficient use in code: \begin{eqnarray*} \mathrm{\vec{oc}} = \vec{p_0} - \vec c\\ a = \vec{p_1} \cdot \vec{p_1}\\ b = \vec{p_1} \cdot \mathrm{\vec{oc}}\\ c = (\vec{\mathrm{oc}})^2 - r^2\\ \textrm{discriminant} = b^2 - ac \end{eqnarray*}
In computer graphics, we almost always want equations expressed in terms of vectors. It gives us the ability to treat the \(x\), \(y\), and \(z\) components of a vector as a single unit, making for shorter and clearer equations. Similarly, defining and applying vector operations in code make it shorter and less error-phone. It's like in object oriented programming when we create a class to group related data and operations.
Having now derived the ray-sphere intersection equation, we have the mathematics in place to create a first ray tracer; one that shoots rays from the camera, through each pixel, and into the scene where it potentially intersects with a sphere.