In this post I’m going to sketch out some stable methods of computing and storing the parameters of the so-called *look-at* rotation. Everything here is more complicated than standard methods, so for it to be of any interest one or more of the following would need to be desirable:

- Directly generate a quaternion, yaw-pitch-roll or swing-twist decomposition
- Filtering of the result (say temporal smoothing)
- Dealing with regions of instability
- Branch-free *friendly* implementations

This list makes it a pretty narrow thing to want to do, and this is partially intended to show some ways to attack moving an ill conditioned problem to a backward stable one without using any *fancy* floating-point tricks and while attempting to avoid a mess of random branching to work around the edge cases. Backward stable can be roughly defined as “the solution is exact for a nearby input” but here since we have a multi-component input (a unit vector) and multi-component output(s) it’s more: “in the very small neighborhood of the output is an exact solution for a nearby input”.

First let’s define the convention that will be used:

- A right-handed coordinate system where:
  - $\mathbf{x}$ is to the *right*
  - $\mathbf{y}$ is *forward* and
  - $\mathbf{z}$ is *up*.

There are various standard constructions. We might start with a pair of positions for the “entity” in question. Call $\left(p_0\right)$ the entity’s position and $\left(p_1\right)$ a target position for it to face toward. Then the facing unit vector $\left(\mathbf{v}\right)$ is:

\[\mathbf{v} = \left\| p_1 - p_0 \right\|\]

where $\| \cdot \|$ denotes normalizing the contained expression. Or the unit or arbitrary magnitude vector might be directly provided. Most variants will also provide another unit vector parameter $\left(\mathbf{w}\right)$ which specifies the direction the entity considers to be *up*.

Given the convention and inputs we can compute the local *right* unit vector $\left(\mathbf{r} = \mathbf{x}'\right)$ by:

\[\mathbf{r} = \left\| \mathbf{v} \times \mathbf{w} \right\|\]

from which we can compute the local *up* unit vector $\left(\mathbf{u} = \mathbf{z}'\right)$:

\[\mathbf{u} = \mathbf{r} \times \mathbf{v}\]

## Reduced standard method

Let’s look at a standard construction with a reduced problem statement. First we’ll limit the input forward vector $\left(\mathbf{v}\right)$ to unit magnitude:

\[\mathbf{v} \cdot \mathbf{v} = 1\]

Since we can just normalize prior to calling our routine this is minor. The next limitation is a bit bigger. We’re going to fix the parent up to $+\mathbf{z}$. This drastically simplifies the math and the end goal here is to produce a rotation (or decomposed rotation). Expanding with these limitations gives:

\[\begin{align*} \mathbf{r} & = \left\| \mathbf{v} \times \mathbf{z} \right\| = \frac{1}{\sqrt{x^2+y^2}} \left(y,~-x,~0\right) \\ \mathbf{u} & = \mathbf{r} \times \mathbf{v} = \frac{1}{\sqrt{x^2+y^2}} \left(-xz,~-yz,~x^2+y^2\right) \end{align*}\]

Let’s note the key properties:

- we’re producing an orthonormal basis so:
  - $\mathbf{r} \cdot \mathbf{v} = \mathbf{r} \cdot \mathbf{u} = \mathbf{v} \cdot \mathbf{u} = 0$
  - $\mathbf{r} \cdot \mathbf{r} = \mathbf{v} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{u} = 1$

- the $z$ component of $\mathbf{r}$ is zero so it’s bound to the $\mathbf{xy}$-plane and can be expressed as $\mathbf{x}$ rotated about $\mathbf{z}$

The above expressions are written to reflect how the computation is going to be performed. Now a tempting change would be to replace $x^2+y^2$ with $1-z^2$ since $\mathbf{v}$ is unit but

**Danger, Will Robinson!**

we know that our input is only approximately unit and (even if carefully computed) each component will (generally) have small errors. Let’s simplify and assume that:

\[\mathbf{v} \cdot \mathbf{v} = 1\]

exactly and that $y=0$ and $z=1$ then:

\[\mathbf{v} \cdot \mathbf{v} = x^2 + y^2 + z^2 = x^2 + 1 = 1\]

This implies that $x^2$ is too small to contribute to an addition with one so:

\[x^2 \le u\]

where $u$ is the size of the rounding unit $ \left( \frac{1}{2}\text{ulp}(1) \right) $ which is $2^{-24}$ in single precision so:

\[|x| \le 2^{-12} \approx 0.000244\]

which is a *very* different result than zero. Notice that this isn’t a single precision problem; we’re in the same boat in any precision. As an example, since doubles have a wider dynamic range (can represent a wider range of exponents) the issue is more pronounced. Below is a floating point number line on $\left[0,~1\right]$ where the “units” are ULP (so each “step” is the next floating point number we can represent). The top (blue) range shows the range of values that $\sqrt{x^2+y^2}$ might be when $|z|$ is approaching one and the bottom (purple) for doubles.

All of this might seem cursed if it’s unfamiliar but it allows us to represent *unit* vectors very nearly pointing in cardinal directions, and performing the “correct” computation is only very important if the result is highly sensitive to the sub-result. Aside: a previous blog post on quaternion/matrix conversion shows that making these kinds of transforms can drastically increase *relative* error even in well conditioned regions.

Ahem. Anyway. Another issue is when $x$ and $y$ are both zero we have an infinite number of solutions (and we’re ill conditioned on the approach) so we need to add *decision making* for that case. Let’s assume that we start in the default orientation (forward = $\mathbf{y}$ and up = $\mathbf{z}$) and are rotating toward the input, giving us a $\pm \frac{\pi}{2}$ rotation about $\mathbf{x}$ where the sign is that of the input $z$:

For the approach let’s consider a biased version of the above expressions (with $n = x^2+y^2 $):

If $b$ is zero then they reduce to the previous set and if $b=1$:

- $\left(n+b\right) = 1$ if $ n \le u $
- $\left(y+b\right) = 1$ if $ |y| \le u $

so if we conditionally set $b$ to zero or one based on some threshold value of $n$ we create deadzones in the tiny spherical caps around $+\mathbf{z}$ and $-\mathbf{z}$. The threshold should be no smaller than the square root of the smallest normal number, to avoid hard to compute cases, and no larger than $\sqrt{u}$, so that the magnitude of $z$ is effectively one and both biased terms flush to one.

Finally some noteworthy implementation details. We don’t need to carefully compute $n$ as it won’t help much alone and we’ll be pretty tight keeping everything simple. The example code below does use `fma` but this is just for performance reasons (under the assumption that floating point contractions might be disabled). Also we don’t want to perform five divisions (or even a corrected multiply by reciprocal) so I’ll use the scale factor `s = sqrt(1/(n+b))` which has the lowest average error of the simple choices.

## Euler angles Yaw-Pitch (but no Roll)

We can parameterize a unit vector using a spherical coordinate system where the radius is fixed to one and using a *mathematical* style convention of $\phi$ being the angle from $\mathbf{y}$ in the ${\mathbf{xy}}$ plane and $\theta$ the angle from the ${\mathbf{xy}}$ plane:

\[\begin{align*} \phi & = \text{atan}\left(-x,~y\right) \\ \theta & = \text{atan}\left(z,~\sqrt{x^2+y^2}\right) \end{align*}\]

where the arctangent is the two parameter form (`atan2`). We want to compute $\theta$ using `atan2` to avoid NaNs if $|z|$ has crept above one during computations and to get a more accurate measurement (same reasoning as above for computing $n$ as $x^2+y^2$).

If we treat these angles as the *yaw* $\left(\phi\right)$ and *pitch* $\left(\theta\right)$ of the Euler angle (really should be called Tait-Bryan) parameterization commonly used in aviation then the resulting rotation produces our desired local right and up vectors.

So the pitch is a rotation about $\mathbf{x}$ which as a matrix is:

\[R_x\left(\theta\right) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}\]

the yaw is a rotation about $\mathbf{z}$:

\[R_z\left(\phi\right) = \begin{bmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix}\]

composing the two rotations gives:

\[R_z\left(\phi\right) R_x\left(\theta\right) = \begin{bmatrix} \cos\phi & -\sin\phi\cos\theta & \sin\phi\sin\theta \\ \sin\phi & \cos\phi\cos\theta & -\cos\phi\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}\]

and yanking out the columns gives the local basis:

\[\begin{align*} \mathbf{r} & = \left(\cos\phi,~\sin\phi,~0\right) \\ \mathbf{v} & = \left(-\sin\phi\cos\theta,~\cos\phi\cos\theta,~\sin\theta\right) \\ \mathbf{u} & = \left(\sin\phi\sin\theta,~-\cos\phi\sin\theta,~\cos\theta\right) \end{align*}\]

slapping this into toy code gives:

I haven’t bothered with introducing the small cap region since `atan2` will manage to yield reasonable results. But notice the funky inclusion of zeroes in both inputs of the *yaw* `atan2`. This is to normalize the result to zero when both $x$ and $y$ are zeroes, where otherwise it can be any of the following four depending on how the signs shake out: $\pm 0$ or $\pm \pi$. Note that this isn’t branch-free *friendly* because of the arctangent. Also the unit vector input requirement isn’t really needed here, and if that requirement is dropped then the computation of `f` in `yp_to_basis` is cheaper than performing it elsewhere since we’ve already performed the bulk of the work.

Oh! And as an example of how easy it is to get $|z| = 1$ and $n \ne 0$ let’s peek at reconstructing $\mathbf{r}$ with $\phi=\pi$:

```
phi = RN(pi) // 0x1.921fb6p1
r = {RN(cos(phi)), RN(sin(phi)), 0} // {-1, -0x1.777a5cp-24, 0}
```

where `RN` is the default floating-point rounding (round to nearest, ties to even).

## Two complex decomposition (not sorry for the bad pun)

We can take the previous and carry through the inverse trig followed by forward trig operations:

\[\begin{align*} \cos\phi & = \frac{y}{\sqrt{x^2+y^2}} & \sin\phi & = \frac{-x}{\sqrt{x^2+y^2}} \\ \cos\theta & = \sqrt{x^2+y^2} & \sin\theta & = z \end{align*}\]

Of course we have all the terms we needed in the vector construction and get identical expressions if we perform the reconstruction. Vector expressions (without bias) repeated for convenience:

\[\begin{align*} \mathbf{r} & = \frac{1}{\sqrt{x^2+y^2}} \left(y,~-x,~0\right) \\ \mathbf{u} & = \frac{1}{\sqrt{x^2+y^2}} \left(-xz,~-yz,~x^2+y^2\right) \end{align*}\]

Other than showing the equivalence of the *Yaw-Pitch* and vector constructions it seems interesting to show the “same thing” in a different way in case anybody has an “ah ha!” moment. Plus it’s a natural stepping stone to the next section. The section name comes from the fact that we can think of this as a pair of independent unit complex numbers: one in the $\mathbf{xy}$ plane for the *yaw* and the other in the $\mathbf{yz}$ plane for the *pitch*.

But anyway, coded up explicitly this way gives:

## Swing-twist decomposition

The short version of a swing-twist decomposition (click through for more) is that we factor a rotation into a pair of quaternions. The first is a rotation about a fixed axis (here $\mathbf{z}$) and this component is termed the *twist* $\left(T\right)$. The second rotation, the *swing* $\left(S\right)$, is in the orthogonal complement (so some rotation axis in the $\mathbf{xy}$ plane). Here we only have two degrees of freedom and the swing is limited to about $\mathbf{x}$. So the Euler and swing-twist decompositions directly map to one another in our problem. In general the *swing* contains both pitch and roll information. Directly plugging in yaw-pitch info gives:

\[\begin{align*} S & = \cos\left(\tfrac{\theta}{2}\right) + \left(\sin\left(\tfrac{\theta}{2}\right),~0,~0 \right) \\ T & = \cos\left(\tfrac{\phi}{2}\right) + \left(0,~0,~\sin\left(\tfrac{\phi}{2}\right) \right) \end{align*}\]

Recall that in quaternions we need half the angle of the desired rotation so the terms are squared. So what we want is this:

\[\begin{align*} S^2 & = \cos\theta + \left(\sin\theta,~0,~0 \right) \\ T^2 & = \cos\phi + \left(0,~0,~\sin\phi \right) \end{align*}\]

Replacing with the direct terms from the previous section:

\[\begin{align*} S^2 & = \frac{x^2+y^2}{\sqrt{x^2+y^2}} + \left(z,~0,~0 \right) \\ T^2 & = \frac{y}{\sqrt{x^2+y^2}} + \left(0,~0,~\frac{-x}{\sqrt{x^2+y^2}} \right) \end{align*}\]

with some substitutions:

and we need to apply the half-angle identities. Starting with swing:

Equation $\eqref{s0}$ is cheaper at two square roots whereas $\eqref{s1}$ is one and a divide. Here we’ll use the former. Neither has stability problems and in the limit they approach:

The twist expands as:

where $\eqref{t1}$ is the generally more interesting form since we can use an FMA to compute $\left(n+cy\right)$ for both a performance and accuracy win. Note we now have two ill conditioned regions instead of one: like before when both $x$ and $y$ are approaching zero, and additionally when $y$ is approaching $-1$. The $\left|y\right|=1$ cases are:

as a foreshadowing of how I’ll attack negative $y$ which is by argument reduction. The *twist* is a function of a point $p=\left(x,~y\right)$ in the unit disc and we can map that point into the first quadrant $p' = \left(\left|x\right|,~\left|y\right|\right)$, transform $p'$ and reverse the reduction. A visualization of the quadrants:

and after halving the angle with respect to $\mathbf{y}$:

If the input $x$ was negative then we reflected about $\mathbf{y}$ and if $y$ was negative then reflected about $\mathbf{x}$. To restore a negative $y$ input we need to reflect about the diagonal (angle has been halved):

So the first step can be to compute the pair of values $t_w$ and $t_b$ and swap them if $y$ is negative:

If the input $x$ was negative we simply need to reintroduce the sign, and since we needed $-x$ in $\eqref{t1}$ we get (this is an assignment and using an arrow looked ugly to me):

After performing the steps $\eqref{ts}$ and $\eqref{ts2}$ the twist becomes:

Finally we need to tweak the bias application. To get a one in the denominator we need to bias $n$ by $\frac{1}{2}$ when in the deadzone and the $y$ in $t_w$ by one.

## Quaternion

We can get the *direct-to-quaternion* formulation from the swing-twist by performing the composition:

Taking $\eqref{s1}$ and $\eqref{ts3}$, replacing the denominators by $a$ and $b$ respectively and performing the product gives:

and expanding $ab$:

The term inside the root could be expanded but I’ll just leave it as is and compute it and the other two $\left(1+c\right)t$ terms as $t+ct$ using *FMA*.