In this post I’m going to sketch out some stable methods of computing and storing the parameters of the so-called look-at rotation. Everything here is more complicated than standard methods, so for it to be of any interest one or more of the following would need to be desirable:
- Directly generate a quaternion, yaw-pitch-roll or swing-twist decomposition
- Filtering of the result (say temporal smoothing)
- Dealing with regions of instability
- Branch-free friendly implementations
This list makes it a a pretty narrow thing to want to do and this is partially intended to show some ways to attack moving an ill conditioned problem to a backward stable one without using any fancy floating-point tricks and attempting to avoid a mess of random branching to work around. Backward stable can be roughly defined as “the solution is exact for a nearby input” but here since we have a multi-component input (a unit vector) and multi-component output(s) so it’s more: “in the very small neighboorhood of the output is an exact solution for a nearby input”.
First let’s define the convention that will be used:
- A right-handed coordinate system with
is to the right is forward and is up.
There are various standard constructions. We might start with a pair of positions for the “entity” in question. Call
where
Given the convention and inputs we can compute the local right unit vector
from which we can compute the local up unit vector
Reduced standard method
Let’s look at a standard construction with a reduced problem statement. First we’ll limit the input forward vector
Since we can just normalize prior to calling our routine this is minor. The next limitation is bit bigger. We’re going to fix the parent up to
Let’s note the key properties:
- we’re producing an orthonormal basis so:
- the
component of is zero so it’s bound to the -plane and can be expressed as rotated about
The above expressions are written to reflect how the computation is going to be performed. Now a tempting change would be to replace
Danger, Will Robinson!
we know that our input is only approximately unit and (even if carefully computed) each component will (generally) have small errors. Let’s simplify and assume that:
exactly and that
This implies that
where
which is a very different result that zero. Notice that this isn’t a single precision problem and we’re in the same boat in any precision. As an example since doubles have a wider dynamic range (can represent a wider range of exponents) the issue is more pronounced. Below is a floating point number line on
All of this might seems cursed if it’s unfamiliar but it allows us to represent unit vectors very nearly pointing in cardinal directions and performing the “correct” computation is only very important if the result is highly sensitive to the sub-result. Aside: a previous blog post on quaternion/matrix conversion shows that making these kinds of transforms can drastically increase relative error even in well conditioned regions.
Ahem. Anyway. Another issue is when
For the approach let’s consider a biased version of the above expressions (with
If
if if
so if we conditionally set
Finally some noteworthy implementation details. We don’t need to carefully compute fma
but this is just for performance reasons (under the assumption that floating point contractions might be disable). Also we don’t want to perform five divisions (or even corrected multiply by reciprocal) so I’ll use the scale factor s = sqrt(1/(n+b)
which has the lowest average error of the simple choices.
typedef struct { float x,y,z; } vec3_t;
typedef struct { vec3_t f,r,u; } basis_t;
// Given unit vector 'v' compute local forward (f), right (r)
// and up (u) unit vectors (an orthonormal basis set) with
// the convention of up=Z, right=X & forward=Y. A look-at
// with the input up fixed to Z.
void lookat_std(basis_t* basis, vec3_t v)
{
// choice of the "small" cap around |z| = 1
// to treat as a deadzone to handle exactly n=0
// case as well as hard to compute cases. This
// is about as small as it can get without
// needing to move to careful computations.
// |z| should be 1 all the way up to
// sqrt(ulp(1)/2) = 0x1.0p-12 in binary32.
// So practical choices can be somewhat larger
// than this. However it should be no larger
// than ulp(1)/2 = 0x1.0p-24 in binary32 to
// keep both (y+b) and (n+b) resulting in 1
// when b=1.
static const float cap = 0x1.0p-63f; // sqrt(min_normal)
vec3_t u,r;
// n doesn't need to be carefully computed and could
// be performed as x*x + y*y. the 'fma' is for performance
// and assumption that fp contraction could be disabled.
float n = fmaf(v.x,v.x, v.y*v.y);
float b = n > cap ? 0.f : 1.f;
// proceed as normal (plus add the bias)
float y = v.y + b;
float s = sqrtf(1.f/(n+b));
float sx = -v.x*s;
float sy = y*s;
// std value : deadzone value
r.x = sy; // y/sqrt(n) : 1
r.y = sx; // -x/sqrt(n) : -x {~0}
r.z = 0.f; // 0 : 0
u.x = sx*v.z; // -xz/sqrt(n) : -x {~0}
u.y = -sy*v.z; // -yz/sqrt(n) : -1
u.z = s*n; // n/sqrt(n) : n {~0}
basis->f = v; // copy just to provide a std structure
basis->u = u;
basis->r = r;
}
Euler angles Yaw-Pitch (but no Roll)
We can parameterize a unit vector using a spherical coordinate system where the radius is fixed to one and using a mathematical style convention of
where the arctangent is the two parameter form (atan2
). We want to compute atan2
to avoid NaNs if
If we treat these angles as the yaw
So the pitch is a rotation about
the yaw is a rotation about
composing the two rotations gives:
and yanking out the columns give the local basis:
slapping this into toy code gives:
typedef struct { float yaw, pitch; } lookat_yp_t;
lookat_yp_t lookat_yp(vec3_t v)
{
lookat_yp_t r;
r.pitch = atan2f( v.z, sqrtf(fmaf(v.x,v.x,v.y*v.y)));
r.yaw = atan2f(0.f-v.x, v.y+0.f);
return r;
}
basis_t yp_to_basis(lookat_yp_t yp)
{
float cp = cosf(yp.pitch), sp = sinf(yp.pitch);
float cy = cosf(yp.yaw), sy = sinf(yp.yaw);
// 'f' reconstruction can be skipped if we have original 'v' and we
// keep the restriction that 'v' must be (approximately) unit.
vec3_t r = (vec3_t){.x= cy, .y= sy, .z=0.f };
vec3_t f = (vec3_t){.x=-sy*cp, .y= cy*cp, .z=sp };
vec3_t u = (vec3_t){.x= sy*sp, .y=-cy*sp, .z=cp };
return (basis_t){.r=r, .f=f, .u=u };
}
I haven’t bothered with introducing the small cap region since atan2
will manange to yield reasonable results. But notice the funky inclusion of zeroes in both inputs of the yaw atan2
. This is to normalize the result to zero when both f
in yp_to_basis
is cheaper than performing it elsewhere. We’ve already performed the bulk of the work.
Oh! And as an example of how easy it is to get
phi = RN(pi) // 0x1.921fb6p1
r = {RN(cos(phi)), RN(sin(phi)), 0} // {-1, -0x1.777a5cp-24, 0}
where RN
is the default floating-point rounding (round to nearest, ties to even).
Two complex decomposition not sorry for the bad pun
We can take the previous and carry through the inverse trig followed by forward trig operations:
Of course we have all the terms we needed in the vector construction and get identical expressions if we perform the
reconstruction. Vector expressions (without bias) repeated for convenience:
Other than showing the equivalence of the Yaw-Pitch and vector constructions it seems interesting to show the “same thing” in a different way in case anybody has an “ah ha!” moment. Plus it’s a natural stepping stone to the next section. The section name comes from we can think of this as a pair of independent unit complex numbers. One for the
But anyway coded up explictly this way gives:
typedef struct {
float cp,sp; // cos(pitch), sin(pitch)
float cy,sy; // cos(yaw), sin(yaw)
} lookat_tc_t;
lookat_tc_t lookat_tc(vec3_t v)
{
lookat_tc_t r;
// SEE: lookat_std
static const float cap = 0x1.0p-63f;
float n = fmaf(v.x,v.x, v.y*v.y);
float b = n > cap ? 0.f : 1.f;
float s = sqrtf(1.f/(n+b));
float y = v.y + b;
r.cp = n * s;
r.sp = v.z;
r.cy = y * s;
r.sy = -v.x * s;
return r;
}
basis_t tc_to_basis(lookat_tc_t tc)
{
// 'f' reconstruction can be skipped if we have original 'v'
vec3_t r = (vec3_t){.x= tc.cy, .y= tc.sy, .z=0.f };
vec3_t f = (vec3_t){.x=-tc.sy*tc.cp, .y= tc.cy*tc.cp, .z=tc.sp };
vec3_t u = (vec3_t){.x= tc.sy*tc.sp, .y=-tc.cy*tc.sp, .z=tc.cp };
return (basis_t){.r=r, .f=f, .u=u };
}
Swing-twist decomposition
The short version of a swing-twist decomposition decomposition (click through for more) is that we factor a rotation into a pair of quaternions. The first is a rotation about a fixed axis (here
Recall that in quaternions we need half the angle of the desired rotation so the terms are squared. So what we want is this:
Replacing with the direct terms from the previous section:
with some substitutions:
and we need to apply the half-angle indentities. Starting with swing:
Equation
The twist expands as:
where
as a foreshadowing of how I’ll attack negative
and after halving the angle with respect to
If the input
So the first step can be to compute pair values
If the input
After performing the steps
Finally we need to tweak the bias application. To get a one in the denominator we need to bias
// swing-twist decomposition of look-at
// specialized version for limited swing/twist
// components
typedef struct {
float sb,sw; // sw + sb(1,0,0)
float tb,tw; // tw + tb(0,0,1)
} st_lookat_t;
st_lookat_t lookat_st(vec3_t v)
{
static const float cap = 0x1.0p-63f; // sqrt(min_normal)
st_lookat_t st;
float n = fmaf(v.y,v.y, v.x*v.x);
float c = sqrtf(n);
// swing:
#if 1
// two square-root version
st.sw = sqrtf(0.5f*(1.f+c));
st.sb = copysignf(sqrtf(0.5f*(1.f-c)),v.z);
#else
// divide & square-root version
float rs = sqrtf(0.5f/(1.f+c));
st.sw = fmaf(rs,c,rs);
st.sb = rs * v.z;
#endif
// twist: reduce to first quandrant
float b = n > cap ? 0.f : 0.5f; // bias if inside deadzone
float y = fabsf(v.y) + b + b;
float x = fabsf(v.x);
float sx = f32_xor(-v.x,x); // -sgn(x) = sgn(-x) { for copysignf removal }
float t2 = fmaf(c,y,n+b); // n+cy
float ts = sqrtf(0.5f/t2); // 1/sqrt(2(n+cy))
st.tw = ts * (c+y); // (c+y)/sqrt(2(n+cy))
st.tb = ts * x; // x/sqrt(2(n+cy))
// reflect about diagonal if y is negative
if (v.y < 0.f) { float t = st.tb; st.tb = st.tw; st.tw = t; }
// reintroduce sign of x (reflect about y)
//st.tb = copysignf(st.tb, -v.x); // this or following line
st.tb = f32_xor(st.tb, sx); // see below
return st;
}
// specialized swing/twist to quat
quat_t st_to_quat(st_lookat_t st)
{
quat_t q;
q.x = st.sb * st.tw;
q.y = st.sb * st.tb;
q.z = st.sw * st.tb;
q.w = st.sw * st.tw;
return q;
}
Quaternion
We can get the direct-to-quaternion formulation from the swing-twist by performing the composition:
Taking
and expanding
The term inside the root could be expanded but I’ll just leave as is and compute it and the other two
quat_t lookat_quat(vec3_t v)
{
static const float cap = 0x1.0p-63f; // sqrt(min_normal)
// for twist we need point in disc: p = (x,y)
float n = fmaf(v.y,v.y, v.x*v.x); // n = dot(p,p)
float c = sqrtf(n);
float b = n > cap ? 0.f : 0.5f;
// twist: reduce to first quandrant
float x = fabsf(v.x);
float y = fabsf(v.y) + b + b;
//float sx = f32_xor(-v.x,x); // -sgn(x) = sgn(-x)
float t = fmaf(c,y,n+b); // n+cy
float s = sqrtf(0.25f / fmaf(c,t,t)); // 1/(2 sqrt((1+c)(n+cy)))
float tw = fmaf(s,y,s*c);
float tb = s*x;
// reflect about diagonal if y is negative
if (v.y < 0.f) { float t = tb; tb = tw; tw = t; }
// reintroduce sign of x (reflect about y)
// use copysignf version this time
//tb = f32_xor(tb, sx);
tb = copysignf(tb, -v.x);
// compose the result
quat_t q;
q.x = v.z * tw; // z tw
q.y = v.z * tb; // z tb
q.z = fmaf(c,tb,tb); // (1+c)tb
q.w = fmaf(c,tw,tw); // (1+c)tw
return q;
}