Toy code can be found: HERE


Prelim the dead horse beating


Skimming the math. Starting with a quaternion $Q=w+\left(x,y,z\right)$ then we can rotate $\mathbf{p}$ by:


and if $Q$ is unit magnitude this reduces to:


If $Q$ with magitude of $s$ is transformed by $\eqref{eq:rsxform}$ then the result is the composition of the rotation and uniform scaling of $s^2$.

To create a matrix we need to apply the rotation to the basis set to form our three equations:

which expanded and reduced gives:



Quaternion to rotation matrix


Sticking to the math convention of column vectors, then we can shove the (not reduced) equations $\eqref{xp} \eqref{yp} \eqref{zp}$ into the columns giving:


and we can reduce the diagonal elements since $Q\cdot Q = 1$ as per $\eqref{xpr} \eqref{ypr} \eqref{zpr}$:


Translating both into C gives:

// not reducing the diagonal. adds ~4 (basic simple scalar) ops
// vs. the standard version.
void quat_to_mat33_ndr(mat33_t* m, quat_t* q)
{
  float x  = q->x, y  = q->y, z  = q->z, w  = q->w;
  float xx = x*x,  yy = y*y,  zz = z*z,  ww = w*w;
  float tx = 2*x,  ty = 2*y,  tz = 2*z;
  float xy = ty*x, xz = tz*x, yz = ty*z;
  float wx = tx*w, wy = ty*w, wz = tz*w;
  float t0 = ww-zz;
  float t1 = xx-yy;
  
  m->m00 = t0+t1;
  m->m11 = t0-t1;
  m->m22 = ww-xx-yy+zz;

  m->m10 = xy+wz; m->m01 = xy-wz;
  m->m20 = xz-wy; m->m02 = xz+wy;
  m->m21 = yz+wx; m->m12 = yz-wx;
}

// commonly seen variant
void quat_to_mat33_std(mat33_t* m, quat_t* q)
{
  float x  = q->x, y  = q->y, z  = q->z, w  = q->w;
  float tx = 2 *x, ty = 2 *y, tz = 2 *z;
  float xx = tx*x, yy = ty*y, zz = tz*z;
  float xy = ty*x, xz = tz*x, yz = ty*z;
  float wx = tx*w, wy = ty*w, wz = tz*w;

  m->m00 = 1.f-(yy+zz); m->m11 = 1.f-(xx+zz); m->m22 = 1.f-(xx+yy);

  m->m10 = xy+wz; m->m01 = xy-wz;
  m->m20 = xz-wy; m->m02 = xz+wy;
  m->m21 = yz+wx; m->m12 = yz-wx;
}


Another aside: In cases where input quaternions are moving away from unity due to compounding errors it is possible to fold the corrective step into the transform instead of renormalizing as a pre-step. As an example:

void quat_to_mat33_nu(mat33_t* m, quat_t* q)
{
  float x  = q->x, y  = q->y, z  = q->z, w  = q->w;
  float xx = x*x,  yy = y*y,  zz = z*z,  ww = w*w;
  float s  = 2.f*recipf(xx+yy+zz+ww);
  float sx = s*x,  sy = s*y,  sz = s*z;
  float xy = sy*x, xz = sz*x, yz = sy*z;
  float wx = sx*w, wy = sy*w, wz = sz*w;

  m->m00 = 1.f-s*(yy+zz); m->m11 = 1.f-s*(xx+zz); m->m22 = 1.f-s*(xx+yy);

  m->m10 = xy+wz; m->m01 = xy-wz;
  m->m20 = xz-wy; m->m02 = xz+wy;
  m->m21 = yz+wx; m->m12 = yz-wx;
}



Rotation matrix to quaternion


Let’s rename the elements of $\eqref{2matndr}$ as:


then the sum and difference of off diagonal pairs give:


and the permutations of trace like functions:


which can be reworked to:


If we know or determine an element has a sufficiently large magnitude we can choose the equation for that element from the second set of equations and three equations from the first set which are products of the that and the remaining elements.

As an example take a special case version where we know that the magnitude of the angle is not greater than $\frac{\pi}{2}$ which in turn ensures that the scalar $(w)$ is the largest element. Then we need $\eqref{4w2}$ from the second set and $\eqref{4wx} \eqref{4wy} \eqref{4wz}$ from the first.


void mat33_2_quat_small(quat_t* q, mat33_t* m)
{
  float m00=m->m00, m01=m->m01, m02=m->m02;
  float m10=m->m10, m11=m->m11, m12=m->m12;
  float m20=m->m20, m21=m->m21, m22=m->m22;
  float t  = 1 + m00 + m11 + m22;  // 4w^2
  float s  = 0.5f*rsqrtf(t);       // 1/(4w)
  quat_set(q, s*(m21-m12), s*(m02-m20), s*(m10-m01), s*t);
}


Standard construction

The standard construction for the general case is based on the observation that since we’re producing a unit quaternion, then the smallest that the largest magnitude component can be is $\frac{1}{2}$. Given that we can simply walk through the components and choose the first that’s at least that large.

void mat33_to_quat_std(quat_t* q, mat33_t* m)
{
  float m00=m->m00, m01=m->m01, m02=m->m02;
  float m10=m->m10, m11=m->m11, m12=m->m12;
  float m20=m->m20, m21=m->m21, m22=m->m22;
  float t1 = m22+m11;
  float t0 = m00+t1;

  if (t0 > 0.f) {
    float w = 1.f+t0, s = 0.5f*rsqrtf(w);
    quat_set(q, s*(m21-m12), s*(m02-m20), s*(m10-m01), s*w); // w
    return;
  }

  t0 = m00-t1;
  
  if (t0 > 0.f) {
    float x = 1.f+t0, s = 0.5f*rsqrtf(x);
    quat_set(q, s*x, s*(m01+m10), s*(m02+m20), s*(m21-m12)); // x
    return;
  }

  t0 = m11-m22; t1 = 1.f-m00;
  
  if (t0 > 0.f) {
    float y = t1+t0, s = 0.5f*rsqrtf(y);
    quat_set(q, s*(m01+m10), s*y, s*(m12+m21), s*(m02-m20)); // y
  }
  else {
    float z = t1-t0, s = 0.5f*rsqrtf(z);
    quat_set(q, s*(m02+m20), s*(m12+m21), s*z, s*(m10-m01)); // z
  }
}


Given uniform random input then the (in code order) probabilities are: $\frac{2}{3},\frac{1}{9},\frac{1}{9},\frac{1}{9} $ The first from $\frac{2}{\pi}\text{acos}\left(\frac{1}{2}\right) = \frac{2}{3}$.


Mike Day’s construction

Insomniac Games (Mike Day) presented a paper1 in 2015 which makes the following observations. The sign of diagonal components imply $\eqref{2matstd}$:


and the permutations of sum/difference of pairs of diagonal elements imply:

They use the above to form a method that insures to choose a sufficiently large element. The following is a slightly massaged version of Day:

void mat33_2_quat_day(quat_t* q, mat33_t* m)
{
  float m00=m->m00, m01=m->m01, m02=m->m02;
  float m10=m->m10, m11=m->m11, m12=m->m12;
  float m20=m->m20, m21=m->m21, m22=m->m22;
  float e0;

  if (m22 >= 0) {
    float a = m00+m11;
    float b = m10-m01;
    float c = 1.f+m22;

    if (a >= 0) { 
      e0 = c + a;
      quat_set(q, m21-m12, m02-m20, b, e0); // w
    }
    else { 
      e0 = c - a;
      quat_set(q, m02+m20, m21+m12, e0, b); // z
    }
  }
  else {
    float a = m00-m11;
    float b = m10+m01;
    float c = 1.f-m22;
    
    if (a >= 0) {
      e0 = c + a;
      quat_set(q, e0, b, m02+m20, m21-m12); // x
    }
    else {
      e0 = c - a;
      quat_set(q, b, e0, m21+m12, m02-m20); // y
    }
  }

  quat_scale(q, 0.5f*rsqrtf(e0));
}


Given uniform random input then the outer and both inner probabilities are: $\frac{1}{2}$.


Angle errors


Let’s examine round-trip errors. We generate a uniform quaternion $A$, convert to a matrix, back to a quaternion $B$ and then find the relative quaternion $X$:


then measure the represented angle, multiple by two (to get 3D rotation measure instead of in $\mathbb{H}$) and convert to degrees:


This choice of measure ignores the magnitudes of $A$ and $B$. The trace names are to-matrix-method/to-quat-method. The x-axis is normalized.


There’s a jump in trace behavior at $\frac{2}{3}$ because our x-axis is a function of $w$ and this the point where $w=\frac{1}{2}$. The forward transform without diagonal reduction has lower peak error, but potentially more interesting is the fact that this method is better at angle-preserving (conformal) than the standard.

Variants you probably don’t care


Branch free Day


The reworking of Day above is to point out the we can formulate a branch-free version. Let’s call a function to isolate the sign bit of a float $sb$ and denote an integer XOR as $\oplus$. The initial branch is selected based on the sign of $m_{22}$. In both cases we’re computing three temp variables $a,b,c$ which can be expressed in terms of that sign: $s_0 = \text{sb}\left(m_{22}\right)$. Then inside the inner four cases we’re computing $e_0$ (the chosen sufficiently large element) which can be expressed in terms of the sign of $a$: $s_1 = \text{sb}\left(a\right)$. ($c$ and $e_0$ can use abs instead). The remaining two elements (unnamed temp expressions) can be formed using $s_1$ and the parity of two sign values: $s_p = s_0 \oplus s_1$. If we rename $b$ to $e_1$, then so far we have:


To complete the component computation we simply need to scale all four by $\frac{1}{2 \sqrt{e_0}}$. This leave placing them in memory, if we has storage order of $\left(x,~y,~z,~w\right)$ and think of our elements as the ordered list: $\left(e_3,~e_2,~e_1,~e_0 \right) $, then we apply the follow two rules (input to the second is renaming the output of the first)

  • if $s_0$ is set $\left(e_3,~e_2,~e_1,~e_0 \right) \rightarrow \left( e_1,~e_0,~e_3,~e_2 \right) $, so flip the first and last two.
  • if $s_p$ is set $\left(e_3,~e_2,~e_1,~e_0 \right) \rightarrow \left( e_2,~e_3,~e_0,~e_1 \right) $, so flip the first and second pairs.

Alternately we can simply compute a value to XOR as below:

void mat33_to_quat_day_bf(quat_t* q, mat33_t* m)
{
  float m00=m->m00, m01=m->m01, m02=m->m02;
  float m10=m->m10, m11=m->m11, m12=m->m12;
  float m20=m->m20, m21=m->m21, m22=m->m22;

  // isolate a sufficently large component
  uint32_t s0 = sb(m22);
  float    a  = m00 + fxor(m11,s0);              // m00 + (m11^s0)
  uint32_t s1 = sb(a);
  uint32_t sp = s0^s1;
  float    e0 = 1.f + fxor(m22,s0) + fxor(a,s1); // 1 + |m22| + |a|
  float    s  = 0.5f*rsqrtf(e0);

  // the other three components
  float    e1 = m10-fxor(m01,s0);                // m10 - (m01^s0)
  float    e2 = m02-fxor(m20,sp);                // m02 - (m02^sp)
  float    e3 = m21-fxor(m12,s1);                // m21 - (m12^s1)

  e0 *= s; e1 *= s; e2 *= s; e3 *= s;            // scaling

  uint32_t id = (s0 >> 30)^(sp >> 31);           // write ordering

  q->f[3^id] = e0;
  q->f[2^id] = e1;
  q->f[1^id] = e2;
  q->f[0^id] = e3;
}


Branch free higher error


Another branch-free possibility is to use the second set of equations to compute the magnitudes of all the elements, choose the scalar to be positive and compute the sign of the bivector components by the first set of equations.

void mat33_2_quat_bf0(quat_t* q, mat33_t* m)
{	
  float m00=m->m00, m01=m->m01, m02=m->m02;
  float m10=m->m10, m11=m->m11, m12=m->m12;
  float m20=m->m20, m21=m->m21, m22=m->m22;

  float a = 1.f+m00;
  float b = 1.f-m00;
  float c = m11+m22;
  float d = m11-m22;

  // abs to prevent errors, could use max against zero
  // or a bias amount as per below.
  float w = fabsf(a+c); // 4w^2
  float x = fabsf(a-c); // 4x^2
  float y = fabsf(b+d); // 4y^2
  float z = fabsf(b-d); // 4z^2

  // assumes rsqrtf(0) = inf, otherwise needs to add a bias
  w *= 0.5f*rsqrtf(w);
  x *= 0.5f*rsqrtf(x);
  y *= 0.5f*rsqrtf(y);
  z *= 0.5f*rsqrtf(z);

  // use first set for the signs
  x = copysignf(x, m21-m12);
  y = copysignf(y, m02-m20);
  z = copysignf(z, m10-m01);

  quat_set(q, x,y,z,w);
}


which as promised results in much higher errors:


Almost branch free


Yet another option is to notice that special case small angle version performs rather well. These plots again are:


So we can formulate a version that has a highly predictable branch that handles the common case and otherwise fall-back.

void mat33_to_quat_bf1(quat_t* q, mat33_t* m)
{
  float m00=m->m00, m01=m->m01, m02=m->m02;
  float m10=m->m10, m11=m->m11, m12=m->m12;
  float m20=m->m20, m21=m->m21, m22=m->m22;
  float t  = 1.f + m00 + m11 + m22;   // 4w^2

  if (t > BF1_CUT) {
    float s  = 0.5f*rsqrtf(t);
    quat_set(q, s*(m21-m12), s*(m02-m20), s*(m10-m01), s*t);
    return;
  }

  // add handling other cases here
}


A lazy table of empirical numbers: some ad-hoc cut points, probability and round-trip errors using the standard method of quat-to-matrix:

BF1_CUT probability rt-error
0.00001 0.998002 0.042664
0.0001 0.993634 0.014877
0.0002 0.990994 0.010233
0.0004 0.987265 0.007769
0.0008 0.981993 0.004991
0.0016 0.974531 0.003587
0.0032 0.963969 0.002935
0.0064 0.949045 0.001946
0.0128 0.927966 0.001367
0.0256 0.898184 0.000975
0.0512 0.856216 0.000753
0.1024 0.797129 0.000528



References and Footnotes

  1. “Converting a Rotation Matrix to a Quaternion”, Mike Day, 2015 (download page