I going to run through some basic quaternion quantization and only considers the worst case situation in terms of error vs. size:

  • We have no context or prediction
  • We have no known probability distributations
  • We only consider quantization/reconstruction (no entropy coding, performace issues, etc)
  • Fixed sized bit-package

Basically we will assume that the quaternions being quantized are from a uniform random source with no restrictions. Or another way they are from an IID point sequence on the 3-sphere. Given this toy problem statement I will consider the important metric be minimal peak magnitude error.

Minimal notation summary: I will write a unit quaternion as:

\[\begin{eqnarray*} Q &=& a + \left(x,~y,~z\right) \\ &=& a + b~\mathbf{u} \\ &=& \cos\left(\theta\right) + \sin\left(\theta\right)~\mathbf{u} \end{eqnarray*}\]

So scalars are lowercase, I put the scalar part first, lowercase bold face for 3D (bi)vectors and I use radian angle measures in $\mathbb{H}$ and not $\mathbb{R}^3$ except all error values which will be in degrees in $\mathbb{R}^3$. I will also assume that the scalar is positive, if not the sign of it will need to be carried through.

The following table is a recap of the Half-angle/Cayley transform post1:

Name Equation $d\left(\theta\right)$ $ d'\left(\theta\right) $ max RT error
identity (projected) $\left<Q\right>_2$ $\sin\left(\theta\right) $ $\cos\left(\theta\right) $ 0.055939
half-angle (projected) $\left<\sqrt{Q}\right>_2$ $\sin\left(\frac{\theta}{2}\right) $ $\frac{1}{2} \cos\left(\frac{\theta}{2}\right) $ 0.000075
Cayley transform $\left(Q-1\right)\left(Q+1\right)^{-1}$ $\tan\left(\frac{\theta}{2}\right) $ $\frac{1}{2} \cos\left(\frac{\theta}{2}\right)^{-2} $ 0.000045
harmonic mean $\left(\sqrt{Q}-1\right)\left(\sqrt{Q}+1\right)^{-1}$ $\tan\left(\frac{\theta}{4}\right) $ $\frac{1}{4} \cos\left(\frac{\theta}{4}\right)^{-2} $ 0.000070
exponential map $\log Q$ $\theta$ 1 0.000087

Where bivector extraction (rank-2) is denoted and expressed algebraically:

\[\left<Q\right>_2 = \frac{1}{2}\left(Q-Q^*\right)\]

All the transforms map a unit quaternion to a point in a 3D ball in the direction of the axis of rotation. I’ll refer to this type of transform as ball maps. The table entries are:

  • Distance from the origin of $\theta$ is $d\left(\theta\right)$, which is the cumlative density.
  • The derivative $ d'\left(\theta\right) $ is the density.
  • max RT error is the maximum round trip error2 (without quantization) measured in $\mathbb{R}^3$ and in degrees.

Toy/proof-of-concept C code: here

Aside: in memory compression 4:3

Other than identity (drop and reconstruct the scalar) any of the ball maps perform pretty well at round tripping between 3 and 4 component representations. Performance of runtime transform(s) and if the reverse can be folded with the next operation are probably more interesting considerations.

Smallest three

One existing method of quantizing quaternions is called smallest three. This does exactly what it says: determine the component with the maximum magnitude, note which one it is and store the other three components (negated if the largest is negative). This side-line information requires two bits of storage. For whatever it’s worth we can think of this geometically as performing one of four isoclinic rotations by $1, \mathbf{x}, \mathbf{y}$ and $\mathbf{z}$.

The shape generated by this mapping in our case is that of a ball of radius $\frac{\sqrt{2}}{2}$ when then the scalar is the largest component (implied angle is on $\left[0,\frac{\pi}{2}\right]$). Beyond that point the scalar becomes one of the encoded values and the largest bivector component is dropped. This in turn fills the ball and then “inflates” slightly beyond. I haven’t bothered to formulate an equation for the surface but roughly speaking if we put an axe aligned bounding box around the ball then slightly inflate within that contraint is roughly the surface. The approximate volume once we’ve scaled up to unit extent is 5.35284 which translates to using ~66.9% of the quantization space vs ~52.4% for the unit ball assuming uniform grid as we’ll start with.

Tait-Bryan ZYX and Euler ZXZ

In a prequel post3 I ran through the basic math and some computation issues wrt to error for both of these system.

An upside to both of these is they use all of the quantization space without any extra effort. The downside is they are computationally expensive for both encode and decode. Notiably (as presented) converting to ZYX requires two two parameter atans and one one parameter and produces:

\[\theta \in \left[-\frac{\pi}{2},\frac{\pi}{2}\right] \\ \phi,\psi \in \left[-\pi,\pi\right]\]

Converting to ZXZ is slightly nicer only requiring one two parameter atan and two single parameter. However the formulation that I gave in that post does not produce minimal range angles, specifically returning:

\[\beta \in \left[0, \pi \right] \\ \alpha,\gamma \in \left[-\frac{3\pi}{2},\frac{3\pi}{2}\right]\]

So $\alpha$ and $\gamma$ need to be reduced to $\left[-\pi,\pi\right]$. Reducing the values to range can be folded with the argument reduction of the two parameter atan, which I’ll ignore for the moment.

If either of these methods are determined to be of interest we can examine its error performance and potentially replace the arctangents and/or forward trig operations with lower accurancy approximations.


The next few sections are simply a recap of the transforms in the half-angle/Cayley post combined with scaling to the unit ball. So you might want to simply skip to “Example configuration and results” section.

Basic half-angle $\left<\sqrt{Q}\right>_2$

Recall if we take the square-root of a unit quaternion ($Q$) we half its implied angle and we can reverse this by squaring. By choosing $Q$ with a zero or positive scalar this insures that the scalar ($w$) is the largest component so it can be dropped. The result is a radius $\sin\left(\frac{\theta_m}{2}\right)$ ball which in our case is $\frac{1}{\sqrt{2}}$. We can rescaling the equation so we are mapping to the unit ball.

The forward and inverse transform pairs then become:

\[Q = w + \left(x,~y,~z\right) \rightarrow \frac{1}{\sqrt{1+w}} \left(x,~y,~z\right) = p\] \[p = \left(x,~y,~z\right) \rightarrow \left(1-p \cdot p \right) + \sqrt{2-p \cdot p} \left(x,~y,~z\right) = Q\]

GLSL code snippets for the forward and inverse transforms (assuming $w$ is already positive)4:

vec3 quat_fha(vec4 q)
  float s = inversesqrt(1.0+q.w);
  return s*;

vec4 quat_iha(vec3 p)
  float d = dot(p,p);
  float s = sqrt(2.0-d);
  return vec4(s*, 1.0-d);

Basic Cayley $\left(Q-1\right)\left(Q+1\right)^{-1}$

The Cayley transform maps a unit quaternion with a zero/positive scalar to a ball with radius $\tan\left(\frac{\theta_m}{2}\right)$ which is the unit ball in the case we are considering. The forward and inverse transforms are:

\[Q = w + \left(x,~y,~z\right) \rightarrow \frac{1}{1+w} \left(x,~y,~z\right) = p\] \[p = \left(x,~y,~z\right) \rightarrow \left(1-\frac{2}{1+p \cdot p}\right) + \frac{2}{1+p \cdot p} \left(x,~y,~z\right) = Q\]

GLSL code snippets for the forward and inverse transforms (assuming $w$ is already positive)4:

vec3 quat_fct(vec4 q)
  float s = 1.0/(1.0+q.w);
  return s*;

vec4 quat_ict(vec3 v)
  float s = 2.0/(1.0+dot(v,v));
  return vec4(s*, 1.0-s);

Basic harmonic mean $\left(\sqrt{Q}-1\right)\left(\sqrt{Q}+1\right)^{-1}$

The harmonic mean is the compostion of half-angle followed by Cayley. This produces a radius $\tan\left(\frac{\theta_m}{4}\right)$ ball which in our case is: $\sqrt{2}-1$.

Rescaling to the unit ball gives the forward and inverse transform pairs:

\[Q = w + \left(x,~y,~z\right) \rightarrow \frac{1+\sqrt{2}}{1+w+\sqrt{2+2w}} \left(x,~y,~z\right) = p\] \[p = \left(x,~y,~z\right) \rightarrow \frac{1-6a+a^2}{\left(1+a\right)^2} + \frac{4\left(\sqrt{2}-1\right)\left(1-a\right)}{\left(1+a\right)^2}\left(x,~y,~z\right) = Q\]


\[\begin{eqnarray*} a &=& \left(3-2\sqrt{2}\right)p \cdot p \end{eqnarray*}\]

GLSL code snippets for the forward and inverse transforms (assuming $w$ is already positive)4:

const float Km  = 4.0*(0.4142135679721832275390625); // 4(sqrt(2)-1)
const float Khf = 2.414213657379150390625;           // sqrt(2)+1 = 1/(sqrt(2)-1)
const float Khi = 0.17157287895679473876953125;      // 3-2sqrt(2)

vec3 quat_fhm(vec4 q)
  float s = Khf/(1.0+q.w+sqrt(2.0+2.0*q.w));
  return s*;

vec4 quat_ihm(vec3 v)
  float d = Khi*dot(v,v);
  float a = (1.0+d);
  float b = (1.0-d)*Km;
  float c = 1.0/(a*a);
  vec4  q; = v*b*c;
  q.w   = (1.0+d*(d-6.0))*c;

Basic exponential map $\log Q$

Taking the logarithm produces a $\theta_m$ radius ball, so in our case we have $\frac{\pi}{2}$.

Rescaling to the unit ball gives the forward and inverse transform pairs:

\[Q = w + \left(x,~y,~z\right) \rightarrow \frac{\frac{2}{\pi}~ \text{atan}\left(\frac{\sqrt{1-w^2}}{w}\right)}{\sqrt{1-w^2}} \left(x,~y,~z\right) = p\] \[p = \left(x,~y,~z\right) \rightarrow \cos \left(\frac{\pi}{2} \sqrt{p \cdot p}\right) + \frac{\sin\left( \frac{\pi}{2} \sqrt{p \cdot p} \right)}{ \frac{\pi}{2} \sqrt{p \cdot p}} \left(x,~y,~z\right) = Q\]

GLSL code snippets for the forward and inverse transforms (assuming $w$ is already positive)4:

vec3 quat_fem(vec4 q)
  float w = q.w;
  float a = 1.0-w*w;
  float b = inversesqrt(a+EPS);
  float k = a*b;
  float s = (2.0/PI)*atan(k/w)*b;
  return s*v;

vec4 quat_iem(vec3 v)
  float d = dot(v,v);
  float t = inversesqrt(d+EPS);
  float a = (PI/2.0)*t*d;
  float s = sin(a);
  float k = s*t;
  return vec4(k*, cos(a));

Example configuration and results

Now let’s look at some basic quantization with:

  • allowing an arbitrary number of bits in the fixed package (not requiring each component to have equal number of bits)
  • linear quantized with round to negative infinity for quant and center reconstruct for dequant. This choice means we cannot exactly represent zero and therefore identity. I’m not considering this a problem since this post isn’t useful for relative/update information.
  • on the standard integer lattice (Cartesian/uniform grid)

Let’s examine how the maximum error globally behaves as we add bits to each transform, starting from 9 up to 32 bits. If we have a $n$ bit package and $n \bmod 3 = 1$ then the last component gets an extra bit and if $n \bmod 3 = 2$ then the last two get an extra bit each. As a log plot we get:

The overall is that this follows the reasonable expectation that the error should approximately linearly decrease in a log-plot since adding a bit doubles the number of representable values. We can also see that ZYX and ZXZ are very close with ZXZ always the slightly better. Since quaternion to ZXZ can be formulated with lower runtime cost than ZYX we can stop considering ZYX. If we examine the methods with the lowest error at each bit size we will notice a pattern emerges. A table of the top three:

$n \bmod 3$ best second third
0 exp-map harmonic-mean ZXZ
1 exp-map ZXZ harmonic-mean
2 ZXZ exp-map harmonic-mean

Now let’s take the previous graph and view the relative effectiveness of adding a bit $\left(\frac{\text{error}\left(n+1\right)}{\text{error}\left(n\right)}\right)$:

This is a bit of a mess because half-angle isn’t well behaved here. Click on it in the legend to turn off that trace. Now the graph is pretty much two sets of traces: ZXZ vs. the ball maps and the $n \bmod 3$ behavior is clear.

$n \bmod 3$ maps ZXZ
0 0.7071 0.9220
1 0.8660 0.7670
2 0.8165 0.7071

For the ball maps all three components are equally important so the largest improvement is when the points form a cube (mod is zero). ZXZ has one angle with half the range so it’s best is when mod is 2. Adding three bits halves the error for both types.

Although the toy problem statement only cares about global peak error let’s go ahead and examine error as a function of rotation angle. If we re-examine the density function from Half-angle/Cayley post we expect that higher density should result in lower errors:

Now a sequence of error vs rotation angle plots for each mod n behavior:

These plots still don’t show the whole picture. The part we are ignoring (for ball maps) is the impact of the direction of the bivector (axis of rotation). Some things to note:

  • error plots of the ball maps are inversely related to density as expected and they behave the same regardless of bit-size
  • ZYZ in it’s various position in the top three
  • I’m showing identity map projected to ball which is useless for full range input. The point of including it is to compare with smallest-three. Specifically in the 32-bit case there’s only mimimal impact to error on the range where they are the same transform. So sideline information can be effective espectially when they are replacing what would otherwise be less effective given the bit allocation.

Ideally I would now show some 3D density (heat-map) plots on the unit ball with a cut-out for the various ball maps. I’m too lazy to do that ATM, so instead let’s look at best and worst case directions. The issue is our quantization space is currently a cube with a volume of eight, we’re mapping to a unit ball (volume $\frac{4}{3}\pi \approx 4.189$) so we are wasting $1-\frac{\pi}{6}$ or approximately 47.6% of the space. If we have $n$ sample in one extent, then that’s how many we’ll pass for an axial aligned rotation and only $\frac{1}{\sqrt{3}}n$ if the axis is along a diagonal (and they’re farther apart). A 12-bit encoding of the exponential map clearly shows:

For ball maps with 30 bit encoding:

Note the behaviour of the half-angle transform (click away the others). We can leave aligned quantization points alone and improve along the diagonal by adding radial stretching to the encode/decode. The global, along x and diagonal now look like:

The error along the diagonal has been overkill reduced. It seems likely some tweaks could produce a better global result by moving some of the point away from the diagonal toward higher peak error directions, but I’m not going to make the effort without looking at some density plots. Let’s look at half-angle plus radial stretch vs. the previous top performing methods at 30, 31 and 32-bit packages:

So half-angle + radial stretch takes over the top spot for 30-bit, is very close second at 31-bits and loses it completely for 32-bit packages. Recall that half-angle isn’t well behaved at $n \bmod 3$ effectiveness, so these different package size choice will vary.

Smallest-three encoded with 32-bits shows that someone motivated could probably give it a nice bump in a similar manner.

Breaking point

That’s enough for one post. I do have some more bits and pieces on this topic before moving to context and prediction based.

References and Footnotes

  1. “Quaternion half/double angle and Cayley transforms” (local post

  2. Measured using the toy C implemenations. 

  3. “Converting to Euler & Tait-Bryan” (local post

  4. Otherwise replace $w$ by $\abs{w}$ and add $\text{sgn}\left(w\right)$ to the final scale value. In this case the inverse transform returns $-Q$ when $w$ is negative.  2 3 4


© 2023 Marc B. Reynolds
all original content is public domain under UNLICENSE