You know if someone had asked me this question a few months ago I would have responded that it’s was very unlikely and my facial expression would have looked like the person was eating huge globs of mayonnaise straight out of the jar. Let’s jump backward and ballpark some defs:

  • An involution is a function that’s its own inverse: $f\left(f\left(x\right)\right) = f^2\left(x\right) = x$
  • A bijection is a function that has an inverse: $f^{-1}\left(f\left(x\right)\right) = f\left(f^{-1}\left(x\right)\right) = x$

where the $n$ in $f^n$ is denoting $n$ iterations of $f$. And I’m using “bit finalizer” to mean a bit mixing function like that typically used as the final step of a hash function. I’m going to be using the terminology and heat map figures from the previous post “Comments on the Avalanche Effect”.

Now back to the beginning. I have a couple of issues with involutions as bit finalizers. The first is there aren’t many problems that need such a thing (roughly encoding and decoding are the same thing). The second issue it’s more computational work to hit reasonable properties but less that I would have previously thought so here we are…me typing.

Code for this post can be found here and for quick eye-balling of functions on godbolt.


The set up


What got me thinking about this problem was a question on twitter by paniq where he wanted to pack a 2D coordinate in a integer, perform an involution and unpack back to 2D. This would allow to “statelessly” swap the data between pairs of coordinates for an entire grid in a random manner. My suggested attempt at a solution was something like this:

// bad idea #1
uint32_t g0(uint32_t x)
{
  if (x & 1) {
    // call this f(x)
    x *= 0xac564b05;
    x += 0x85ebca77;
  }
  else {
    // and this f^{-1}(x)
    x -= 0x85ebca77;
    x *= 0xdc33c9cd;
  }
  return x;
}


This is a pair of bijections (one’s the inverse of the other) in the same form as a LCG. These have the know property that they map odd to even and even to odd. By inspecting the value of the low bit (determine if odd or even) we can choose $f\left(x\right)$ for odd and $f^{-1}\left(x\right)$ for even inputs and this composite function is an involution. The “problem” of paniq has two additional wrinkles. First we functions that work on the same size of the texture being manipulated and small sized make finding a good bit mixer harder. As an example if the texture were 256x256 we’d need an involution for a 16-bit integer. The second trick is the function needs to be parameterized so this swapping operation can been performed differently in some number of way…all of which need to be somewhat independent. These move the problem out of the territory of “fun little thing to think about” to “that sounds like a lot of work”. So I’m going to blissfully ignore these extra requirements.

The visualization of the strict avalanche criterion (SAC) bias of the above function:


The left hand is sampled using Sobol and right hand a pseudo-random sequence.



The plot thickens


Harold Aptroot spiced things up by noting that a similarity transform of an involution is an involution. Sweet! This is something I can work with. Let’s run through the justification in painful details using a matrix like notation where $A$ is a bijection and $B$ an involution:

\[\begin{align*} \left( ABA^{-1}\right)^2 & = \left( ABA^{-1} \right) \left( ABA^{-1} \right) \\ & = AB \left( A^{-1}A \right) BA^{-1} \\ & = ABBA^{-1} \\ & = AB^2A^{-1} \\ & = AA^{-1} \\ & = I \end{align*}\]

Bullet point it time:

  • We’re saying some function $ABA^{-1}$ is an involution which has a period of 2 so squaring it should yield the identity $I$. Start expanding.
  • $A$ is a bijection so it has an inverse $A^{-1}$, multipled together gives $I$ which reduces to a 70s Swedish pop group.
  • $B$ is an involution, squaring it reduces to $I$ and finally $A$ and it’s inverse cancel each other out.



Okay..but does it do anything?


The rest of this is just a quick feasibility study to see if we can construct an involution that is competitive against standard methods of bit mixing both in terms of statistical measures and runtime costs. It is pretty much a dump of what I walked through to get a feel for the answer. No real search methods were used..just me hacking stuff and seeing what happened.

First we need a fast involution to be transformed and hopefully one that performs some bit mixing as well. The most obvious choice (at least to me) is a right xorshift by at least half the bit-width of the register. Since we’re using a bit operation $\left(\mathbb{F}_2\right)$ for the involution I want to choose a modulo integer operation for the bijection (alternating fields between each logical operation). So about the cheapest thing we can do here is multiply by an odd integer (the inverse of which can be found by mod inverse). So I simply grabbed an odd constant from L’Ecuyer’s table paper and tweaked the xorshift amount until the bias was low.

#define M0 0x5f356495
#define M1 0x32c446bd

uint32_t f0(uint32_t x)
{
  x *= M0;
  x ^= (x>>25);
  x *= M1;
  return x;
}


This is doing suprisingly well for a short sequence involution. And again the above constants are going to be sub-optimal as are all the functions that follow.

Instead of beefing up the bijection I wanted to see how changing the involution to an xor-rotate would play out. First we need some helper functions:

inline uint32_t rot(uint32_t x, uint32_t i)
{
#if defined(__clang__)
  return __builtin_rotateleft32(x,i);
#elif defined(_MSC_VER)
  return _rotl(x,i);
#else
  return (x << i) | (x >> (-i & 31));
#endif
}

// I + C^a + C^b
inline uint32_t xor_rot2(uint32_t x, uint32_t a, uint32_t b)
{
  return x^rot(x,a)^rot(x,b);
}


I kept the multiplicative constants the same and tweaked the xor-rotate choice:

uint32_t f1(uint32_t x)
{
  x *= M0;
  x = xor_rot2(x,6,22);
  x *= M1;
  return x;
}


This was a nice little jump, we’re still on the cheap side and at this point I’d seen about enough. To kill off the bias on the low bits I just added (didn’t bother to tweak) right xorshift by half the bitwidth.

uint32_t f2(uint32_t x)
{
  x ^= x >> 16;
  x *= M0;
  x = xor_rot2(x,6,22);
  x *= M1;
  x ^= x >> 16;
  return x;
}


At this point we’re looking good and we’re also at about exactly the same runtime cost as current standard methods that are structured like this:

// 32-bit MurmurHash3, xxHash et al. look like this:
uint32_t common_bijective_finalizers(uint32_t x)
{
  x ^= x >> S0; x *= K0;
  x ^= x >> S1; x *= K1;
  x ^= x >> S2;
  return x;
}


As a final “I want to see what happens” test I swapped out the outer xorshift with an xor-rotate randomly chosen from those that have a period of 32 and a 3 rotate inverse:

// C^a + C^b + C^c
inline uint32_t xor_rot3(uint32_t x, uint32_t a, uint32_t b, uint32_t c)
{
  return rot(x,a)^rot(x,b)^rot(x,c);
}

uint32_t f3(uint32_t x)
{
  x = xor_rot2(x,11,16); x *= M0;
  x = xor_rot2(x, 6,22); x *= M1;
  x = xor_rot3(x,10,21,26);
  return x;
}



Crush-ing some numbers


This section spews out the results you’d see from running the example code. It measures SAC using three different sampling methods:

  1. $\text{cn}$: counting numbers (well programmer version thereof aka starting from zero) for low entropy input.
  2. $\text{ss}$: Sobol sequence to get good coverage of input space.
  3. $\text{he}$: random numbers for high entropy input.


Each set is sampled $2^{23}$ times. The max bias measured as described in the Avalanche Effect post then multipled by 100 to give a percent bias of the worst case. Also a goodness-of-fit (GOF) test is performed to get a global estimate. Specifically Pearson’s chi-squared is computed, takes the square root and multiplied by 100.

Additionally three batteries of TestU01 are run on the input. Each hash function is treated like a random number generator where we walk through counting numbers and apply the hash.

  1. Rabbit: computes 33 statistics from 26 base tests drawing approximately a user defined number of sample (specified in bits). My runs draw 1024 samples of the function so is setup for 1024*32 bits.
  2. SmallCrush: computes 15 statistics from 10 base tests drawing approximately $51,320,000$ samples
  3. Crush: computes 144 statistics from 96 base tests drawing approximately $2^{35}$ samples

The reported numbers are how many statistics of the battery failed.

For reference I’ve include 32-bit finalizer of MurmurHash3, xxHash and a pair by Chris Wellons. All except triple32 are two xorshift-multiplies followed by an xorshift and triple32 adds an additional xorshift-multiply.

Clicking on the headers sorts the data.


hash % max bias (cn) % max bias (ss) % max bias (he) GOF (cn) GOF (ss) GOF (he) Rabbit SmallCrush Crush
murmur3 0.229263 0.518417 0.207162 0.052966 0.092238 0.043021 1 2 24
xxhash32 0.377083 0.579166 0.433731 0.069322 0.090209 0.066725 1 2 39
triple32 0.135088 0.156140 0.130367 0.044136 0.045361 0.034147 1 0 5
lowbias 0.169849 0.266051 0.122666 0.047634 0.068301 0.040458 2 2 44
f2 0.409937 0.393176 0.398207 0.054149 0.070380 0.045054 1 0 13
f3 0.591612 0.445747 0.516152 0.056496 0.052190 0.045781 1 0 3


The three inital functions are failures but here’s their data as well (minus Crush since it’s a waste of time).

hash % max bias (cn) % max bias (ss) % max bias (he) GOF (cn) GOF (ss) GOF (he) Rabbit SmallCrush
g0 100.000000 100.000000 100.000000 76.090304 82.050457 76.090831 7 15
f0 100.000000 100.000000 100.000000 23.667056 38.865972 23.663481 8 14
f1 100.000000 100.000000 100.000000 20.454587 20.904367 20.456410 1 7



Conclusions


I normally don’t have a conclusion section. I talk some trash, you can dig through with a stick and decide if there’s anything you want to take home to try out. In this post I goofing around abit so I’m going to toss out my take-a-way.

  • The answer to the title is yes.


Bonus takeaways:

  • xor-rotates might be worth investigating for various bit mixing operations
  • Chris Wellons’ finalizers perform rather well



Crush output …so you don’t have to


Running the Crush battery takes about an hour so to save you the PITA here is what you would get:

========= Summary results of Crush =========

 Version:          TestU01 1.2.3
 Generator:        f1
 Number of statistics:  144
 Total CPU time:   00:26:31.17
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  1  SerialOver, t = 2                eps  
  2  SerialOver, t = 4               9.4e-4
  3  CollisionOver, t = 2           6.2e-11
  4  CollisionOver, t = 2             eps  
  6  CollisionOver, t = 4             eps  
  8  CollisionOver, t = 8             eps  
 10  CollisionOver, t = 20            eps  
 11  BirthdaySpacings, t = 2          eps  
 13  BirthdaySpacings, t = 4       2.4e-227
 15  BirthdaySpacings, t = 7          eps  
 16  BirthdaySpacings, t = 8          eps  
 17  BirthdaySpacings, t = 8          eps  
 18  ClosePairs NP, t = 2            6.7e-4
 18  ClosePairs mNP1, t = 2          2.9e-6
 18  ClosePairs mNP2, t = 2          7.3e-5
 18  ClosePairs NJumps, t = 2        2.7e-5
 23  SimpPoker, d = 16               6.2e-6
 24  SimpPoker, d = 16                eps  
 26  SimpPoker, d = 64                eps  
 28  CouponCollector, d = 4           eps  
 30  CouponCollector, d = 16          eps  
 32  Gap, r = 27                      eps  
 33  Gap, r = 0                      6.8e-7
 34  Gap, r = 22                      eps  
 36  Run of U01, r = 15               eps  
 41  MaxOft, t = 5                  1 - eps1
 41  MaxOft AD, t = 5               1.4e-10
 42  MaxOft, t = 10                 1 - eps1
 43  MaxOft, t = 20                 1 - eps1
 44  MaxOft, t = 30                 1 - eps1
 49  AppearanceSpacings, r = 0        eps  
 50  AppearanceSpacings, r = 20       eps  
 52  WeightDistrib, r = 8           1.1e-12
 54  WeightDistrib, r = 24            eps  
 63  GCD, r = 0                       eps  
 64  GCD, r = 10                      eps  
 65  RandomWalk1 H (L = 90)           eps  
 65  RandomWalk1 M (L = 90)           eps  
 66  RandomWalk1 H (L = 90)           eps  
 66  RandomWalk1 M (L = 90)           eps  
 66  RandomWalk1 J (L = 90)           eps  
 66  RandomWalk1 R (L = 90)           eps  
 66  RandomWalk1 C (L = 90)           eps  
 67  RandomWalk1 H (L = 1000)         eps  
 67  RandomWalk1 M (L = 1000)         eps  
 67  RandomWalk1 R (L = 1000)         eps  
 67  RandomWalk1 C (L = 1000)         eps  
 68  RandomWalk1 H (L = 1000)         eps  
 68  RandomWalk1 M (L = 1000)         eps  
 68  RandomWalk1 J (L = 1000)       4.0e-12
 68  RandomWalk1 R (L = 1000)         eps  
 68  RandomWalk1 C (L = 1000)         eps  
 69  RandomWalk1 H (L = 10000)        eps  
 69  RandomWalk1 M (L = 10000)        eps  
 69  RandomWalk1 R (L = 10000)      1.1e-16
 69  RandomWalk1 C (L = 10000)        eps  
 70  RandomWalk1 H (L = 10000)        eps  
 70  RandomWalk1 M (L = 10000)        eps  
 70  RandomWalk1 R (L = 10000)        eps  
 70  RandomWalk1 C (L = 10000)        eps  
 72  LinearComp, r = 29             1 - 4.7e-14
 74  Fourier3, r = 0                  eps  
 75  Fourier3, r = 20                 eps  
 77  LongestHeadRun, r = 20          7.2e-4
 80  HammingWeight2, r = 0          1 - eps1
 81  HammingWeight2, r = 20         1 - eps1
 82  HammingCorr, L = 30            1 - eps1
 83  HammingCorr, L = 300           1 - eps1
 84  HammingCorr, L = 1200          1 - eps1
 85  HammingIndep, L = 30             eps  
 86  HammingIndep, L = 30             eps  
 87  HammingIndep, L = 300            eps  
 88  HammingIndep, L = 300            eps  
 89  HammingIndep, L = 1200           eps  
 90  HammingIndep, L = 1200           eps  
 92  Run of bits, r = 20              eps  
 95  AutoCor, d = 30                  eps  
 96  AutoCor, d = 10                  eps  
 ----------------------------------------------
 All other tests were passed


========= Summary results of Crush =========

 Version:          TestU01 1.2.3
 Generator:        f2
 Number of statistics:  144
 Total CPU time:   00:26:56.26
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  1  SerialOver, t = 2                eps  
 11  BirthdaySpacings, t = 2        6.2e-82
 25  SimpPoker, d = 64               6.1e-8
 41  MaxOft, t = 5                  1 - 5.9e-10
 41  MaxOft AD, t = 5                2.9e-9
 42  MaxOft, t = 10                 1 - eps1
 43  MaxOft, t = 20                 1 - 1.7e-11
 44  MaxOft, t = 30                 1 - eps1
 49  AppearanceSpacings, r = 0      1 - eps1
 52  WeightDistrib, r = 8            2.5e-7
 82  HammingCorr, L = 30            1 -  8.4e-6
 85  HammingIndep, L = 30            5.2e-4
 95  AutoCor, d = 30                9.5e-33
 ----------------------------------------------
 All other tests were passed


========= Summary results of Crush =========

 Version:          TestU01 1.2.3
 Generator:        f3
 Number of statistics:  144
 Total CPU time:   00:27:52.02
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
 42  MaxOft, t = 10                 1 - 9.9e-12
 43  MaxOft, t = 20                 1 -  4.6e-6
 44  MaxOft, t = 30                 1 - 2.6e-14
 ----------------------------------------------
 All other tests were passed


========= Summary results of Crush =========

 Version:          TestU01 1.2.3
 Generator:        xxhash32
 Number of statistics:  144
 Total CPU time:   00:27:14.14
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  1  SerialOver, t = 2                eps  
  2  SerialOver, t = 4              8.9e-14
  3  CollisionOver, t = 2          4.2e-281
 11  BirthdaySpacings, t = 2        1.1e-93
 13  BirthdaySpacings, t = 4          eps  
 18  ClosePairs NP, t = 2           2.1e-10
 18  ClosePairs mNP, t = 2          1.0e-60
 18  ClosePairs mNP1, t = 2         1.5e-66
 18  ClosePairs mNP2, t = 2         3.4e-20
 18  ClosePairs NJumps, t = 2       6.5e-87
 23  SimpPoker, d = 16                eps  
 24  SimpPoker, d = 16                eps  
 25  SimpPoker, d = 64                eps  
 26  SimpPoker, d = 64                eps  
 28  CouponCollector, d = 4           eps  
 30  CouponCollector, d = 16         3.7e-9
 31  Gap, r = 0                       eps  
 32  Gap, r = 27                      eps  
 33  Gap, r = 0                     1.1e-14
 34  Gap, r = 22                      eps  
 35  Run of U01, r = 0              6.8e-14
 36  Run of U01, r = 15               eps  
 41  MaxOft, t = 5                  1 - eps1
 42  MaxOft, t = 10                 1 - eps1
 43  MaxOft, t = 20                 1 -  1.4e-9
 44  MaxOft, t = 30                 1 - 3.6e-15
 48  SampleCorr                      6.8e-7
 49  AppearanceSpacings, r = 0        eps  
 52  WeightDistrib, r = 8            1.9e-4
 54  WeightDistrib, r = 24          1.3e-12
 57  MatrixRank, 60 x 60             0.9993 
 81  HammingWeight2, r = 20         1 -  3.7e-6
 82  HammingCorr, L = 30            1 -  4.4e-5
 84  HammingCorr, L = 1200          1 -  7.2e-6
 85  HammingIndep, L = 30           1.5e-14
 92  Run of bits, r = 20             8.3e-4
 95  AutoCor, d = 30                1.6e-31
 96  AutoCor, d = 10                2.8e-25
 ----------------------------------------------
 All other tests were passed


========= Summary results of Crush =========

 Version:          TestU01 1.2.3
 Generator:        murmurhash3
 Number of statistics:  144
 Total CPU time:   00:27:08.43
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  1  SerialOver, t = 2                eps  
  3  CollisionOver, t = 2             eps  
  4  CollisionOver, t = 2            6.7e-7
 11  BirthdaySpacings, t = 2        4.7e-42
 13  BirthdaySpacings, t = 4          eps  
 18  ClosePairs NP, t = 2           1.4e-15
 18  ClosePairs mNP, t = 2         3.2e-157
 18  ClosePairs mNP1, t = 2        8.9e-284
 18  ClosePairs mNP2, t = 2        8.0e-123
 18  ClosePairs NJumps, t = 2         eps  
 21  ClosePairsBitMatch, t = 2       4.4e-4
 25  SimpPoker, d = 64                eps  
 26  SimpPoker, d = 64                eps  
 31  Gap, r = 0                       eps  
 33  Gap, r = 0                     2.8e-11
 34  Gap, r = 22                      eps  
 36  Run of U01, r = 15              1.4e-7
 41  MaxOft, t = 5                  2.8e-37
 42  MaxOft, t = 10                 1 - eps1
 43  MaxOft, t = 20                 1 - 1.2e-11
 44  MaxOft, t = 30                 1 - 1.1e-14
 49  AppearanceSpacings, r = 0      1 - eps1
 53  WeightDistrib, r = 16           1.8e-5
 54  WeightDistrib, r = 24            eps  
 ----------------------------------------------
 All other tests were passed


========= Summary results of Crush =========

 Version:          TestU01 1.2.3
 Generator:        lowbias
 Number of statistics:  144
 Total CPU time:   00:27:03.56
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  1  SerialOver, t = 2                eps  
  2  SerialOver, t = 4                eps  
  3  CollisionOver, t = 2             eps  
  4  CollisionOver, t = 2           7.4e-33
 11  BirthdaySpacings, t = 2       4.6e-170
 13  BirthdaySpacings, t = 4          eps  
 15  BirthdaySpacings, t = 7       1.6e-156
 17  BirthdaySpacings, t = 8        2.3e-53
 18  ClosePairs NP, t = 2           1.2e-13
 18  ClosePairs mNP, t = 2         3.2e-157
 18  ClosePairs mNP1, t = 2        1.6e-180
 18  ClosePairs mNP2, t = 2         1.3e-70
 18  ClosePairs NJumps, t = 2      3.9e-249
 21  ClosePairsBitMatch, t = 2       1.1e-4
 23  SimpPoker, d = 16                eps  
 24  SimpPoker, d = 16                eps  
 25  SimpPoker, d = 64                eps  
 26  SimpPoker, d = 64                eps  
 27  CouponCollector, d = 4           eps  
 29  CouponCollector, d = 16          eps  
 30  CouponCollector, d = 16          eps  
 31  Gap, r = 0                       eps  
 32  Gap, r = 27                      eps  
 33  Gap, r = 0                       eps  
 34  Gap, r = 22                      eps  
 35  Run of U01, r = 0                eps  
 36  Run of U01, r = 15              4.3e-4
 37  Permutation, r = 0              1.1e-6
 41  MaxOft AD, t = 5               5.6e-11
 42  MaxOft AD, t = 10              6.4e-15
 43  MaxOft AD, t = 20              1 -  4.8e-8
 44  MaxOft, t = 30                 1 - 3.1e-15
 44  MaxOft AD, t = 30              1 -  6.0e-9
 46  SampleProd, t = 30              0.9992 
 47  SampleMean                      4.5e-6
 48  SampleCorr                     1 -  6.5e-9
 49  AppearanceSpacings, r = 0        eps  
 52  WeightDistrib, r = 8             eps  
 55  SumCollector                     eps  
 75  Fourier3, r = 20                5.7e-5
 81  HammingWeight2, r = 20          0.9992 
 84  HammingCorr, L = 1200          1 -  9.2e-5
 85  HammingIndep, L = 30            1.2e-6
 96  AutoCor, d = 10                 1.1e-8
 ----------------------------------------------
 All other tests were passed


========= Summary results of Crush =========

 Version:          TestU01 1.2.3
 Generator:        triple32
 Number of statistics:  144
 Total CPU time:   00:27:30.04
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
 41  MaxOft, t = 5                  1 - 1.8e-13
 42  MaxOft, t = 10                 1 -  1.2e-9
 43  MaxOft, t = 20                 1 -  3.2e-9
 44  MaxOft, t = 30                 1 - eps1
 70  RandomWalk1 H (L = 10000)       7.6e-4
 ----------------------------------------------
 All other tests were passed



Comments





© 2024 Marc B. Reynolds
all original content is public domain under UNLICENSE