You know if someone had asked me this question a few months ago I would have responded that it’s was very unlikely and my facial expression would have looked like the person was eating huge globs of mayonnaise straight out of the jar. Let’s jump backward and ballpark some defs:

- An involution is a function that’s its own inverse: $f\left(f\left(x\right)\right) = f^2\left(x\right) = x$
- A bijection is a function that has an inverse: $f^{-1}\left(f\left(x\right)\right) = f\left(f^{-1}\left(x\right)\right) = x$

where the $n$ in $f^n$ is denoting $n$ iterations of $f$. And I’m using *“bit finalizer”* to mean a bit mixing function like that typically used as the final step of a hash function. I’m going to be using the terminology and heat map figures from the previous post *“Comments on the Avalanche Effect”*.

Now back to the beginning. I have a couple of issues with involutions as bit finalizers. The first is there aren’t many problems that need such a thing (roughly encoding and decoding are the same thing). The second issue it’s more computational work to hit reasonable properties but less that I would have previously thought so here we are…me typing.

Code for this post can be found **here** and for quick eye-balling of functions on **godbolt**.

## The set up

What got me thinking about this problem was a question on twitter by `paniq`

where he wanted to pack a 2D coordinate in a integer, perform an involution and unpack back to 2D. This would allow to “statelessly” swap the data between pairs of coordinates for an entire grid in a random manner. My suggested attempt at a solution was something like this:

This is a pair of bijections (one’s the inverse of the other) in the same form as a LCG. These have the know property that they map odd to even and even to odd. By inspecting the value of the low bit (determine if odd or even) we can choose $f\left(x\right)$ for odd and $f^{-1}\left(x\right)$ for even inputs and this composite function is an involution. The “problem” of `paniq`

has two additional wrinkles. First we functions that work on the same size of the texture being manipulated and small sized make finding a good bit mixer harder. As an example if the texture were 256x256 we’d need an involution for a 16-bit integer. The second trick is the function needs to be parameterized so this swapping operation can been performed differently in some number of way…all of which need to be somewhat independent. These move the problem out of the territory of “fun little thing to think about” to “that sounds like a lot of work”. So I’m going to blissfully ignore these extra requirements.

The visualization of the strict avalanche criterion (**SAC**) bias of the above function:

The left hand is sampled using Sobol and right hand a pseudo-random sequence.

## The plot thickens

Harold Aptroot spiced things up by noting that a similarity transform of an involution is an involution. Sweet! This is something I can work with. Let’s run through the justification in painful details using a matrix like notation where $A$ is a bijection and $B$ an involution:

Bullet point it time:

- We’re saying some function $ABA^{-1}$ is an involution which has a period of 2 so squaring it should yield the identity $I$. Start expanding.
- $A$ is a bijection so it has an inverse $A^{-1}$, multipled together gives $I$ which reduces to a 70s Swedish pop group.
- $B$ is an involution, squaring it reduces to $I$ and finally $A$ and it’s inverse cancel each other out.

## Okay..but does it **do** anything?

The rest of this is just a quick feasibility study to see if we can construct an involution that is competitive against standard methods of bit mixing both in terms of statistical measures and runtime costs. It is pretty much a dump of what I walked through to get a feel for the answer. No real search methods were used..just me hacking stuff and seeing what happened.

First we need a fast involution to be transformed and hopefully one that performs some bit mixing as well. The most obvious choice (at least to me) is a right xorshift by at least half the bit-width of the register. Since we’re using a bit operation $\left(\mathbb{F}_2\right)$ for the involution I want to choose a modulo integer operation for the bijection (alternating fields between each logical operation). So about the cheapest thing we can do here is multiply by an odd integer (the inverse of which can be found by mod inverse). So I simply grabbed an odd constant from L’Ecuyer’s table paper and tweaked the xorshift amount until the bias was low.

This is doing suprisingly well for a short sequence involution. And again the above constants are going to be sub-optimal as are all the functions that follow.

Instead of beefing up the bijection I wanted to see how changing the involution to an xor-rotate would play out. First we need some helper functions:

I kept the multiplicative constants the same and tweaked the xor-rotate choice:

This was a nice little jump, we’re still on the cheap side and at this point I’d seen about enough. To kill off the bias on the low bits I just added (didn’t bother to tweak) right xorshift by half the bitwidth.

At this point we’re looking good and we’re also at about exactly the same runtime cost as current standard methods that are structured like this:

As a final “I want to see what happens” test I swapped out the outer xorshift with an xor-rotate randomly chosen from those that have a period of 32 and a 3 rotate inverse:

## Crush-ing some numbers

This section spews out the results you’d see from running the example code. It measures SAC using three different sampling methods:

- $\text{cn}$: counting numbers (well programmer version thereof aka starting from zero) for low entropy input.
- $\text{ss}$: Sobol sequence to get good coverage of input space.
- $\text{he}$: random numbers for high entropy input.

Each set is sampled $2^{23}$ times. The **max bias** measured as described in the *Avalanche Effect* post then multipled by 100 to give a percent bias of the worst case. Also a goodness-of-fit (**GOF**) test is performed to get a global estimate. Specifically Pearson’s chi-squared is computed, takes the square root and multiplied by 100.

Additionally three batteries of TestU01 are run on the input. Each hash function is treated like a random number generator where we walk through counting numbers and apply the hash.

**Rabbit**: computes 33 statistics from 26 base tests drawing approximately a user defined number of sample (specified in bits). My runs draw 1024 samples of the function so is setup for 1024*32 bits.**SmallCrush**: computes 15 statistics from 10 base tests drawing approximately $51,320,000$ samples**Crush**: computes 144 statistics from 96 base tests drawing approximately $2^{35}$ samples

The reported numbers are how many statistics of the battery failed.

For reference I’ve include 32-bit finalizer of MurmurHash3, xxHash and a pair by Chris Wellons. All except *triple32* are two xorshift-multiplies followed by an xorshift and *triple32* adds an additional xorshift-multiply.

Clicking on the headers sorts the data.

hash | % max bias (cn) | % max bias (ss) | % max bias (he) | GOF (cn) | GOF (ss) | GOF (he) | Rabbit | SmallCrush | Crush |
---|---|---|---|---|---|---|---|---|---|

murmur3 | 0.229263 | 0.518417 | 0.207162 | 0.052966 | 0.092238 | 0.043021 | 1 | 2 | 24 |

xxhash32 | 0.377083 | 0.579166 | 0.433731 | 0.069322 | 0.090209 | 0.066725 | 1 | 2 | 39 |

triple32 | 0.135088 |
0.156140 |
0.130367 | 0.044136 |
0.045361 |
0.034147 |
1 | 0 | 5 |

lowbias | 0.169849 | 0.266051 | 0.122666 |
0.047634 | 0.068301 | 0.040458 | 2 | 2 | 44 |

f2 | 0.409937 | 0.393176 | 0.398207 | 0.054149 | 0.070380 | 0.045054 | 1 | 0 | 13 |

f3 | 0.591612 | 0.445747 | 0.516152 | 0.056496 | 0.052190 | 0.045781 | 1 | 0 | 3 |

The three inital functions are failures but here’s their data as well (minus Crush since it’s a waste of time).

hash | % max bias (cn) | % max bias (ss) | % max bias (he) | GOF (cn) | GOF (ss) | GOF (he) | Rabbit | SmallCrush |
---|---|---|---|---|---|---|---|---|

g0 | 100.000000 | 100.000000 | 100.000000 | 76.090304 | 82.050457 | 76.090831 | 7 | 15 |

f0 | 100.000000 | 100.000000 | 100.000000 | 23.667056 | 38.865972 | 23.663481 | 8 | 14 |

f1 | 100.000000 | 100.000000 | 100.000000 | 20.454587 | 20.904367 | 20.456410 | 1 | 7 |

## Conclusions

I normally don’t have a conclusion section. I talk some trash, you can dig through with a stick and decide if there’s anything you want to take home to try out. In this post I goofing around abit so I’m going to toss out my take-a-way.

- The answer to the title is yes.

Bonus takeaways:

- xor-rotates might be worth investigating for various bit mixing operations
- Chris Wellons’ finalizers perform rather well

## Crush output …so you don’t have to

Running the Crush battery takes about an hour so to save you the PITA here is what you would get:

```
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: f1
Number of statistics: 144
Total CPU time: 00:26:31.17
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 eps
2 SerialOver, t = 4 9.4e-4
3 CollisionOver, t = 2 6.2e-11
4 CollisionOver, t = 2 eps
6 CollisionOver, t = 4 eps
8 CollisionOver, t = 8 eps
10 CollisionOver, t = 20 eps
11 BirthdaySpacings, t = 2 eps
13 BirthdaySpacings, t = 4 2.4e-227
15 BirthdaySpacings, t = 7 eps
16 BirthdaySpacings, t = 8 eps
17 BirthdaySpacings, t = 8 eps
18 ClosePairs NP, t = 2 6.7e-4
18 ClosePairs mNP1, t = 2 2.9e-6
18 ClosePairs mNP2, t = 2 7.3e-5
18 ClosePairs NJumps, t = 2 2.7e-5
23 SimpPoker, d = 16 6.2e-6
24 SimpPoker, d = 16 eps
26 SimpPoker, d = 64 eps
28 CouponCollector, d = 4 eps
30 CouponCollector, d = 16 eps
32 Gap, r = 27 eps
33 Gap, r = 0 6.8e-7
34 Gap, r = 22 eps
36 Run of U01, r = 15 eps
41 MaxOft, t = 5 1 - eps1
41 MaxOft AD, t = 5 1.4e-10
42 MaxOft, t = 10 1 - eps1
43 MaxOft, t = 20 1 - eps1
44 MaxOft, t = 30 1 - eps1
49 AppearanceSpacings, r = 0 eps
50 AppearanceSpacings, r = 20 eps
52 WeightDistrib, r = 8 1.1e-12
54 WeightDistrib, r = 24 eps
63 GCD, r = 0 eps
64 GCD, r = 10 eps
65 RandomWalk1 H (L = 90) eps
65 RandomWalk1 M (L = 90) eps
66 RandomWalk1 H (L = 90) eps
66 RandomWalk1 M (L = 90) eps
66 RandomWalk1 J (L = 90) eps
66 RandomWalk1 R (L = 90) eps
66 RandomWalk1 C (L = 90) eps
67 RandomWalk1 H (L = 1000) eps
67 RandomWalk1 M (L = 1000) eps
67 RandomWalk1 R (L = 1000) eps
67 RandomWalk1 C (L = 1000) eps
68 RandomWalk1 H (L = 1000) eps
68 RandomWalk1 M (L = 1000) eps
68 RandomWalk1 J (L = 1000) 4.0e-12
68 RandomWalk1 R (L = 1000) eps
68 RandomWalk1 C (L = 1000) eps
69 RandomWalk1 H (L = 10000) eps
69 RandomWalk1 M (L = 10000) eps
69 RandomWalk1 R (L = 10000) 1.1e-16
69 RandomWalk1 C (L = 10000) eps
70 RandomWalk1 H (L = 10000) eps
70 RandomWalk1 M (L = 10000) eps
70 RandomWalk1 R (L = 10000) eps
70 RandomWalk1 C (L = 10000) eps
72 LinearComp, r = 29 1 - 4.7e-14
74 Fourier3, r = 0 eps
75 Fourier3, r = 20 eps
77 LongestHeadRun, r = 20 7.2e-4
80 HammingWeight2, r = 0 1 - eps1
81 HammingWeight2, r = 20 1 - eps1
82 HammingCorr, L = 30 1 - eps1
83 HammingCorr, L = 300 1 - eps1
84 HammingCorr, L = 1200 1 - eps1
85 HammingIndep, L = 30 eps
86 HammingIndep, L = 30 eps
87 HammingIndep, L = 300 eps
88 HammingIndep, L = 300 eps
89 HammingIndep, L = 1200 eps
90 HammingIndep, L = 1200 eps
92 Run of bits, r = 20 eps
95 AutoCor, d = 30 eps
96 AutoCor, d = 10 eps
----------------------------------------------
All other tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: f2
Number of statistics: 144
Total CPU time: 00:26:56.26
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 eps
11 BirthdaySpacings, t = 2 6.2e-82
25 SimpPoker, d = 64 6.1e-8
41 MaxOft, t = 5 1 - 5.9e-10
41 MaxOft AD, t = 5 2.9e-9
42 MaxOft, t = 10 1 - eps1
43 MaxOft, t = 20 1 - 1.7e-11
44 MaxOft, t = 30 1 - eps1
49 AppearanceSpacings, r = 0 1 - eps1
52 WeightDistrib, r = 8 2.5e-7
82 HammingCorr, L = 30 1 - 8.4e-6
85 HammingIndep, L = 30 5.2e-4
95 AutoCor, d = 30 9.5e-33
----------------------------------------------
All other tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: f3
Number of statistics: 144
Total CPU time: 00:27:52.02
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
42 MaxOft, t = 10 1 - 9.9e-12
43 MaxOft, t = 20 1 - 4.6e-6
44 MaxOft, t = 30 1 - 2.6e-14
----------------------------------------------
All other tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: xxhash32
Number of statistics: 144
Total CPU time: 00:27:14.14
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 eps
2 SerialOver, t = 4 8.9e-14
3 CollisionOver, t = 2 4.2e-281
11 BirthdaySpacings, t = 2 1.1e-93
13 BirthdaySpacings, t = 4 eps
18 ClosePairs NP, t = 2 2.1e-10
18 ClosePairs mNP, t = 2 1.0e-60
18 ClosePairs mNP1, t = 2 1.5e-66
18 ClosePairs mNP2, t = 2 3.4e-20
18 ClosePairs NJumps, t = 2 6.5e-87
23 SimpPoker, d = 16 eps
24 SimpPoker, d = 16 eps
25 SimpPoker, d = 64 eps
26 SimpPoker, d = 64 eps
28 CouponCollector, d = 4 eps
30 CouponCollector, d = 16 3.7e-9
31 Gap, r = 0 eps
32 Gap, r = 27 eps
33 Gap, r = 0 1.1e-14
34 Gap, r = 22 eps
35 Run of U01, r = 0 6.8e-14
36 Run of U01, r = 15 eps
41 MaxOft, t = 5 1 - eps1
42 MaxOft, t = 10 1 - eps1
43 MaxOft, t = 20 1 - 1.4e-9
44 MaxOft, t = 30 1 - 3.6e-15
48 SampleCorr 6.8e-7
49 AppearanceSpacings, r = 0 eps
52 WeightDistrib, r = 8 1.9e-4
54 WeightDistrib, r = 24 1.3e-12
57 MatrixRank, 60 x 60 0.9993
81 HammingWeight2, r = 20 1 - 3.7e-6
82 HammingCorr, L = 30 1 - 4.4e-5
84 HammingCorr, L = 1200 1 - 7.2e-6
85 HammingIndep, L = 30 1.5e-14
92 Run of bits, r = 20 8.3e-4
95 AutoCor, d = 30 1.6e-31
96 AutoCor, d = 10 2.8e-25
----------------------------------------------
All other tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: murmurhash3
Number of statistics: 144
Total CPU time: 00:27:08.43
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 eps
3 CollisionOver, t = 2 eps
4 CollisionOver, t = 2 6.7e-7
11 BirthdaySpacings, t = 2 4.7e-42
13 BirthdaySpacings, t = 4 eps
18 ClosePairs NP, t = 2 1.4e-15
18 ClosePairs mNP, t = 2 3.2e-157
18 ClosePairs mNP1, t = 2 8.9e-284
18 ClosePairs mNP2, t = 2 8.0e-123
18 ClosePairs NJumps, t = 2 eps
21 ClosePairsBitMatch, t = 2 4.4e-4
25 SimpPoker, d = 64 eps
26 SimpPoker, d = 64 eps
31 Gap, r = 0 eps
33 Gap, r = 0 2.8e-11
34 Gap, r = 22 eps
36 Run of U01, r = 15 1.4e-7
41 MaxOft, t = 5 2.8e-37
42 MaxOft, t = 10 1 - eps1
43 MaxOft, t = 20 1 - 1.2e-11
44 MaxOft, t = 30 1 - 1.1e-14
49 AppearanceSpacings, r = 0 1 - eps1
53 WeightDistrib, r = 16 1.8e-5
54 WeightDistrib, r = 24 eps
----------------------------------------------
All other tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: lowbias
Number of statistics: 144
Total CPU time: 00:27:03.56
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 eps
2 SerialOver, t = 4 eps
3 CollisionOver, t = 2 eps
4 CollisionOver, t = 2 7.4e-33
11 BirthdaySpacings, t = 2 4.6e-170
13 BirthdaySpacings, t = 4 eps
15 BirthdaySpacings, t = 7 1.6e-156
17 BirthdaySpacings, t = 8 2.3e-53
18 ClosePairs NP, t = 2 1.2e-13
18 ClosePairs mNP, t = 2 3.2e-157
18 ClosePairs mNP1, t = 2 1.6e-180
18 ClosePairs mNP2, t = 2 1.3e-70
18 ClosePairs NJumps, t = 2 3.9e-249
21 ClosePairsBitMatch, t = 2 1.1e-4
23 SimpPoker, d = 16 eps
24 SimpPoker, d = 16 eps
25 SimpPoker, d = 64 eps
26 SimpPoker, d = 64 eps
27 CouponCollector, d = 4 eps
29 CouponCollector, d = 16 eps
30 CouponCollector, d = 16 eps
31 Gap, r = 0 eps
32 Gap, r = 27 eps
33 Gap, r = 0 eps
34 Gap, r = 22 eps
35 Run of U01, r = 0 eps
36 Run of U01, r = 15 4.3e-4
37 Permutation, r = 0 1.1e-6
41 MaxOft AD, t = 5 5.6e-11
42 MaxOft AD, t = 10 6.4e-15
43 MaxOft AD, t = 20 1 - 4.8e-8
44 MaxOft, t = 30 1 - 3.1e-15
44 MaxOft AD, t = 30 1 - 6.0e-9
46 SampleProd, t = 30 0.9992
47 SampleMean 4.5e-6
48 SampleCorr 1 - 6.5e-9
49 AppearanceSpacings, r = 0 eps
52 WeightDistrib, r = 8 eps
55 SumCollector eps
75 Fourier3, r = 20 5.7e-5
81 HammingWeight2, r = 20 0.9992
84 HammingCorr, L = 1200 1 - 9.2e-5
85 HammingIndep, L = 30 1.2e-6
96 AutoCor, d = 10 1.1e-8
----------------------------------------------
All other tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: triple32
Number of statistics: 144
Total CPU time: 00:27:30.04
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
41 MaxOft, t = 5 1 - 1.8e-13
42 MaxOft, t = 10 1 - 1.2e-9
43 MaxOft, t = 20 1 - 3.2e-9
44 MaxOft, t = 30 1 - eps1
70 RandomWalk1 H (L = 10000) 7.6e-4
----------------------------------------------
All other tests were passed
```