A common building block of sampling some random distributions is taking the logarithm of a uniform random floating-point value $u$:

$-\log \left(u\right)$

Let’s jump back and look at how the uniform floating point value $u$ was generated:

So we generate a $b$ bit uniform integer, discard $\left(b-p\right)$ bits (where $p$ is the number of bits in the significand in the floating-point format and $b$ is the number of uniform random bits we have to work with), convert the integer to float and scale by $2^{-p}$.

The generation method above isn’t what we really want since $-\log\left(0\right) = \infty$. We could instead compute $-\log\left(1-u\right)$ but let’s create a new version that yields values on $\left(0,1\right]$:

## Turn-key

Now let’s consider something completely different, the background of which is in a previous post “Higher density uniform floats”:

We have a helper function (pdense_bits) which generates a partially dense uniform floating point value on $\left(0,1\right)$ (as a IEEE bit pattern) and a pair of user functions. The first simply converts the output to a float and the second shifts by 1 ULP to give the output range $\left(0,1\right]$. Let’s spitball where we’re at:

• standard method: generates all representiable FP values on $\left[1/2,1\right]$ and each successively smaller POT interval has half as many samples as the previous. The total number of samples generated is $2^{53}$
• partially dense method: generates all representiable FP values on $\left[2^{-12},1\right]$ and each successively smaller POT interval has half as many samples as the previous. The total number of samples generated is $2^{12}\left(2^{53}+1\right)+1$

If we were to transform these by $-\log_2$ then the output ranges are $\left[0,53\right]$ and $\left[0,65\right]$ respectively. So we explode the total number of samples and extend the tail by about 20.75%. To eyeball the cost of the two transforms see here on godbolt.

## Some assembly required (really quite alot)

In this section I’m going to sketch some things we could do to tweak performance and/or accuracy of the previous. I’m not going to carry though any of it..’cause computing logs in software is a PITA and I’m not really motivated to talk about NA and function approximations.

Let’s consider some basic log identities. First changing the base to base-2:

$\log \left(x\right) = \log \left(2\right) \log_2 \left(x\right)$

and from the product:

$\log_2 \left(2^e~i\right) = e+ \log_2 \left(i\right)$

Now let’s rip apart our helper function from the previous section:

This is just a sketch and I’m only gonna do some basic bullet points.

• We have partially reduced input for log (and we know it’s not a denormal or a special) so we can chop off some low hanging fruit.
• We’re integer up to needing to compute the log. This kinda interesting if we’ve got an accuracy jonze going on. A very popular thing to do is to use Tang’s method. This starts by examining the upper bits: musl does this (SEE: log2.c & log2_data.c) and it’s talked about in “Computing floating-point logarithms with fixed-point operations”.

There’s a just ton of moving parts here unless you’re just going to spitball the result. What’s coming next should be a really big one. Local examples include: which base for the log? What error bound/cost target? Single value or unevaluated-pair result from it? How do we combine all these partial results together?