Here I will only consider pseudorandom point generation. Wellspaced point generation is a different topic. The code examples are intended for clarity.
Uniform points on the unit disc
One method to generate a uniform random point $p$ on the unit disc ($\mathbb{D}$) is as follows. Generate two uniform values: $a \in \left[ 0,1 \right] $ , $ \theta \in \left[ \pi, \pi \right) $ and apply^{1}:
Using symmetry and trig identities we can drop this down to one trig approximation and pair of square roots.
Rejection method
Choose a 2D point $p$ uniformly in each dimension: $p_x,p_y \in \left[ 1,1 \right] $ and reject values when outside the unit disc. The average number of iterations is $\frac{4}{\pi} \approx 1.27324$. Although previously known the first CS publication of this technique seems to be von Neumann 1951^{2}.
An example implementation that returns $p \cdot p$ as well as the generated point:
By dropping one of the range remapping of $x$ or $y$ gives a uniform distribution on the half disc and dropping both the quarter disc. The efficiency of the these are identical to the full disc case.
Adhoc nonuniform
If we rescale the component by some function of the dot product $d = (p \cdot p)$ we can get radial adhoc nonuniform distributions.
Directly scaling by the dotproduct:
Scaling by $\frac{d}{1+d}$:
Uniform points on unit circle or uniform 2D rotation
Like the disc case we can generate a uniform point on the unit circle (1sphere or $\mathbb{S}^1$) using trigonometry and symmetry/identities to reduce runtime complexity.
In the same paper^{2} as the disc rejection method John von Neumann presented a pair of techinques to transform uniform points on the unit disc to uniform points on the circle. If we have a uniform point $p = \left(x,y \right)$ (considered as a complex number) on the unit disc, then the first method is to simply normalize:
The second method observes that squaring drops the square root and since it doubles the angle maintains uniformity.
The second method is widely known since it trades a square root for two products and two additions which historically was an assured win. With the introduction of fast inverse square root approximations the lesser known normalization method becomes interesting. Also choosing between zero, one or two NR iterations give a knob of speed vs. bias of the result.
The remaining issue with a practical implementation is the degenerate case. I’ll assume that uniform float values are generated by “equidistance” method (pretty much the only one used). This method will generates values: $\frac{u}{2^{24}}$ where $u$ is a uniform integer on $\left[ 0, 2^{24} \right)$. The only problem result is then when both are zero. We can correct inverse square root by introducing a bias to the dot product which is sufficiently small that only contributes when both are zero such as $2^{72}$. We also need to project that input to unit circle which requires choosing the branch point for the origin. Let’s send it to $(1,0)$ which needs adding a bias to the $x$ coordinate. This gives us:
Since we’re simply normalizing we can create half and quarter circles as per same disc above.
Adhoc nonuniform
Taking the uniform variants and applying some simple transforms can yield some potentially interesting nonuniform distributions. If we take a uniform disc, translate it by one in x and then normalize we get a distribution that is roughly normal (standard deviation ~0.615113) on $\left[\frac{\pi}{2},\frac{\pi}{2}\right] $ Using the angledoubling version, well, doubles the angle output range.
Uniform points on the 3D unit sphere
In 1972 George Marsaglia^{3} introduced a method to transform a uniform point on the unit disc to a uniform point on the 3D sphere $\left(\mathbb{S}^2\right)$:
I walked through to this equation in the post “Implied normals (unit bivectors as rotations)”^{4} where it’s described as the sequence: orthogonal project disc to halfsphere followed by a doubleangle transform. Some bullet points are:
 an areapreserving map from disc to sphere (equivalent to Lamberts)
 radial lines (through origin) on disc are mapped to longitude circles on sphere (angle measure maintained)
 circles are mapped to latitudes (see below)
 $\left( 0, ~0 \right) $ is mapped to the north pole
 the disc: $r \in \left[ 0, ~\frac{1}{\sqrt{2}} \right] $ is mapped to the positive half sphere $\left( \mathbf{z} \geq 0 \right) $
 the annulus: $r \in \left( \frac{1}{\sqrt{2}}, ~1 \right) $ is mapped to the negative half sphere $\left( \mathbf{z} < 0 \right )$
 all points on the unit circle are mapped to south pole
Given how points are mapped from the disc to the sphere, if we knew what “what circle maps to which latitude” we could form rotationally symettric distributions on the sphere directly from those on the disc. Dropping the angledoubling map we end up with a cosine distribution on the halfsphere.
Spherical caps
As with nonuniform distributions, given the way the disc is mapped to the sphere we can directly formulate uniform points on spherical caps. The areas of the disc, sphere and spherical cap (height $h$) are respectively:
So the mapping from disc to sphere involves a constant factor of four and the area of the cap is linear with height. So we can form a point set on the unit disc, apply a uniform scaling and apply the disc to sphere mapping. The cap has a radius of one, divide out the constant factor of the disc to sphere map and relate the areas:
Solving for $r$ give us:
Plugging this scale factor into the map gives us the direct formulation:
Extra examples
Started this section to be able to add extra examples.
Cosine distribution on halfcircle
Generating a uniform line and projecting to circle results in $P(\theta) = \frac{1}{2}\cos(\theta)$ with $\theta \in \left[\frac{\pi}{2}, \frac{\pi}{2} \right]$
Squaring this result converts the halfcircle to circle with: $P(\theta) = \frac{1}{4}\cos(\frac{\theta}{2})$
Cosine distribution on halfsphere
Seemingly the most common fragment shader version is as follows:
Note that the basis computation is degenerate as ‘n’ approaches $\pm \frac{1}{2}\left(0, \sqrt{2}, \sqrt{2}\right)$ and this requires the otherwise unneeded normalization of the final result to reduce the degenerate zone.
So one option is use a method without a degenerate case^{5}:
Another option if the direction will never approach some fixed (say negative $z$) is the method described by Colin BarreBrisebois and Stephen Hill^{6}.
References and Footnotes

“Various Techniques Used in Connection With Random Digits”, John von Neumann, 1951 (PDF) ↩ ↩^{2}

“Choosing a Point from the Surface of a Sphere”, George Marsaglia, 1972 (PDF) ↩

“Implied normals (unit bivectors as rotations)”, (local post) ↩

“Orthonormal basis from normal via quaternion similarity”, (local post) ↩

“Blending in Detail”, Colin BarreBrisebois and Stephen Hill, 2012 (link) ↩