One critical component of Gibbs sampling for complex graphical models is the ability to draw samples from discrete distributions. Take latent Dirichlet allocation (LDA) as an example: the main computational effort is drawing samples from the following distribution:

\begin{equation}
P(z_{i} = k \, | \, \mathbf{z}_{-i}, \mathbf{w}, \boldsymbol{\alpha}, \boldsymbol{\beta}) \propto (n_{d,k}^{-i} + \alpha_{k}) \frac{n_{w_{i}, k}^{-i} + \beta_{w_{i}}}{n_{., k}^{-i} + \sum_{v} \beta_{v}}\label{eq:lda}
\end{equation}

where \( n_{d,k}^{-i} \) is the number of tokens in document \( d \) assigned to topic \( k \), excluding token \( i \); \( n_{w_{i}, k}^{-i} \) is the number of times word \( w_{i} \) is assigned to topic \( k \), excluding token \( i \); and \( n_{.,k}^{-i} \) is the total number of tokens assigned to topic \( k \), excluding token \( i \).

So, a straightforward sampling algorithm works as follows:

- Let \( c_{k} \) be the right-hand side of Equation \eqref{eq:lda} for topic \( k \); this is an *un-normalized* probability.
- Compute the accumulated weights as \( C[i] = C[i-1] + c_{i}, \,\, \forall i \in (0, K-1] \), with \( C[0] = c_{0} \).
- Draw \( u \sim \mathcal{U}(0, C[K-1]) \) and find \( t = \arg\min_{i} \left( x_{i} \right) \), where \( x_{i} = C[i] - u \) and \( x_{i} > 0 \).

The last step essentially finds the minimum index \( t \) whose accumulated weight \( C[t] \) is greater than the random number \( u \).
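The steps above can be sketched in Python as follows. This is a minimal illustration, not code from any particular LDA implementation; the function name and the use of NumPy are my own choices. `np.searchsorted` performs the "minimum index with \( C[t] > u \)" search in \( O(\log K) \):

```python
import numpy as np

def sample_discrete(c, rng):
    """Draw an index t from un-normalized weights c via accumulated sums."""
    C = np.cumsum(c)                  # accumulated weights: C[i] = C[i-1] + c_i
    u = rng.uniform(0.0, C[-1])       # u ~ U(0, C[K-1])
    # minimum index t such that C[t] > u
    return int(np.searchsorted(C, u, side="right"))

rng = np.random.default_rng(0)
print(sample_discrete([0.0, 3.0, 0.0], rng))  # all mass on topic 1, so prints 1
```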

One difficulty with \eqref{eq:lda} is that the right-hand side might be too small and underflow (think of multiplying many near-zero numbers together). Thus, we want to deal with probabilities in *log-space*. We start to work with:

\begin{equation}
\log P(z_{i} = k \, | \, \mathbf{z}_{-i}, \mathbf{w}, \boldsymbol{\alpha}, \boldsymbol{\beta}) \propto \log (n_{d,k}^{-i} + \alpha_{k}) + \log (n_{w_{i}, k}^{-i} + \beta_{w_{i}}) - \log ( n_{., k}^{-i} + \sum_{v} \beta_{v} )\label{eq:log_lda}
\end{equation}

and store the result in \( c_{k} \), remembering that each value now represents an *un-normalized* log probability. The next step is to compute the accumulated weights, this time as accumulated probabilities but in log-space! Thanks to the trick mentioned in [Notes on Calculating Log Sum of Exponentials], we can compute the *log sum* efficiently; use the last equation there to compute the accumulated weights. The last step is to draw the random number. We draw \( u' \sim \mathcal{U}(0, 1) \) and compute \( u = \log u' + C[K-1] \) (recall that \( C[K-1] \) is already a log value), then again find the minimum index whose array value is greater than \( u \).
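The log-space variant can be sketched as below. This is my own minimal illustration, assuming NumPy: `np.logaddexp.accumulate` computes the running \( \log(e^{a} + e^{b}) \) stably, which plays the role of the log-sum-exp trick referenced above:

```python
import numpy as np

def sample_discrete_log(log_c, rng):
    """Draw an index from un-normalized *log* weights via log-space cumsums."""
    # Accumulated weights in log-space: C[i] = log(exp(C[i-1]) + exp(log_c[i])),
    # evaluated stably by np.logaddexp (a log-sum-exp under the hood).
    C = np.logaddexp.accumulate(log_c)
    # Log of a uniform draw on (0, total): u = log u' + C[K-1], u' ~ U(0, 1).
    # 1 - U[0,1) lies in (0,1], so log1p(-x) never hits log(0).
    u = np.log1p(-rng.uniform()) + C[-1]
    t = np.searchsorted(C, u, side="right")
    return int(min(t, len(C) - 1))  # guard the measure-zero u == C[-1] case
```

Note that comparisons of log values preserve order, so the same "minimum index greater than \( u \)" search works unchanged.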

Notes:

- The log-sampling algorithm for LDA is implemented in the Fugue Topic Modeling Package.
- Unless you really face the issue of underflow, sampling in log-space is usually **much slower** than sampling in the original space, as *log* and *exp* are expensive functions to compute.