Under this uniform language model, the perplexity equals the size of the vocabulary. More generally, perplexity can be read as the effective vocabulary size under the model, that is, the number of equally likely words the model is in effect choosing among at each step. For instance, the trigram model described above has an effective branching factor of 109, even though it operates over a vocabulary of 19,979 words.
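
The relationship between perplexity and vocabulary size can be checked numerically. The sketch below assumes the usual definition of perplexity as 2 raised to the average negative log2-probability per token; the function name `perplexity`, the sequence length of 50, and the flat per-token probability of 1/109 are illustrative assumptions, not the trigram model from the text.

```python
import math

def perplexity(token_probs):
    """Perplexity from the model probability assigned to each token.

    Assumes the definition 2 ** (average negative log2-probability).
    """
    n = len(token_probs)
    avg_neg_log2 = -sum(math.log2(p) for p in token_probs) / n
    return 2 ** avg_neg_log2

vocab_size = 19_979
seq_len = 50  # arbitrary illustrative sequence length

# Uniform model: every token gets probability 1 / vocab_size.
uniform_ppl = perplexity([1 / vocab_size] * seq_len)

# Hypothetical model that assigns each observed token probability 1/109:
# it behaves as if choosing among 109 equally likely words per step.
effective_ppl = perplexity([1 / 109] * seq_len)

print(uniform_ppl, effective_ppl)
```

Up to floating-point rounding, the first value matches the vocabulary size and the second matches 109, which is the sense in which perplexity acts as an effective branching factor.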

