As we explored the architecture of language in the previous chapter, we began to see
that it is possible to model natural language in spite of its complexity and flexibility.

And yet, the best language models are often highly constrained and application-specific. Why do models trained in a specific field or domain of language perform better than those trained on general language? Consider the term “bank”: in an economics, finance, or politics domain it very likely refers to an institution that provides fiscal and monetary services, whereas in an aviation or vehicular domain it more likely refers to a rolling motion that changes the direction of an aircraft. By fitting models to a narrower context, the prediction space becomes smaller and more specific, and therefore better able to handle the flexible aspects of language.
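This effect is easy to see even with toy data. The following sketch (the sentences and function name are illustrative assumptions, not drawn from any real corpus) counts which word most often follows “bank” in two tiny domain-specific collections; the smaller, domain-bound prediction space yields different, more confident answers in each:

```python
from collections import Counter

# Toy stand-ins for two domain-specific corpora (illustrative only).
finance = [
    "the bank raised rates",
    "the bank raised interest",
    "the bank approved the loan",
]
aviation = [
    "watch the aircraft bank left",
    "the plane will bank left sharply",
    "pilots bank right to turn",
]

def next_word_counts(sentences, target="bank"):
    """Count the words that immediately follow `target` in each sentence."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens[:-1]):
            if token == target:
                counts[tokens[i + 1]] += 1
    return counts

print(next_word_counts(finance).most_common(1))   # [('raised', 2)]
print(next_word_counts(aviation).most_common(1))  # [('left', 2)]
```

Within each domain the distribution over next words is narrow and consistent; pooled into one general corpus, the same counts would blur together and the prediction for “bank” would be far less certain.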
The bulk of our work in the subsequent chapters will be in “feature extraction” and “knowledge engineering,” where we’ll be concerned with identifying unique vocabulary words, sets of synonyms, interrelationships between entities, and semantic contexts. However, all of these techniques will revolve around a central text dataset: the corpus.
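As a first taste of feature extraction, consider the simplest such feature: the unique vocabulary of a corpus. The sketch below (the in-memory `corpus` and the naive punctuation stripping are assumptions for illustration; later chapters work with real corpora and proper tokenization) collects the set of distinct tokens across documents:

```python
# A tiny in-memory stand-in for a corpus (illustrative data only).
corpus = [
    "The corpus is the central text dataset.",
    "Features are extracted from the corpus.",
]

def vocabulary(documents):
    """Return the set of distinct lowercase tokens across all documents."""
    vocab = set()
    for document in documents:
        # Naive tokenization: split on whitespace, strip trailing punctuation.
        vocab.update(token.strip(".,").lower() for token in document.split())
    return vocab

print(sorted(vocabulary(corpus)))
```

Even this trivial computation already depends on decisions about the corpus itself (what counts as a document, a token, a word), which is why the corpus deserves a chapter of its own.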