\( \newcommand{\matr}[1] {\mathbf{#1}} \newcommand{\vertbar} {\rule[-1ex]{0.5pt}{2.5ex}} \newcommand{\horzbar} {\rule[.5ex]{2.5ex}{0.5pt}} \)
deepdream of
          a sidewalk
Show Question
\( \newcommand{\cat}[1] {\mathrm{#1}} \newcommand{\catobj}[1] {\operatorname{Obj}(\mathrm{#1})} \newcommand{\cathom}[1] {\operatorname{Hom}_{\cat{#1}}} \newcommand{\multiBetaReduction}[0] {\twoheadrightarrow_{\beta}} \newcommand{\betaReduction}[0] {\rightarrow_{\beta}} \newcommand{\betaEq}[0] {=_{\beta}} \newcommand{\string}[1] {\texttt{"}\mathtt{#1}\texttt{"}} \newcommand{\symbolq}[1] {\texttt{`}\mathtt{#1}\texttt{'}} \)
Math and science::Theory of Computation::Modal theory

Languages, terms and formulas

This page defines languages, first order languages, terms, formulas and sentences as used in modal theory.


A language is an infinite collection of distinct symbols.

A language in model theory is similar to an alphabet in computation theory. They are both collections of symbols; however, model theory permits a countably infinite set of symbols.

The reverse side discusses the encoding of a language.

First-order language

A first-order language is a language with the following symbols:

1 a. Parentheses
\( \symbolq{(}, \symbolq{)} \)
1 b. Connectives
\( \symbolq{\land}, \symbolq{\lnot} \)
1 c. Quantifier
\( \symbolq{\forall} \)
2. Variables
countable symbols \( \symbolq{v_1}, \symbolq{v_2}, \symbolq{v_3}, ... \)
3. Constant symbols
zero or more (at most countable) symbols \( \symbolq{c_1}, \symbolq{c_2}, ...\)
4. Function symbols
for every natural \( n > 0 \), zero or more (at most countable) \( n \)-arity symbols \( \symbolq{f_1^{n}}, \symbolq{f_2^{n}}, ... \)
5. Relation symbols
for every natural \( n > 0 \), zero or more (at most countable) \( n \)-arity symbols \( \symbolq{R_1^{n}}, \symbolq{R_2^{n}}, ... \)

This definition is a slight alteration of that given by Leary and Kristiansen. There is much to say about this definition in terms of its encoding (covered on the back side).

Modal theory doesn't pause to give a definition for the set of all possible strings formed from symbols of a model theory language. This would be the language in computation theory terms. Instead we go straight to carving out this space to form much smaller sets of strings: terms and formula.

Terms of a first-order language

Let \( \mathcal{L} \) be a first-order language. A term of \( \mathcal{L} \) is defined recursively as:

  1. a variable symbol or a constant symbol, or
  2. a sequence of symbols where the first symbol is a function symbol and the remaining symbols are terms. The number of terms must match the arity of the function.

To be percise, the above definition is an algorithm for checking if a symbol or sequence of symbols is a term. If instead we wished for a set definition, we could define it as the union of recursively defined sets.

Another recursive definition is employed to define the formulas of a first-order language.

Formulas of a first-order language

Let \( \mathcal{L} \) be a first-order language. A formula of \( \mathcal{L} \) is a sequence of symbols of \( \mathcal{L} \) defined recursively. Each formula must be one of the following:

  1. A sequence of symbols where the first symbol is a relation symbol, and the remaining symbols are terms. The number of terms must match the arity of the relation symbol.
  2. \( \lnot ( \alpha) \) for some formula \( \alpha \).
  3. \( ( \alpha \lor \beta ) \) for some formulas \( \alpha \) and \( \beta \).
  4. \( \forall v \; (\alpha) \) for some variable symbol \( v \) and some formula \( \alpha \).

Similar to the definition of terms, the definition for a language reads as an algorithm for checking if a sequence of symbols of \( \mathcal{L} \) is a formula.


It seems that the symbols of a language can be enumerated. Also, it seems that it can be decided if a symbol sequence is a term or not, and decided if a symbol sequence is a formula or not. I think this is true as if it were not the case, there would be valid formula that can't be decoded in finite time.


The definitions of languages above leaves out subtle details about what each symbol must communicate.

Encoding. General language.

In an information theory setting, a symbol could be thought of as being a unique binary sequence. If we wish a machine to read a sequence of such binary sequences, we must either make each symbol encoding prefix free, or we must give the machine an extra symbol to denote the separation. For a binary sequence \( x \), an example prefix free encoding \( \overline{x} \) is the sequence

\[ \overline{x} := 1^{l(x)}0x \]

where \( l(x) \in \mathbb{N} \) and \( 1^{l(x)} \) is \( l(x) \) consecutive '1' symbols.

Encoding. First-order language.

The first-order language inherits the uniquely decodable interpretation above. It seems possible to ignore the semantic labels such as "parentheses", "connectives" etcetera—they are present only to convey to the reader what type of algorithms might rely on these symbols. However, when attempting to do away with these labels, it becomes obvious that we cannot ignore the separation of the 4 collections of symbols. More must be said about them. Implicit in the definition is the condition that if given a symbol, it is possible to determine the collection from which a symbol is part of: whether the symbol is a variable, constant, function or relation. We don't need to worry about the other 5 symbols, as their finite-ness allows a computer to switch on their encoding. But, for the infinite collections, we are forced to encode some additional information along with the symbols. The situation is even more involved for the function and relational symbols. As will be seen later, it is expected that from a symbol we can determine its arity; without this information, we cannot check if a term is valid. One might argue that the variables too should be accompanied by their index.

An argument against the these claims that accompanying information is required is as follows. We could design an enumerator for the symbols of the language in such a way that the index and arity of terms can be deduced by the order in which the symbols are enumerated. This would allow a finite (but horribly slow) method to find this information from a given symbol by pattern matching against every output of the enumerator.

If we had neither accompanying information nor a smartly enumerated language, then it would be possible to encounter a symbol for which no finite computation could determine the type of the symbol or the arity, in the case of function symbols and relation symbols.


The single element set, containing one binary relation symbol, \( \{ \in \} \) is sufficient to form the first-order language of set theory. All other symbols can be build from this one.


TODO: graph branching into structure and deductions then merging again with soundness and completeness


A friendly Introduction to Mathematical Logic, Leary and Kristiansen