# Languages, terms and formulas

This page defines languages, first order languages, terms, formulas and sentences as used in modal theory.

### Language

A language is an infinite collection of distinct symbols.

A *language* in model theory is similar to an alphabet in
computation theory. They are both collections of symbols; however, model theory
permits a countably infinite set of symbols.

The reverse side discusses the encoding of a language.

### First-order language

A first-order language is a language with the following symbols:

- 1 a. Parentheses
- \( \symbolq{(}, \symbolq{)} \)
- 1 b. Connectives
- \( \symbolq{\land}, \symbolq{\lnot} \)
- 1 c. Quantifier
- \( \symbolq{\forall} \)
- 2. Variables
- countable symbols \( \symbolq{v_1}, \symbolq{v_2}, \symbolq{v_3}, ... \)
- 3. Constant symbols
- zero or more (at most countable) symbols \( \symbolq{c_1}, \symbolq{c_2}, ...\)
- 4. Function symbols
- for every natural \( n > 0 \), zero or more (at most countable) \( n \)-arity symbols \( \symbolq{f_1^{n}}, \symbolq{f_2^{n}}, ... \)
- 5. Relation symbols
- for every natural \( n > 0 \), zero or more (at most countable) \( n \)-arity symbols \( \symbolq{R_1^{n}}, \symbolq{R_2^{n}}, ... \)

This definition is a slight alteration of that given by Leary and Kristiansen. There is much to say about this definition in terms of its encoding (covered on the back side).

Modal theory doesn't pause to give a definition for the set of all possible
strings formed from symbols of a model theory language. This would
be the *language* in computation theory terms. Instead we go straight to
carving out this space to form much smaller sets of strings: *terms*
and *formula*.

### Terms of a first-order language

Let \( \mathcal{L} \) be a first-order language. A *term* of
\( \mathcal{L} \) is defined recursively as:

- a variable symbol or a constant symbol, or
- a sequence of symbols where the first symbol is a function symbol and the remaining symbols are terms. The number of terms must match the arity of the function.

To be percise, the above definition is an algorithm for checking if a symbol or sequence of symbols is a term. If instead we wished for a set definition, we could define it as the union of recursively defined sets.

Another recursive definition is employed to define the *formulas* of a first-order language.

### Formulas of a first-order language

Let \( \mathcal{L} \) be a first-order language. A *formula* of \(
\mathcal{L} \) is a sequence of symbols of \( \mathcal{L} \) defined
recursively. Each formula must be one of the following:

- A sequence of symbols where the first symbol is a relation symbol, and the remaining symbols are terms. The number of terms must match the arity of the relation symbol.
- \( \lnot ( \alpha) \) for some formula \( \alpha \).
- \( ( \alpha \lor \beta ) \) for some formulas \( \alpha \) and \( \beta \).
- \( \forall v \; (\alpha) \) for some variable symbol \( v \) and some formula \( \alpha \).

Similar to the definition of terms, the definition for a language reads as an algorithm for checking if a sequence of symbols of \( \mathcal{L} \) is a formula.

### Decidability

It seems that the symbols of a language can be enumerated. Also, it seems that it can be decided if a symbol sequence is a term or not, and decided if a symbol sequence is a formula or not. I think this is true as if it were not the case, there would be valid formula that can't be decoded in finite time.

### Encoding

The definitions of languages above leaves out subtle details about what each symbol must communicate.

#### Encoding. General language.

In an information theory setting, a symbol could be thought of as being a unique binary sequence. If we wish a machine to read a sequence of such binary sequences, we must either make each symbol encoding prefix free, or we must give the machine an extra symbol to denote the separation. For a binary sequence \( x \), an example prefix free encoding \( \overline{x} \) is the sequence

where \( l(x) \in \mathbb{N} \) and \( 1^{l(x)} \) is \( l(x) \) consecutive '1' symbols.

#### Encoding. First-order language.

The first-order language inherits the uniquely decodable interpretation
above. It seems possible to ignore the semantic labels such as "parentheses",
"connectives" etceteraâ€”they are present only to convey to the reader what type
of algorithms might rely on these symbols. However, when attempting to do away
with these labels, it becomes obvious that we cannot ignore the separation of
the 4 *collections* of symbols. More must be said about them. Implicit
in the definition is the condition that if given a symbol, it is possible
to determine the collection from which a symbol is part of: whether the symbol is a variable, constant, function or relation. We don't need
to worry about the other 5 symbols, as their finite-ness allows a computer to
switch on their encoding. But, for the infinite collections, we are forced to
encode some additional information along with the symbols. The situation is
even more involved for the function and relational symbols. As will be seen
later, it is expected that from a symbol we can determine its arity; without
this information, we cannot check if a term is valid. One might argue that
the variables too should be accompanied by their index.

An argument against the these claims that accompanying information is required is as follows. We could design an enumerator for the symbols of the language in such a way that the index and arity of terms can be deduced by the order in which the symbols are enumerated. This would allow a finite (but horribly slow) method to find this information from a given symbol by pattern matching against every output of the enumerator.

If we had neither accompanying information nor a smartly enumerated language, then it would be possible to encounter a symbol for which no finite computation could determine the type of the symbol or the arity, in the case of function symbols and relation symbols.

### Example

The single element set, containing one binary relation symbol, \( \{ \in \} \) is sufficient to form the first-order language of set theory. All other symbols can be build from this one.