# Symbol codes

A * binary symbol code * \( C \) for ~~an ensemble~~ a probability space \( (\Omega, \mathcal{A}_x, \mathcal{P}_x) \) is a mapping from the range of events \( \mathcal{A}_x = \{a_1, ..., a_n\} \), to [...]. Let \( x \in \mathcal{A}_x \). \( c(x) \) denotes the codeword corresponding to \( x \) and \( l(x) \) will denote its length, with shorthand \( l_i = l(a_i) \).

The *extended code*, denoted \( C^+ \), is a mapping from [...] to [...] obtained by concatenation:

A code \( C(X) \) is said to be [...] if, under the extended code \( C^+ \), no two distinct strings have the same encoding. So:

or equivalently:

A symbol code is a * prefix code * if [...]. A prefix code is also known as a self-punctuating or instantaneous code, as an encoded string can be decoded from left to right without having to look ahead to subsequent codewords. A prefix code is uniquely decodable.

The expected length \( L(C, X) \) of a symbol code \( C \) for ensemble \( X \) is:

Alternatively written as:

Where \( I = |\mathcal{A}_x| \)