Math and science::INF ML AI

# Gaussian distribution

The Gaussian distribution with mean 0 and standard deviation \( \sigma \) has the form:

\[ \phi(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{x^2}{2\sigma^2}} \]
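As a quick numerical sanity check (a minimal sketch; the value `sigma = 1.5` is an arbitrary illustrative choice), the formula can be coded directly and its basic shape verified:

```python
import math

def phi(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-x**2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

# The density is symmetric about 0 and decreases away from its peak at x = 0.
sigma = 1.5
assert phi(1.0, sigma) == phi(-1.0, sigma)
assert phi(0.0, sigma) > phi(0.5, sigma) > phi(1.0, sigma)
```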

Imagine a distribution that describes errors in measurements of some unknown quantity. Let's say we make the following assumptions:

- Small errors are more likely than large errors.
- For any real \( v \), the likelihoods of errors of magnitude \( v \) and \( -v \) are equal.
- In the presence of several measurements of some unknown quantity, the most likely value of the quantity being measured is their average.

On the basis of these assumptions, we can conclude:

- The distribution has a maximum at zero and decreases with increasing distance from zero.
- The distribution is symmetric (around zero).
- The maximum likelihood estimate (MLE) of the unknown quantity given some measurements is the mean of the measurements.

We can replace the third assumption with a different one: assume that each individual error is the aggregate of a large number of "elementary" errors.
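The "elementary errors" assumption can be illustrated numerically: summing many small independent errors produces an approximately Gaussian aggregate (this sketch uses arbitrary uniform elementary errors and sample sizes):

```python
import random
import statistics

random.seed(0)

def observed_error(n_elementary=100):
    """One observed error, modeled as the sum of many small independent errors."""
    return sum(random.uniform(-0.1, 0.1) for _ in range(n_elementary))

errors = [observed_error() for _ in range(10000)]
mean = statistics.fmean(errors)
std = statistics.stdev(errors)

# For a Gaussian, roughly 68% of samples fall within one standard
# deviation of the mean.
frac = sum(abs(e - mean) < std for e in errors) / len(errors)
assert 0.65 < frac < 0.71
```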

From these assumptions, it can be shown that the distribution obeys the following differential equation:

\[ \frac{\phi'(x)}{\phi(x)} = kx \quad \text{for some real } k \]

Integrating with respect to \( x \) produces:

\[ \ln(\phi(x)) = \frac{k}{2}x^2 + c \text{, so } \phi(x) = Ae^{\frac{k}{2}x^2} \]
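As a numerical sanity check (a minimal sketch; the values of \( A \) and \( k \) are arbitrary illustrative choices), we can confirm that this solution satisfies the differential equation \( \phi'(x)/\phi(x) = kx \) using a finite-difference derivative:

```python
import math

# Candidate solution: phi(x) = A * exp(k/2 * x^2), with illustrative constants.
A, k = 0.7, -2.0

def phi(x):
    return A * math.exp(k / 2 * x**2)

def dphi(x, eps=1e-6):
    """Central finite-difference approximation of phi'(x)."""
    return (phi(x + eps) - phi(x - eps)) / (2 * eps)

# phi'(x) / phi(x) should equal k * x at every point.
for x in (0.5, 1.0, 2.0):
    assert abs(dphi(x) / phi(x) - k * x) < 1e-4
```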

Assumption 1 imposes that \( k \) must be negative, so we absorb the factor of 2 and set \( \frac{k}{2} = -h^2 \) for some real \( h \).

The probability distribution must integrate to 1. Knowing that:

\[ \int_{-\infty}^{\infty}e^{-h^2x^2}dx = \frac{\sqrt{\pi}}{h} \]
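This closed form can be checked numerically (a minimal sketch; `h = 1.3` is an arbitrary illustrative choice, and the integration range is truncated where the integrand is negligible):

```python
import math

h = 1.3
dx = 0.001

# e^{-h^2 x^2} is negligible far from 0, so truncating at |x| = 10 suffices.
xs = [i * dx for i in range(-10000, 10001)]
approx = sum(math.exp(-(h * x)**2) for x in xs) * dx

assert abs(approx - math.sqrt(math.pi) / h) < 1e-6
```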

It follows that:

\[ \phi(x) = \frac{h}{\sqrt{\pi}}e^{-h^2x^2} \]

Gauss thought of \( h \) as the "precision of the measurement process". We recognize this as the bell curve determined by \( \mu = 0 \) and \( \sigma = \frac{1}{\sqrt{2}h} \):

\[ \phi(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{x^2}{2\sigma^2}} \]
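The two parameterizations should agree pointwise; a quick check (a minimal sketch; `h = 2.0` is an arbitrary illustrative choice) confirms the substitution \( \sigma = \frac{1}{\sqrt{2}h} \):

```python
import math

h = 2.0
sigma = 1 / (math.sqrt(2) * h)

def phi_h(x):
    """Gauss's form, parameterized by the precision h."""
    return h / math.sqrt(math.pi) * math.exp(-(h * x)**2)

def phi_sigma(x):
    """Standard form, parameterized by the standard deviation sigma."""
    return math.exp(-x**2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

# The two densities coincide at every point.
for x in (-1.0, 0.0, 0.3, 2.5):
    assert abs(phi_h(x) - phi_sigma(x)) < 1e-12
```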