
Gaussian distribution

The Gaussian distribution with mean \( 0 \) and standard deviation \( \sigma \) has the form:

\[ \phi(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{x^2}{2\sigma^2}} \]
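
As a quick sanity check, this formula can be compared against SciPy's normal pdf; a minimal sketch, where \( \sigma = 1.7 \) is an arbitrary test value:

```python
import numpy as np
from scipy.stats import norm

sigma = 1.7  # arbitrary test value
x = np.linspace(-5, 5, 101)

# The density exactly as written above.
phi = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

# Should agree with scipy's normal pdf with mean 0 and the same sigma.
assert np.allclose(phi, norm.pdf(x, loc=0, scale=sigma))
```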


Imagine a distribution that describes errors in measurements of some unknown quantity. Let's say we make the following assumptions:

  1. Small errors are more likely than large errors.
  2. For any real \( v \), errors of magnitude \( v \) and \( -v \) are equally likely.
  3. In the presence of several measurements of some unknown quantity, the most likely value of the quantity being measured is their average.

On the basis of these assumptions, we can conclude:

  1. The distribution has a maximum at zero and decreases with increasing distance from zero.
  2. The distribution is symmetric (around zero).
  3. The maximum likelihood estimate (MLE) of the unknown quantity given some measurements is the mean of the measurements.

Laplace later dispensed with the third assumption by replacing it with a different one: each individual error is the aggregate of a large number of "elementary" errors.
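
Conclusion 3 can be illustrated numerically: for Gaussian errors, the value that maximizes the likelihood of the measurements is their average. A sketch with made-up measurements and an assumed precision:

```python
import numpy as np
from scipy.optimize import minimize_scalar

measurements = np.array([9.8, 10.1, 10.4, 9.9, 10.3])  # made-up data
sigma = 0.5  # assumed measurement precision

def neg_log_likelihood(mu):
    # Gaussian negative log-likelihood, up to additive constants
    # (constants don't depend on mu, so they don't affect the minimizer).
    return np.sum((measurements - mu) ** 2) / (2 * sigma**2)

result = minimize_scalar(neg_log_likelihood)

# The maximum likelihood estimate coincides with the sample mean.
assert np.isclose(result.x, measurements.mean())
```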

From these assumptions, it can be shown that the distribution obeys the following differential equation:

\[ \frac{\phi'(x)}{\phi(x)} = kx \quad \text{for some real } k \]
Integrating with respect to \( x \) produces:
\[ \ln(\phi(x)) = \frac{k}{2}x^2 + c \text{, so } \phi(x) = Ae^{\frac{k}{2}x^2} \]
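
This integration step can be verified symbolically; a sketch using SymPy, with the differential equation rewritten as \( \phi'(x) = kx\,\phi(x) \):

```python
import sympy as sp

x, k = sp.symbols('x k', real=True)
phi = sp.Function('phi')

# Solve phi'(x) = k*x*phi(x); the constant A appears as C1.
ode = sp.Eq(phi(x).diff(x), k * x * phi(x))
solution = sp.dsolve(ode, phi(x))
print(solution)  # Eq(phi(x), C1*exp(k*x**2/2))
```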

Assumption 1 requires that \( k \) be negative, so, absorbing the factor of \( \frac{1}{2} \), we set \( \frac{k}{2} = -h^2 \) for some real \( h \).

The probability density must integrate to 1. Knowing that:

\[ \int_{-\infty}^{\infty}e^{-h^2x^2}\,dx = \frac{\sqrt{\pi}}{h} \]

It follows that \( A = \frac{h}{\sqrt{\pi}} \), giving:

\[ \phi(x) = \frac{h}{\sqrt{\pi}}e^{-h^2x^2} \]
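
The normalization integral can be checked numerically; a sketch, with \( h = 0.8 \) chosen arbitrarily:

```python
import numpy as np
from scipy.integrate import quad

h = 0.8  # arbitrary test value

# Integrate e^{-h^2 x^2} over the whole real line.
value, _ = quad(lambda x: np.exp(-h**2 * x**2), -np.inf, np.inf)

# Should equal sqrt(pi)/h.
assert np.isclose(value, np.sqrt(np.pi) / h)
```
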
Gauss thought of \( h \) as the "precision of the measurement process". We recognize this as the bell curve determined by \( \mu = 0 \) and \( \sigma = \frac{1}{\sqrt{2}h} \):
\[ \phi(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{x^2}{2\sigma^2}} \]
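
This identification can also be checked numerically; again a sketch with an arbitrary \( h \):

```python
import numpy as np
from scipy.stats import norm

h = 0.8  # arbitrary test value
x = np.linspace(-5, 5, 101)

# Gauss's form: (h/sqrt(pi)) * exp(-h^2 x^2).
gauss_form = (h / np.sqrt(np.pi)) * np.exp(-h**2 * x**2)

# The bell curve with mu = 0 and sigma = 1/(sqrt(2)*h).
sigma = 1 / (np.sqrt(2) * h)

assert np.allclose(gauss_form, norm.pdf(x, loc=0, scale=sigma))
```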