Color science 1.1 The eye and color perception

Light hits the retina, which consists of rods (which detect intensity) and cones (which enable color perception). Cones are most concentrated at the fovea (near the back center). There are about 120 million rods, 6 million cones, and 1 million optic nerve fibers. Individual cones cannot distinguish light frequency. The photoreceptors lie underneath several layers of cells and are oriented towards the back of the eye (in vertebrates).

The iris is a colored annulus (def: a ring-shaped object, structure, or region) behind the cornea but before the lens. The iris contains radial muscles that allow it to change the size of its inner hole, the pupil. Only light passing through the pupil proceeds further into the eye.

Light passes through the pupil then strikes the transparent crystalline lens. The lens is surrounded by muscles called the ciliary body, which can pull at the sides of the lens. When the ciliary muscles are relaxed, the lens is stretched radially, flattening it and reducing its optical power; the light entering the eye is now brought to a focus as far from the lens as possible. The ciliary muscles tense to exert a compressive force on the lens: its diameter shrinks, the lens becomes thicker, the optical power increases, and the focal point moves closer to the lens. Thus when the muscles are relaxed, the lens has its longest focal length and when the muscles tense, the lens is focused on nearer objects.
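The relationship between optical power and focus distance described above can be sketched with the thin-lens equation, 1/f = 1/d_o + 1/d_i. This is only an illustration: it treats the whole eye as a single thin lens, and the 17 mm lens-to-retina distance is an assumed round figure, not a value from these notes. The function names are hypothetical.

```python
# Thin-lens sketch of accommodation. Assumptions (not from the notes):
# the eye is modelled as one thin lens, and the image (retina) distance
# is fixed at 17 mm.

RETINA_DISTANCE_M = 0.017  # assumed lens-to-retina distance


def required_power(object_distance_m, image_distance_m=RETINA_DISTANCE_M):
    """Optical power in diopters (1/m) needed to focus an object on the retina."""
    return 1.0 / object_distance_m + 1.0 / image_distance_m


def accommodation(object_distance_m):
    """Extra power needed compared with focusing on a very distant object."""
    return required_power(object_distance_m) - required_power(float("inf"))


if __name__ == "__main__":
    for distance_m in (float("inf"), 2.0, 0.25):
        power = required_power(distance_m)
        print(f"object at {distance_m} m: power {power:.2f} D, "
              f"focal length {1000.0 / power:.2f} mm")
```

Under these assumptions, nearer objects demand more optical power and a shorter focal length, which matches the tensed-muscle case in the text.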

Light focused by the lens falls on the retina, a thin layer of cells covering about 200° on the back of the eye. The retina contains two types of photosensitive cells: rods and cones. Cones are primarily responsible for color perception; rods respond only to intensity, though they are ten times more sensitive to light than cones. Rods are physically smaller than cones, so they can be packed more densely, giving greater spatial resolution.

There is a small region at the center of the visual axis known as the fovea, which subtends only 1 or 2° of visual angle. The structure of the retina is roughly radially symmetric around the fovea. The fovea contains only cones, and it is here that we find the densest collection of cones on the surface of the retina. Moving outward from the fovea, rods begin to appear among the cones, and at the edge of the fovea there are more rods than cones (see diagram below). Traveling further on a radial path from the fovea, the rods begin to form rings around each increasingly infrequent cone. The highest density of rods appears at about 20° from the fovea. In total, the human eye contains about 120 million rods and 6 million cones. Since the optic nerve contains only about 1 million fibers, the eye must perform a lot of processing before the visual signal ever reaches the brain.


When light is absorbed in a receptor in the retina, the molecules of its photosensitive pigment are excited, and, as a result, a change in electrical potential is produced. The chemical at the heart of this process has the generic name photopigment. The particular photopigment found in rods, rhodopsin, has been studied extensively. It has been found that rhodopsin reacts to light in a bell-shaped curve, centered at about 500 nm (blue-green). Cone photopigments have rarely been extracted from primates.

There are three types of cones in the human eye, typically called S, M, and L (named respectively for their peak response to relatively short, medium, and long wavelengths), with peaks located at roughly 420, 530, and 560 nm. The response curves for these cones (as well as the rods) are asymmetrical; the drop-off at the high-frequency side is sharper than at the low-frequency side. Thus the shorter wavelengths are more readily absorbed than the longer wavelengths for all three ranges. Both rods and cones may be considered the ultimate in visual sensitivity: a single photon carries enough energy to produce the chemical reactions that change the electrical potential at the cell's membrane, signaling the arrival of light at that cell.

The relative abundance of the three types of cone varies considerably from one observer to another, but there are always many fewer S cones than M and L cones; one estimate is that, on average, the L, M, and S cones occur in the ratios 40 to 20 to 1, respectively. This rather asymmetrical arrangement is, in fact, very understandable. Because the eye is not corrected for chromatic aberration, it cannot simultaneously focus sharply the three regions of the spectrum in which the L, M, and S cones are most sensitive, that is, wavelengths of around 580 nm, 540 nm, and 440 nm, respectively. The eye focuses light of wavelength about 560 nm, so both the ρ and γ (L and M) responses correspond to images that are reasonably sharp; the β (S) cones then have to receive an image that is much less sharp, and hence it is unnecessary to provide such a fine network of β cones to detect it.

Backwards photoreceptors

Surprisingly, the photosensors are not the innermost layer of cells on the inside of the retina. Rather, there are several layers of interconnecting cells on top of the photoreceptors, blocking the light from the lens. The overall density of these cells is quite low, so most of the incident light gets through. Even more surprising is the fact that the photoreceptors themselves are oriented so that they face the back of the eye rather than the pupil, so light must travel through the body of the photoreceptor before it reaches the photopigment that will trigger a response. These two pieces of physiology have suggested to some that the retina appears to have evolved "inside out" from the structure that we would probably think most efficient. This structure is common to all vertebrates (def: All vertebrates are built along the basic chordate body plan: a stiff rod running through the length of the animal (vertebral column and/or notochord), with a hollow tube of nervous tissue (the spinal cord) above it and the gastrointestinal tract below). In contrast, invertebrate eyes come from an invaginated bubble in the skin. The photoreceptors in invertebrates all face toward the lens, while in all vertebrates they face away from the lens and toward the brain. Spiders are unique in that they have both forms of eyes.

Principle of univariance

The signal carried by the change in membrane potential makes up the entire message sent by a photoreceptor to the rest of the visual system. Thus the only message sent by a rod or cone is that light has arrived and stimulated the photopigment; there is no information transmitted describing the wavelength of the photon. This effect is called the principle of univariance (The principle of univariance states that an individual receptor cell can be excited by different combinations of wavelength and intensity, and therefore cannot differentiate between a change in wavelength and a change in intensity). The likelihood of absorption of a photon by a particular cell is a function of the spectral sensitivity of the receptor and the intensity of the incoming light (e.g., if the receptor is 30% sensitive at some wavelength, any particular photon may not be absorbed, but about 30 of every 100 will). The time-averaged output of a photoreceptor is related to the number of photons received over some recent interval, but there is no way to determine the frequency distribution of these absorbed photons. It is only by combining the results of many photoreceptors with different spectral sensitivities that the visual system is able to reconstruct intensity and color descriptions of the incoming signal; this reconstruction is believed to happen at a very early stage in visual processing.

The principle of univariance may at first seem puzzling. Suppose that the eye contained many distinct color sensors with different, narrowly defined bands of absorption. Although they might be as close-packed as cones, the number of sensors for any particular frequency band in a fixed region would necessarily be fewer than if only three types of cones occupied the space, thereby sacrificing spatial color resolution. The human eye has evolved with a compromise of three sensors, which gives good color sensor density in the retina and a sufficient amount of color information to recompute the spectral information of the incident signal. Either the number of sensors or their density could be theoretically increased at the expense of the other.
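A small numerical sketch may make univariance concrete. It assumes a Gaussian sensitivity curve, which is a simplification (the real curves are asymmetrical), and it borrows the rough M and L peak wavelengths of 530 nm and 560 nm; the 40 nm width and all function names are hypothetical. A single receptor gives the same expected output for a dim light at its peak and a brighter light off its peak, while the ratio between two receptor types depends only on wavelength.

```python
import math


def sensitivity(wavelength_nm, peak_nm, width_nm=40.0):
    """Hypothetical absorption probability in [0, 1] (Gaussian stand-in)."""
    return math.exp(-0.5 * ((wavelength_nm - peak_nm) / width_nm) ** 2)


def expected_absorptions(wavelength_nm, photons, peak_nm):
    """Expected number of absorbed photons: the receptor's only output."""
    return photons * sensitivity(wavelength_nm, peak_nm)


def l_to_m_ratio(wavelength_nm, photons):
    """Compare an L-like (560 nm) and an M-like (530 nm) receptor."""
    l_out = expected_absorptions(wavelength_nm, photons, peak_nm=560.0)
    m_out = expected_absorptions(wavelength_nm, photons, peak_nm=530.0)
    return l_out / m_out


# One M-like receptor: 300 photons at its peak ...
at_peak = expected_absorptions(530.0, 300.0, peak_nm=530.0)
# ... versus a brighter light at 570 nm, scaled to compensate for the
# lower sensitivity there. The receptor cannot tell the two apart.
brighter = 300.0 / sensitivity(570.0, peak_nm=530.0)
off_peak = expected_absorptions(570.0, brighter, peak_nm=530.0)

if __name__ == "__main__":
    print(f"M output at 530 nm: {at_peak:.3f}")
    print(f"M output at 570 nm: {off_peak:.3f}")
    print(f"L/M ratio at 530 nm: {l_to_m_ratio(530.0, 300.0):.3f}")
    print(f"L/M ratio at 570 nm: {l_to_m_ratio(570.0, brighter):.3f}")
```

In this toy model the intensity cancels out of the ratio, which is why comparing receptors with different spectral sensitivities can recover wavelength information that no single receptor carries.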

Time-averaged response

The chemical processes that occur inside a photoreceptor last several milliseconds, and additional photons that strike the receptor during that time add to the overall response. Thus the output of a receptor is really a time-averaged response, an effect called temporal smoothing. In effect, the sensors impose a low-pass filter over their time response, though the cutoff frequency of that filter changes with respect to the background light level: when there is little light arriving, there is little smoothing. The effect of temporal smoothing leads to the way we perceive light that blinks, or flickers. When the blinking is slow, we perceive the individual flashes of light. Above a certain rate, called the critical flicker frequency (or CFF), the flashes fuse together into a single continuous image. Far below that rate we see simply a series of still images, without an objectionable sense of near-continuity. Under the best conditions, the CFF for a human is around 60 Hz [389]. In contrast, a bee has a CFF of about 300 Hz. We note that as with most other visual phenomena, the flicker rate is dependent on many factors.
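The low-pass behaviour can be illustrated with a first-order filter applied to a square-wave flicker. This is a sketch only: the 10 ms time constant is an assumed value, the filter is a generic exponential smoother rather than a model of real photoreceptor chemistry, and it makes no attempt to predict where perceptual fusion occurs. It simply shows that the residual ripple in the output shrinks as the flicker rate rises.

```python
def flicker_ripple(flicker_hz, tau_s=0.01, dt_s=0.0001, duration_s=1.0):
    """Peak-to-peak output ripple of a first-order low-pass filter
    driven by an on/off light flickering at flicker_hz.

    tau_s is a hypothetical receptor time constant (assumption).
    """
    alpha = dt_s / (tau_s + dt_s)  # smoothing factor per time step
    steps = round(duration_s / dt_s)
    output = 0.0
    lowest, highest = float("inf"), float("-inf")
    for i in range(steps):
        t = i * dt_s
        # Square wave: light on for the first half of each cycle.
        light = 1.0 if (t * flicker_hz) % 1.0 < 0.5 else 0.0
        output += alpha * (light - output)
        # Measure only the second half, after the start-up transient.
        if t >= duration_s / 2:
            lowest = min(lowest, output)
            highest = max(highest, output)
    return highest - lowest


if __name__ == "__main__":
    for hz in (5, 60, 300):
        print(f"{hz:>3} Hz flicker: ripple {flicker_ripple(hz):.3f}")
```

With these assumed parameters, slow flicker passes through almost unchanged, while fast flicker is averaged toward a steady level.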

Signal transmission

The electric potential generated by a photoreceptor travels through a series of relay cells, and eventually results in a series of voltage pulses being transmitted along a nerve fibre to the brain. The rates at which these pulses are produced provide the signal modulation: a higher rate indicates a stronger signal, and a lower rate a weaker signal. Zero signal may be indicated by a resting rate, and rates lower than this can then indicate an opposite signal. The pulses themselves are all of the same amplitude, and it is only their frequency that carries information to the brain. The frequencies involved are typically from a few per second to around 400 per second. It might be thought that, as there are four different types of receptor, the rods and the three different types of cone, there would be four different types of signal, transmitted along four different types of nerve fibre, each indicating the strength of the response from one of the four receptor types. However, there is overwhelming evidence that this is not what happens. While much still remains unknown about the way in which the signals are encoded for transmission, the simple scheme shown below can be regarded as a plausible model. The strengths of the signals from the cones are represented by the symbols ρ, γ, and β (L, M, S).
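The rate coding described above, with a resting rate standing for zero and lower rates standing for an opposite signal, can be sketched as a simple mapping. The linear shape, the 50 pulses-per-second resting rate, and the 2 pulses-per-second floor are all assumptions made for illustration; only the ceiling of about 400 per second comes from the text.

```python
RESTING_HZ = 50.0  # hypothetical resting rate (assumption)
MIN_HZ = 2.0       # "a few per second" (assumed value)
MAX_HZ = 400.0     # "around 400 per second" (from the text)


def pulse_rate(signal):
    """Map a signed signal in [-1, 1] to a pulse rate in pulses per second.

    0 maps to the resting rate; positive signals raise the rate and
    negative ("opposite") signals lower it. Linear mapping is assumed.
    """
    signal = max(-1.0, min(1.0, signal))  # clamp to the valid range
    if signal >= 0.0:
        return RESTING_HZ + signal * (MAX_HZ - RESTING_HZ)
    return RESTING_HZ + signal * (RESTING_HZ - MIN_HZ)


def decode(rate_hz):
    """Invert pulse_rate: recover the signed signal from a pulse rate."""
    if rate_hz >= RESTING_HZ:
        return (rate_hz - RESTING_HZ) / (MAX_HZ - RESTING_HZ)
    return (rate_hz - RESTING_HZ) / (RESTING_HZ - MIN_HZ)


if __name__ == "__main__":
    for s in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(f"signal {s:+.1f} -> {pulse_rate(s):.1f} pulses/s")
```

Because the pulses all have the same amplitude, a single fibre can carry a signed quantity this way without needing separate fibres for positive and negative values.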

Color constancy
The human visual system can compensate for different illumination conditions (e.g. sunlight vs tungsten bulb) in order to perceive the same colors in many different illumination environments. For the moment, we will assume that this means that the relative responses of the L, M and S cones are equal for white, grey and black objects regardless of the illumination. For example, humans can perceive grey objects as being grey and white objects as being white whether they are placed in bright sunlight or in the shade. The human visual system is somehow able to factor out the illumination from the properties of the object.
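One common way to model this kind of compensation is von Kries-style scaling, in which each cone channel is divided by its response to a white surface under the same light. The notes do not name this technique, so it is brought in here purely as an illustration. The sketch also collapses each spectrum to three numbers (one per cone type), and the tungsten values are made up to represent a reddish light.

```python
def cone_responses(reflectance, illuminant):
    """Per-channel (L, M, S) response: surface reflectance times light strength.

    A three-number simplification of what is really a spectral integral.
    """
    return tuple(r * e for r, e in zip(reflectance, illuminant))


def adapt(responses, white_responses):
    """Von Kries-style scaling: divide each channel by the white response."""
    return tuple(r / w for r, w in zip(responses, white_responses))


# Hypothetical illuminants as relative (L, M, S) strengths.
sunlight = (1.0, 1.0, 1.0)
tungsten = (1.3, 1.0, 0.5)  # made-up reddish light

white_surface = (1.0, 1.0, 1.0)
grey_surface = (0.5, 0.5, 0.5)

if __name__ == "__main__":
    for name, light in (("sunlight", sunlight), ("tungsten", tungsten)):
        raw = cone_responses(grey_surface, light)
        adapted = adapt(raw, cone_responses(white_surface, light))
        print(f"{name}: raw {raw}, adapted {adapted}")
```

Under tungsten the raw responses to the grey surface are unequal, but after scaling the three channels come out equal again, matching the working assumption in the text.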

More: Spectral sensitivity of rods and cones.
Signal transmission (strobe frequency limit, saturation, integration).