How software gets color wrong

Most software around us today are decent at accurately displaying colors. Processing of colors is another story unfortunately, and is often done badly.

To understand what the problem is, let’s start with an example of three ways of blending green and magenta:

Perceptual blend – A smooth transition using a model designed to mimic human perception of color. The blending is done so that the perceived brightness and color varies smoothly and evenly.Linear blend – A model for blending color based on how light behaves physically. This type of blending can occur in many ways naturally, for example when colors are blended together by focus blur in a camera or when viewing a pattern of two colors at a distance.sRGB blend – This is how colors would normally be blended in computer software, using sRGB to represent the colors. The example only reads correctly if you have a reasonably calibrated display and trichromatic color vision.

As you can see, the sRGB example looks quite different. As the colors blend they shift to be much darker and bluer. Why is the standard way for blending colors different from both human perception and the physical behavior of light?

As it turns out, the reason is that the color representation used in most software is based on how CRT displays in the 1990s worked. How color processing works is just a consequence of that representation. Because of this, working with color in software is harder than it needs to be and there are many unnecessary pitfalls working with image processing.

The rest of this post will cover the basics of how color perception works, how it is modelled in software and how almost all software, doing any sort of color manipulation, gets color wrong.

In a series of follow up posts I will look at better ways of doing processing with color.

Skip this section if you are familiar with things like cone responses and the CIE 1931 XYZ color space

Color is a peculiar thing. Everyday and all around us we perceive color. It wouldn’t be odd to expect color to be simple to explain and that relating it to the world around us would be easy. Instead color perception is a complicated process that takes a spectrum of light through complex and non-linear transformations. Thankfully, centuries of research has produced a decent understanding of this process and models for predicting perception exist.

One key result is that our eyes don’t measure the full spectrum of light. Photoreceptor cells in the eye react to certain ranges of wavelengths, roughly corresponding to what we perceive as colors red, green and blue. The cells reacting to color are called cones. The different types of cones are referred to based on what wavelengths they are sensitive to: long, medium and short or LMS – long corresponds to red, medium to green and short to blue.

Light sensitivity for L, M and S cones for different wavelengths

A consequence of this is that many different light spectrums can result in the same perceived color, as long as the LMS response is the same. This is the basis for many everyday things: e.g. color displays produce a wide range of colors by combining red, green and blue light. Print works similarly by using cyan, magenta and yellow ink and color paint is made by mixing various pigments to achieve the desired color.

This also makes it possible to describe colors mathematically, by describing it as a mix of three primary colors. A standard exists for this called the CIE 1931 XYZ color space, which represents colors as three values: X, Y and Z

Another important result about human perception is that the mapping between perceptual qualities such as hue, brightness, colorfulness, etc. is not a fundamental property of light. The only way to numerically model them is by fitting non-linear models empirically to match human perception or by simulating human biology.

To describe colors in software, conventions for describing colors with numerical values is needed. A specific way of mapping colors to numbers is called color space. Normally color spaces use three numbers to describe colors. RGB color spaces are the most common, with the three values representing red, green and blue light.

To describe a typical RGB color spaces a few things are needed: a set of red, green and blue primary colors and a non linear transfer function (sometimes referred to as gamma correction function). To represent a color in a color space two steps are performed:

The light intensities for each of the three primary colors are calculated so that the perceived color matches the desired color. If the desired color is already represented as intensities of another set primary colors (such as the standard CIE 1931 XYZ color space), this transformation can be done with a 3x3 matrix multiplication:The calculated intensity for each primary is normalized and transformed using the non linear transfer function.\begin{pmatrix} R \\ G \\ B \end{pmatrix} = \begin{pmatrix} f(R_{linear}) \\ f(G_{linear}) \\ f(B_{linear}) \end{pmatrix}=A common choice of the non linear transfer function is a power function: f(x)=x, with γ=1/2.2.

Examples of color spaces structured like this are: sRGB, Adobe RGB, P3, Rec2020 (although they don’t all use a power function).