What does the cumulative probability distribution function measure?
What three properties does the cumulative probability distribution function have?
How can you obtain an estimate of the cumulative distribution from a data set?
The cumulative probability distribution function of a random variable is a function $P(X\le x)$ which tells you the probability that the random variable $X$ is less than or equal to $x$. This function has the following three properties:
$$ \begin{aligned} \lim_{x\rightarrow -\infty} P(X\le x) & = 0 \\ \lim_{x\rightarrow \infty} P(X\le x) & = 1 \\ \lim_{\epsilon \rightarrow 0^+} P(X\le x + \epsilon) & = P(X\le x) \end{aligned} $$
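As a rough numerical illustration of these three properties, the sketch below uses the standard normal distribution as an assumed example (the text does not single out any particular distribution). The helper name `normal_cdf` is hypothetical; it relies on the identity $P(X\le x) = \frac{1}{2}\left[1 + \textrm{erf}(x/\sqrt{2})\right]$ for a standard normal random variable.

```python
import math

def normal_cdf(x):
    """CDF of a standard normal random variable, P(X <= x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Property 1: the CDF tends to 0 far to the left.
print(normal_cdf(-10.0))

# Property 2: the CDF tends to 1 far to the right.
print(normal_cdf(10.0))

# Property 3: approaching x from the right recovers P(X <= x),
# so the difference below shrinks as eps shrinks.
x = 0.3
for eps in (1e-1, 1e-3, 1e-6):
    print(eps, normal_cdf(x + eps) - normal_cdf(x))
```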
One formal definition of this function is as follows:
Suppose that each event, $s$, in the sample space $\Omega$ has a value, $x(s)$, taken from the set of real numbers, $\mathbb{R}$. Then there exists a cumulative probability distribution function, $P(X\le x)$, that maps each subset of $\Omega$ that can be formed using $A(x) = \{s: (s\in\Omega)\wedge (x(s)\le x)\}$ to a real number between 0 and 1. This function has the three properties that were given above.
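To make this definition concrete, here is a small sketch for a finite sample space. It assumes a fair six-sided die, so every element of the sample space is equally likely; the names `omega`, `value`, `A` and `cdf` are illustrative and do not come from the text.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (an assumed example).
omega = [1, 2, 3, 4, 5, 6]

def value(s):
    """The real number x(s) attached to the element s of the sample space."""
    return float(s)

def A(x):
    """The subset of omega whose values are less than or equal to x."""
    return {s for s in omega if value(s) <= x}

def cdf(x):
    """P(X <= x): with equally likely elements, the fraction of omega in A(x)."""
    return Fraction(len(A(x)), len(omega))

for x in (0.5, 1, 3.7, 6, 10):
    print(x, sorted(A(x)), cdf(x))
```

The printed probabilities start at 0 below the smallest value, never decrease, and reach 1 once $x$ is at least 6.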
This definition is explained in the following video:
https://www.youtube.com/embed/qbbTEZ4NlCI
If you have repeated results from an experiment, you can estimate the cumulative probability distribution function that was sampled in the experiment by sorting the data into ascending order, as is discussed in this video:
https://www.youtube.com/embed/VaZTKmcxLvY
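The sorting idea can be sketched in a few lines of Python. The estimate used here, the fraction of data points that are less than or equal to $x$, is the usual empirical distribution function; the video may use a slightly different convention. The function names and the sample values are made up for illustration.

```python
from bisect import bisect_right

def empirical_cdf(data):
    """Return a function that estimates P(X <= x) from the samples in data."""
    ordered = sorted(data)   # sort the data into ascending order
    n = len(ordered)

    def estimate(x):
        # bisect_right counts how many of the sorted samples are <= x
        return bisect_right(ordered, x) / n

    return estimate

# Made-up measurements, for illustration only
samples = [2.3, 0.7, 1.9, 3.1, 1.2, 2.8]
F = empirical_cdf(samples)

# With no repeated values, the k-th smallest sample gets the estimate k/n
for k, x in enumerate(sorted(samples), start=1):
    print(k, x, F(x))
```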
The following video explains how you can use Python to plot the cumulative probability distribution using this idea.
https://www.youtube.com/embed/fQ0Iy0Sew_U
Finally, notice that in Python you can calculate the $p$th percentile of the data in a NumPy array `dataset` using:

```python
import numpy as np

# dataset is a NumPy array of samples; p is a percentage between 0 and 100
percentile = np.percentile(dataset, p)
```
By default, this function uses linear interpolation, as is described in the video below:
https://www.youtube.com/embed/UUbkt9nA3Mc
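A minimal sketch of the interpolation, assuming the default `linear` method of `np.percentile`: the percentile sits at the fractional position $(n-1)p/100$ in the sorted data, and the result is interpolated between the two neighbouring values. The function `percentile_linear` is a hypothetical pure-Python helper written for illustration; it is not NumPy's implementation.

```python
import math

def percentile_linear(data, p):
    """p-th percentile (0 <= p <= 100) by linear interpolation on sorted data."""
    ordered = sorted(data)
    position = (len(ordered) - 1) * p / 100.0   # fractional index into ordered
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower                   # how far position is towards upper
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])

# Made-up data, for illustration only
dataset = [10.0, 20.0, 30.0, 40.0]
print(percentile_linear(dataset, 50))   # position 1.5: halfway between 20 and 30
print(percentile_linear(dataset, 25))   # position 0.75: three quarters of the way from 10 to 20
```

For the same inputs, `np.percentile(dataset, p)` with its default settings should return the same values.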
<aside> 📌 SUMMARY: The cumulative probability distribution function $P(X\le x)$ for a random variable $X$ gives the probability that the random variable is less than or equal to $x$. You can obtain an estimate of this function from a data set by sorting the data into ascending order.
</aside>