Date: January 24, 2021

Topic: Weighted averages

Recall

How is a weighted average computed?

What is the difference between reliability and frequency weights?

What is the expectation of the weighted mean equal to?

How can you calculate the variance of the weighted mean from the reliability weights and the variance of the random variable that was sampled?

Give an approximate expression for the expectation of the weighted average computed with frequency weights.

Give an expression that can be used to calculate the variance of the weighted average.

The weighted average

Instead of calculating a sample mean from a set of identically distributed random variables using the usual formula, we can calculate a weighted average as follows:

$$ \overline{X}_w = \frac{\sum_{i=1}^N w_i X_i}{\sum_{i=1}^N w_i} $$
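As a minimal sketch, the formula above can be evaluated directly in Python. The function name `weighted_average` and the sample data are illustrative assumptions, not part of the original notes.

```python
def weighted_average(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical data: three observations, the last one weighted most heavily.
x = [1.0, 2.0, 4.0]
w = [1.0, 1.0, 2.0]
print(weighted_average(x, w))  # (1*1 + 1*2 + 2*4) / 4 = 2.75
```

When all the weights are equal, this reduces to the ordinary sample mean.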

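As a weighted average is a statistic computed from random variables, it is itself a random variable. Importantly, however, there are two ways we might have generated the weights in this expression: the weights may be parameters (reliability weights) or they may themselves be random variables (frequency weights).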

When you are reweighting an enhanced sampling calculation, the weights that appear in any weighted averages are frequency weights.

Reliability weights

If the weights are reliability weights then it is straightforward to show that the expectation of the weighted mean is:

$$ \mathbb{E}(\overline{X}_w) = \mathbb{E}(X) $$

This result can be proved by exploiting the linearity of the expectation operator. You also need to remember that all the random variables from which the weighted average was computed are identically distributed. Notice, furthermore, that the expression above holds even if these random variables are not independent.
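A sketch of this argument, assuming the weights $w_i$ are fixed parameters and that every $X_i$ has the same expectation $\mathbb{E}(X)$, runs as follows:

$$ \mathbb{E}(\overline{X}_w) = \mathbb{E}\left( \frac{\sum_{i=1}^N w_i X_i}{\sum_{i=1}^N w_i} \right) = \frac{\sum_{i=1}^N w_i \mathbb{E}(X_i)}{\sum_{i=1}^N w_i} = \frac{\mathbb{E}(X) \sum_{i=1}^N w_i}{\sum_{i=1}^N w_i} = \mathbb{E}(X) $$

The second equality uses linearity only, which is why no independence assumption is needed.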

It is similarly straightforward to show that if a weighted average is computed from random variables that are independent and identically distributed, then the variance of the resulting quantity is given by:

$$ \textrm{var}(\overline{X}_w) = \frac{\textrm{var}(X)\sum_{i=1}^N w_i^2}{\left( \sum_{i=1}^N w_i \right)^2} $$

Proofs for these two results are provided in the following video:

https://www.youtube.com/watch?v=xmR4TeV14ds
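The two reliability-weight results can also be checked numerically. The sketch below is not taken from the video. It assumes each $X_i$ is an independent fair coin taking the values 0 or 1, so that $\mathbb{E}(X) = 0.5$ and $\textrm{var}(X) = 0.25$, and it uses an arbitrary set of fixed weights. It enumerates every possible outcome, computes the exact expectation and variance of the weighted average, and compares the variance with $\textrm{var}(X)\sum_i w_i^2 / \left(\sum_i w_i\right)^2$.

```python
from itertools import product


def weighted_average(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


# Assumed setup: independent fair coins (0 or 1) and arbitrary fixed weights.
weights = [1.0, 2.0, 5.0]
outcomes = list(product([0.0, 1.0], repeat=len(weights)))
prob = 1.0 / len(outcomes)  # all 2^N outcomes are equally likely

# Exact expectation and variance of the weighted average by enumeration.
mean_w = sum(prob * weighted_average(xs, weights) for xs in outcomes)
var_w = sum(prob * (weighted_average(xs, weights) - mean_w) ** 2 for xs in outcomes)

# Value predicted by var(X) * sum(w_i^2) / (sum(w_i))^2 with var(X) = 0.25.
var_x = 0.25
predicted = var_x * sum(w * w for w in weights) / sum(weights) ** 2

print(mean_w, var_w, predicted)
```

Enumeration is used here rather than random sampling so that the comparison is exact rather than subject to sampling noise.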

Frequency weights

(not assessed in SOR1020)

If the weights in the weighted average are also random variables, then one can use Taylor's theorem to approximate the expectation of the weighted average as:

$$ \mathbb{E}(\overline{X}_w) \approx \mathbb{E}(X) $$

Once again all the (random) weights (the $w_i$) in the weighted average and all the random variables (the $X_i$) must be identically distributed to use the above expression. The expression above also holds when the weighted average is computed from random weights/random variables that are not independent.

If each pair of $w_i$ values and each pair of $X_i$ values are independent then the variance of the weighted average is given (approximately) by:

$$ \textrm{var}(\overline{X}_w) \approx \frac{\sum_{i=1}^N w_i^2 ( X_i - \overline{X}_w)^2 }{\left( \sum_{i=1}^N w_i\right)^2} $$

The two expressions above are derived in the following video:

https://www.youtube.com/watch?v=VPZaq_LMbKc

Notice, furthermore, that when using the above expression for the variance you are not required to assume that the weight $w_i$ and the random variable $X_i$ are independent of each other. To be clear, however, every pair of $w_i$ values and every pair of $X_i$ values in the sum must be both identically distributed and independent.

Lastly, notice that the above variance for the weighted mean cannot be calculated from the unweighted or weighted variance of the random variable. In this regard the variance of the weighted mean behaves markedly differently from the variance of the sample mean.
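The frequency-weight variance expression above can be sketched in Python as follows. The function names and the sample data are illustrative assumptions. The sums run over the observed weights and values directly, which reflects the point that this variance is not obtained from a separate estimate of $\textrm{var}(X)$.

```python
def weighted_average(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def weighted_average_variance(values, weights):
    """Approximate variance of the weighted average for random weights:
    sum(w_i^2 * (x_i - xbar_w)^2) / (sum(w_i))^2."""
    xbar_w = weighted_average(values, weights)
    numerator = sum(w * w * (x - xbar_w) ** 2 for w, x in zip(weights, values))
    return numerator / sum(weights) ** 2


# Hypothetical sampled values and their (random) weights.
x = [1.0, 2.0, 4.0]
w = [1.0, 1.0, 2.0]
print(weighted_average(x, w), weighted_average_variance(x, w))
```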

<aside> 📌 SUMMARY: You can compute the expectation and variance of a weighted average using the formulas on the page above. The particular expressions you should use for these quantities depend on whether the weights are random variables or parameters.

</aside>