Everything here was covered in risk-ratios-workbook.ipynb, and the answers are in risk-ratios.ipynb. Both can be found in this repo: https://github.com/jstray/risk-ratios
I recommend going through it if you are interested, but I will copy the most helpful pieces below.
A risk ratio, also called a relative risk, is the ratio of two probabilities. Each of these probabilities represents something happening to one of two groups.
Given two groups and two outcomes, we can place these four numbers in a table like this:
| Group | Positive | Negative |
|---|---|---|
| Treated | a | b |
| Untreated | c | d |
This is called a cross table or contingency table. The two outcomes are usually called positive and negative, even though the "positive" outcome might actually be bad, like a heart attack. The two groups are called various names, such as the untreated and treated for studies of drugs or other interventions, or sometimes unexposed and exposed when studying the effects of some risk factor.
The risk for the treated group is a/(a+b), and the risk for the untreated group is c/(c+d). The risk ratio (or relative risk) is the first divided by the second: (a/(a+b)) / (c/(c+d)).
The risk difference is the first minus the second: (a/(a+b)) - (c/(c+d)).
There is another quantity called the odds ratio, which is a ratio of odds instead of a ratio of probabilities: (a/b) / (c/d).
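As a minimal sketch of these three definitions, the Python below computes them from the four cells of the table. The function names and the counts (25 of 1,000 treated people and 50 of 1,000 untreated people having a heart attack) are made up for illustration, chosen to match the 2.5% versus 5% heart attack example used further down. They are not taken from the notebook.

```python
def risk(positive, negative):
    """Probability of the positive outcome within one group."""
    return positive / (positive + negative)

def risk_ratio(a, b, c, d):
    """Risk in the treated group divided by risk in the untreated group."""
    return risk(a, b) / risk(c, d)

def risk_difference(a, b, c, d):
    """Risk in the treated group minus risk in the untreated group."""
    return risk(a, b) - risk(c, d)

def odds_ratio(a, b, c, d):
    """Odds in the treated group divided by odds in the untreated group."""
    return (a / b) / (c / d)

# Hypothetical counts: treated row (a, b), untreated row (c, d).
a, b = 25, 975
c, d = 50, 950

print(risk_ratio(a, b, c, d))            # 0.5
print(risk_difference(a, b, c, d))       # -0.025
print(round(odds_ratio(a, b, c, d), 3))  # 0.487
```

Note that the odds ratio is close to the risk ratio here, but not equal to it. The two only roughly agree when the positive outcome is rare in both groups.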
How should we write about this? Relative risk is often written as "times as likely," so we could say "People who take this medicine are 0.5 times as likely to have a heart attack."
In this case, we could also go with a nice clean "half as likely," but times is the general wording. Consider this sentence reporting a risk ratio from a 2015 ProPublica story: "Young black males in recent years were at a far greater risk of being shot dead by police than their white counterparts – 21 times greater."
You could also report the absolute risk reduction by saying "Those who took the medication were 2.5% less likely to have a heart attack." This gives a different picture of what has happened to the risk. It has decreased by only a small amount, but then again it can't decrease by more than the untreated group's baseline risk, which is 5% in this example.
Typically, a risk ratio is reported as "times as likely," which implies multiplication: we multiply the risk of the untreated group by the relative risk to find the risk of the treated group. Conversely, a risk difference is typically reported as "less likely" or "more likely" because it implies addition: we add the risk difference to the risk of the untreated group to get the risk of the treated group.
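To make the multiplication versus addition distinction concrete, here is a small sketch using the heart attack example (5% untreated risk, risk ratio 0.5, risk difference -2.5%). The variable names are just illustrative.

```python
untreated_risk = 0.05     # baseline risk in the untreated group
relative_risk = 0.5       # "0.5 times as likely"
risk_diff = -0.025        # "2.5% less likely"

# A risk ratio is applied by multiplying the baseline risk.
treated_from_ratio = untreated_risk * relative_risk

# A risk difference is applied by adding it to the baseline risk.
treated_from_difference = untreated_risk + risk_diff

print(treated_from_ratio)       # 0.025
print(treated_from_difference)  # 0.025
```

Both routes land on the same 2.5% treated risk. They are two descriptions of the same change.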
You may be tempted to write "Those who took the medication were 50% less likely to have a heart attack." This has a nice ring to it, and technically that 50% is a number called the relative risk reduction, which is just 1 minus the risk ratio (if the risk ratio were 0.8, the relative risk reduction would be 20%). However, this is confusing because "less likely" is usually used to report absolute risk reduction. Please don't do this.
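A quick sketch of why the two "less likely" numbers are so easy to mix up: for the same heart attack example, the relative risk reduction and the absolute risk reduction differ by a factor of 20. The function names here are hypothetical, chosen only for this illustration.

```python
def relative_risk_reduction(risk_ratio):
    """1 minus the risk ratio."""
    return 1 - risk_ratio

def absolute_risk_reduction(untreated_risk, treated_risk):
    """How much the risk dropped, measured on the probability scale."""
    return untreated_risk - treated_risk

untreated_risk, treated_risk = 0.05, 0.025
ratio = treated_risk / untreated_risk

print(relative_risk_reduction(ratio))                         # 0.5   (the "50%")
print(absolute_risk_reduction(untreated_risk, treated_risk))  # 0.025 (the "2.5%")
print(round(relative_risk_reduction(0.8), 2))                 # 0.2
```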
Both risk ratios and risk differences are ways of summarizing a change in risk. They're very useful for comparing different interventions. However, neither of these numbers alone really tells the whole story.
Careful with causality! It is very tempting to write "Taking the medicine reduced the risk of heart attacks from 5% to 2.5%" which means that the drug caused the reduction. This might be true, if the risk ratio was computed as part of a carefully controlled experiment, as when reporting on a scientific study. But in general, risk ratios are statements of correlation, not causation.