# Statistical testing


*This article is a stub. You can help the Personal Science Wiki by expanding it.*

**Statistical testing**, or **statistics**, helps to assess whether observed differences in data are likely to be the result of chance or natural variation in the data. As in science in general, statistics are useful in personal science for establishing confidence in observed differences.

## Common statistical applications for personal science

### Comparing the data from two groups: Statistical hypothesis testing

A frequent question in personal science is whether an intervention had any measurable impact. In these cases one typically has a dataset that is split into two groups: measurements taken without the intervention and measurements taken with it. A common statistical method for this situation is **statistical hypothesis testing** (also known as **null hypothesis testing**), in which one aims to determine which of two conflicting hypotheses is better supported by the data^{[1]}. Generally, one proposes a *null hypothesis*, so called because it assumes that there is no difference between the groups being compared. The *alternative hypothesis* is that the data in the two groups differ more than one would expect by pure chance.

A variety of statistical tests exist for null hypothesis testing, and which test is appropriate depends on the characteristics of the collected data. Frequent starting points are the t-test, which assumes roughly normally distributed measurements, and the rank-based Mann–Whitney–Wilcoxon test, which does not.
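To make the null-hypothesis logic concrete, here is a minimal sketch using only the Python standard library. It is not the t-test or the Mann–Whitney–Wilcoxon test themselves (those are available in libraries such as SciPy as `scipy.stats.ttest_ind` and `scipy.stats.mannwhitneyu`). Instead it uses a permutation test, a related technique that asks the same question directly: if the group labels were meaningless, how often would shuffling them produce a difference at least as large as the observed one? The function name `permutation_test` and the sleep data are hypothetical and chosen for illustration only.

```python
import random
from statistics import mean


def permutation_test(baseline, intervention, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference (intervention - baseline) and an
    estimated p-value under the null hypothesis of no group difference.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(intervention) - mean(baseline)
    pooled = list(baseline) + list(intervention)
    n_base = len(baseline)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = mean(pooled[n_base:]) - mean(pooled[:n_base])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical example data: hours of sleep per night.
sleep_without = [6.1, 6.4, 5.9, 6.8, 6.2, 6.0, 6.5]
sleep_with = [6.9, 7.2, 6.6, 7.4, 7.0, 6.8, 7.1]

observed_diff, p_value = permutation_test(sleep_without, sleep_with)
print(f"difference in means: {observed_diff:.2f} h, p-value: {p_value:.4f}")
```

A small p-value suggests that the observed difference would be unusual if the intervention had no effect. It does not by itself say how large or how important the difference is.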

### Are two variables interacting with each other: correlations

Another frequent question in personal science is whether two variables have a relationship with each other, i.e. whether they have a **correlation**. A simple example of a question that looks for a correlation is "Does my amount of sleep relate to my amount of physical activity?". As with statistical hypothesis testing, a number of methods exist to calculate whether data are correlated. A typical starting point is the Pearson correlation^{[2]}.
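As an illustration, the Pearson correlation coefficient can be computed by hand with the Python standard library. The sketch below assumes paired measurements, one pair per day. The helper name `pearson_r` and the sleep and step data are hypothetical. The coefficient ranges from -1 to 1 and only captures linear relationships.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length and at least 2 values")
    mean_x, mean_y = mean(x), mean(y)
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (proportional to the variances).
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical example data: one value per day.
sleep_hours = [6.5, 7.0, 5.5, 8.0, 7.5, 6.0, 7.2]
steps = [7200, 8100, 5600, 9800, 9000, 6400, 8300]

r = pearson_r(sleep_hours, steps)
print(f"Pearson r = {r:.2f}")
```

A coefficient close to 0 does not rule out a relationship, as non-linear patterns can be missed, and a coefficient close to 1 or -1 does not show that one variable causes the other.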

### Time series analyses

Another common approach in personal science is to track parameters or metrics over time, explore how they change, and look for trends, sometimes with the aim of forecasting future developments. A simple example is recording body weight and seeing how it changes. Such analyses are typically somewhat harder to perform. More details can be found in an article on finding relations between variables in time series.
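Two simple first steps for a tracked metric are smoothing it with a rolling mean and estimating a linear trend. The sketch below uses only the Python standard library and assumes evenly spaced measurements, here one per day. The function names and the weight data are hypothetical. It is a starting point rather than a full time series analysis: consecutive measurements are usually not independent of each other, which is one reason why significance tests on such data need extra care.

```python
from statistics import mean


def rolling_mean(values, window):
    """Mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def linear_trend(values):
    """Least-squares slope: average change per time step."""
    if len(values) < 2:
        raise ValueError("need at least 2 values to estimate a trend")
    xs = range(len(values))  # time steps 0, 1, 2, ...
    mean_x, mean_y = mean(xs), mean(values)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical example data: daily body weight in kg.
weight_kg = [82.4, 82.1, 82.6, 81.9, 81.7, 82.0, 81.5, 81.3, 81.6, 81.0]

smoothed = rolling_mean(weight_kg, window=3)
slope = linear_trend(weight_kg)
print("3-day rolling mean:", [round(v, 2) for v in smoothed])
print(f"trend: {slope:+.3f} kg per day")
```

The rolling mean reduces day-to-day noise so that a trend is easier to see, while the slope summarizes the overall direction in a single number.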

## Limitations

Many forms of statistical testing only aim to answer whether observed differences, correlations, etc. are *statistically significant*, that is, how unlikely it is that these results are the outcome of chance. They do not say how *significant* the results are in a broader sense, for example in terms of impact or effect size^{[3]}. Extreme examples can be found in genetic testing: individual genetic variants can be highly statistically significant, meaning the association is very unlikely to be the result of chance. At the same time, the observed effect of such a variant frequently increases e.g. the risk for a disease by only a fraction of a percent^{[4]}.
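The gap between statistical significance and effect size can be illustrated with simulated data. The sketch below draws two very large groups whose true means differ by only 0.05 standard deviations, then computes both a p-value and Cohen's d, a common effect size measure. All names and numbers are hypothetical. The p-value uses a normal approximation (a z-test), which is an assumption that is reasonable only for large samples like these.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def sample_variance(values):
    """Unbiased sample variance."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def cohens_d(a, b):
    """Difference in means (b - a) in units of the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)
    ) / (na + nb - 2)
    return (fmean(b) - fmean(a)) / sqrt(pooled_var)


def z_test_p_value(a, b):
    """Two-sided p-value for a difference in means (large-sample z-test)."""
    standard_error = sqrt(
        sample_variance(a) / len(a) + sample_variance(b) / len(b)
    )
    z = (fmean(b) - fmean(a)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Simulated data: a tiny true difference, but a very large sample.
rng = random.Random(42)
group_a = [rng.gauss(0.00, 1.0) for _ in range(50_000)]
group_b = [rng.gauss(0.05, 1.0) for _ in range(50_000)]

p_value = z_test_p_value(group_a, group_b)
d = cohens_d(group_a, group_b)
print(f"p-value: {p_value:.2g}, Cohen's d: {d:.3f}")
```

With this many data points the p-value falls far below common thresholds such as 0.05, while Cohen's d stays well under 0.2, a value often described as a small effect. The result is statistically significant but may be of little practical relevance.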

As personal science is often concerned with observing effects that have a real-world impact, statistical testing and statistical significance alone should be considered only one tool in the personal science toolkit. Visualizing data can be a very powerful first step for exploring whether there are visible differences or correlations in one's data. If those visualizations seem to show a large effect, statistical testing can be used to evaluate how 'trustworthy' those effects are.