False discovery rate

The false discovery rate (FDR) is a method of conceptualizing the rate of Type I errors in null hypothesis testing when conducting multiple comparisons. FDR-controlling procedures are designed to control the expected proportion of "discoveries" (rejected null hypotheses) that are false (incorrect rejections).^[1] FDR-controlling procedures provide less stringent control of Type I errors than familywise error rate (FWER) controlling procedures (such as the Bonferroni correction), which control the probability of at least one Type I error. Thus, FDR-controlling procedures have greater power, at the cost of increased rates of Type I errors.^[2]

History

Technological motivations

The modern widespread use of the FDR is believed to stem from, and be motivated by, the development of technologies that allowed the collection and analysis of a large number of distinct variables in several individuals (e.g., the expression level of each of 10,000 different genes in 100 different persons).^[3] By the late 1980s and 1990s, the development of "high-throughput" sciences, such as genomics, allowed for rapid data acquisition. This, coupled with the growth in computing power, made it possible to seamlessly perform hundreds or thousands of statistical tests on a given data set. The technology of microarrays was a prototypical example, as it enabled thousands of genes to be tested simultaneously for differential expression between two biological conditions.^[4]

As high-throughput technologies became common, technological and/or financial constraints led researchers to collect datasets with relatively small sample sizes (e.g. few individuals being tested) and large numbers of variables being measured per sample (e.g. thousands of gene expression levels). In these datasets, too few of the measured variables showed statistical significance after classic correction for multiple tests with standard multiple comparison procedures. This created a need within many scientific communities to abandon FWER and unadjusted multiple hypothesis testing for other ways to highlight and rank in publications those variables showing marked effects across individuals or treatments that would otherwise be dismissed as non-significant after standard correction for multiple tests. In response to this, a variety of error rates have been proposed—and become commonly used in publications—that are less conservative than FWER in flagging possibly noteworthy observations.

Literature

The FDR concept was formally described by Yoav Benjamini and Yosef Hochberg in 1995^[1] (BH procedure) as a less conservative and arguably more appropriate approach for identifying the important few from the trivial many effects tested. The FDR has been particularly influential, as it was the first alternative to the FWER to gain broad acceptance in many scientific fields (especially in the life sciences, from genetics to biochemistry, oncology and plant sciences).^[3] In 2005, the Benjamini and Hochberg paper from 1995 was identified as one of the 25 most-cited statistical papers.^[5]

Prior to the 1995 introduction of the FDR concept, various precursor ideas had been considered in the statistics literature. In 1979, Holm proposed the Holm procedure,^[6] a stepwise algorithm for controlling the FWER that is at least as powerful as the well-known Bonferroni adjustment. This stepwise algorithm sorts the p-values and sequentially rejects the hypotheses starting from the smallest p-values.

Benjamini (2010)^[3] said that the false discovery rate, and the paper Benjamini and Hochberg (1995), had their origins in two papers concerned with multiple testing:

The first paper is by Schweder and Spjøtvoll (1982)^[7] who suggested plotting the ranked p-values and assessing the number of true null hypotheses ( $m_{0}$ ) via an eye-fitted line starting from the largest p-values. The p-values that deviate from this straight line then should correspond to the false null hypotheses. This idea was later developed into an algorithm that incorporated the estimation of $m_{0}$ into procedures such as Bonferroni, Holm or Hochberg.^[8] This idea is closely related to the graphical interpretation of the BH procedure.
The second paper is by Branko Soric (1989)^[9] which introduced the terminology of "discovery" in the multiple hypothesis testing context. Soric used the expected number of false discoveries divided by the number of discoveries $\left(E[V]/R\right)$ as a warning that "a large part of statistical discoveries may be wrong". This led Benjamini and Hochberg to the idea that a similar error rate, rather than being merely a warning, can serve as a worthy goal to control.

The BH procedure was proven to control the FDR in 1995 by Benjamini and Hochberg.^[1] In 1986, R. J. Simes offered the same procedure as the "Simes procedure", in order to control the FWER in the weak sense (under the intersection null hypothesis) when the statistics are independent.^[10] In 1988, G. Hommel showed that it does not control the FWER in the strong sense in general.^[11] Based on the Simes procedure, Yosef Hochberg proposed Hochberg's step-up procedure (1988) which does control the FWER in the strong sense under certain assumptions on the dependence of the test statistics.^[12]

Definitions

Based on the definitions below, we can define $Q$ as the proportion of false discoveries among the discoveries:

$Q = V/R = V/(V+S)$

The false discovery rate (FDR) is then simply:^[1]

$\mathrm{FDR} = Q_{e} = \mathrm{E}\!\left[Q\right],$

where $Q$ is defined to be 0 when $R = 0$ . One wants to keep FDR below a threshold q.

Classification of multiple hypothesis tests

The following table defines the possible outcomes when testing multiple null hypotheses. Suppose we have a number $m$ of null hypotheses, denoted by $H_1, H_2, \ldots, H_m$ . Using a statistical test, we reject the null hypothesis if the test is declared significant. We do not reject the null hypothesis if the test is non-significant. Summing each type of outcome over all $H_i$ yields the following random variables:

	Null hypothesis is true (H₀)	Alternative hypothesis is true (H_A)	Total
Test is declared significant	$V$	$S$	$R$
Test is declared non-significant	$U$	$T$	$m - R$
Total	$m_{0}$	$m - m_0$	$m$

$m$ is the total number of hypotheses tested
$m_{0}$ is the number of true null hypotheses, an unknown parameter
$m - m_0$ is the number of true alternative hypotheses
$V$ is the number of false positives (Type I error) (also called "false discoveries")
$S$ is the number of true positives (also called "true discoveries")
$T$ is the number of false negatives (Type II error)
$U$ is the number of true negatives
$R=V+S$ is the number of rejected null hypotheses (also called "discoveries", either true or false)

In $m$ hypothesis tests of which $m_{0}$ are true null hypotheses, $R$ is an observable random variable, and $S$ , $T$ , $U$ , and $V$ are unobservable random variables.
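As an illustration of the table, the following Python sketch tallies $V$ , $S$ , $U$ and $T$ for a toy case in which the truth of each null hypothesis is assumed to be known (in practice it is not, which is why these counts are unobservable). The names `Outcomes` and `tally`, and the six example hypotheses, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Outcomes:
    V: int  # false positives: true nulls that were rejected
    S: int  # true positives: false nulls that were rejected
    U: int  # true negatives: true nulls that were not rejected
    T: int  # false negatives: false nulls that were not rejected

    @property
    def R(self) -> int:
        """Number of rejections (discoveries), R = V + S."""
        return self.V + self.S

    @property
    def Q(self) -> float:
        """Proportion of false discoveries, defined as 0 when R = 0."""
        return self.V / self.R if self.R > 0 else 0.0


def tally(null_is_true: Sequence[bool], rejected: Sequence[bool]) -> Outcomes:
    """Count the four outcome types given the (normally unknown) truth."""
    if len(null_is_true) != len(rejected):
        raise ValueError("both sequences must have one entry per hypothesis")
    V = S = U = T = 0
    for is_null, was_rejected in zip(null_is_true, rejected):
        if was_rejected and is_null:
            V += 1
        elif was_rejected:
            S += 1
        elif is_null:
            U += 1
        else:
            T += 1
    return Outcomes(V, S, U, T)


# Hypothetical example: m = 6 hypotheses, of which m0 = 3 are true nulls.
truth = [True, True, True, False, False, False]
decisions = [True, False, False, True, True, False]
example = tally(truth, decisions)
```

Here one of the three rejections is a false discovery, so `example.Q` is one third. The FDR is the expectation of this quantity over repeated experiments, not its value in any single one.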

Controlling procedures

The setting for many procedures is that we have $m$ null hypotheses $H_1, \ldots, H_m$ with corresponding p-values $P_1, \ldots, P_m$ . We list these p-values in ascending order and denote them by $P_{(1)}, \ldots, P_{(m)}$ . A procedure that goes from a small test statistic (a large p-value) to a large one is called a step-up procedure. Similarly, a "step-down" procedure moves from a large test statistic (a small p-value) to a smaller one.

Benjamini–Hochberg procedure

The Benjamini–Hochberg procedure (BH step-up procedure) controls the FDR at level $\alpha$ .^[1] It works as follows:

For a given $\alpha$ , find the largest $k$ such that $P_{(k)} \leq \frac{k}{m} \alpha.$
Reject the null hypothesis (i.e., declare discoveries) for all $H_{(i)}$ for $i = 1, \ldots, k$ .
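The two steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation; the function name `benjamini_hochberg` and the example p-values are assumptions made for this sketch.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return one rejection flag per p-value, in the original input order."""
    m = len(p_values)
    # Indices that sort the p-values ascending: P_(1) <= ... <= P_(m).
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step 1: find the largest rank k (1-based) with P_(k) <= (k / m) * alpha.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k = rank
    # Step 2: reject the hypotheses with the k smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values (unsorted on purpose).
example_p = [0.5, 0.031, 0.004, 0.032, 0.03]
example_rejected = benjamini_hochberg(example_p, alpha=0.05)
```

In this example the sorted p-values 0.03 and 0.031 exceed their own thresholds (0.02 and 0.03), but they are still rejected because the fourth-smallest p-value, 0.032, falls below its threshold of 0.04. This is the step-up character of the procedure: the largest qualifying rank determines the cutoff for everything below it.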

The BH procedure is valid when the $m$ tests are independent, and also in various scenarios of dependence.^[13] It also satisfies the inequality:

$E(Q) \leq \frac{m_0}{m}\alpha \leq \alpha$

If an estimator of $m_{0}$ is inserted into the BH procedure, it is no longer guaranteed to achieve FDR control at the desired level.^[3] Adjustments may be needed in the estimator and several modifications have been proposed.^[14]^[15]^[16]^[17]

Note that the mean $\alpha$ for these $m$ tests is $\frac{\alpha(m+1)}{2m}$ , the Mean(FDR $\alpha$ ) or MFDR, $\alpha$ adjusted for $m$ independent (or positively correlated, see below) tests. The MFDR calculation shown here is for a single value and is not part of the Benjamini and Hochberg method; see AFDR below.
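The value quoted above follows from averaging the $m$ BH thresholds $\frac{k}{m}\alpha$ for $k = 1, \ldots, m$ . A short derivation, using only the sum of the first $m$ integers, is:

```latex
\frac{1}{m}\sum_{k=1}^{m}\frac{k}{m}\,\alpha
  = \frac{\alpha}{m^{2}}\cdot\frac{m(m+1)}{2}
  = \frac{\alpha(m+1)}{2m}
```

For instance, with $\alpha = 0.05$ and $m = 10$ this gives $0.05 \cdot 11 / 20 = 0.0275$ .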

Benjamini–Hochberg–Yekutieli procedure

The Benjamini–Hochberg–Yekutieli procedure controls the false discovery rate under arbitrary dependence assumptions.^[13] This refinement modifies the threshold and finds the largest $k$ such that:

$P_{(k)} \leq \frac{k}{m \cdot c(m)} \alpha$

If the tests are independent or positively correlated: $c(m)=1$
Under arbitrary dependence: $c(m) = \sum _{i=1} ^m \frac{1}{i}$

Under arbitrary dependence (including the case of negative correlation), $c(m)$ can be approximated by using the Euler–Mascheroni constant $\gamma$ :

$\sum_{i=1}^{m}\frac{1}{i} \approx \ln(m) + \gamma + \frac{1}{2m}.$
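The modified threshold and the approximation of $c(m)$ can be sketched as follows. This is an illustrative Python sketch using only the standard library; the function names, the `arbitrary_dependence` flag and the example p-values are assumptions made here, not part of the published procedure's notation.

```python
import math
from typing import List, Sequence

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant


def harmonic_number(m: int) -> float:
    """Exact c(m) under arbitrary dependence: 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))


def harmonic_approx(m: int) -> float:
    """Approximation ln(m) + gamma + 1/(2m) of the harmonic number."""
    return math.log(m) + EULER_GAMMA + 1.0 / (2 * m)


def benjamini_yekutieli(
    p_values: Sequence[float],
    alpha: float = 0.05,
    arbitrary_dependence: bool = True,
) -> List[bool]:
    """Step-up procedure with threshold k * alpha / (m * c(m))."""
    m = len(p_values)
    # c(m) = 1 recovers the plain BH thresholds.
    c_m = harmonic_number(m) if arbitrary_dependence else 1.0
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / (m * c_m) * alpha:
            k = rank
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values; with m = 5, c(m) is about 2.28.
example_p = [0.5, 0.031, 0.004, 0.032, 0.03]
by_rejected = benjamini_yekutieli(example_p, alpha=0.05)
bh_rejected = benjamini_yekutieli(example_p, alpha=0.05, arbitrary_dependence=False)
```

With these inputs the harmonic correction shrinks every threshold by a factor of about 2.28, so only the smallest p-value is rejected, whereas setting $c(m)=1$ rejects four. The stricter thresholds are the price of a guarantee that does not rely on independence or positive dependence.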

Using MFDR and formulas above, an adjusted MFDR, or AFDR, is the min(mean $\alpha$ ) for $m$ dependent tests $= \frac{\mathrm{MFDR}}{c(m)}$ .

The other way to address dependence is by bootstrapping and rerandomization.^[4]^[18]^[19]

Estimating the FDR

Let $\pi _{0}$ be the proportion of true null hypotheses, and $\pi _{1}=1-\pi _{0}$ be the proportion of true alternative hypotheses.^[20] Then $m\pi _{0}$ times the average p-value of rejected effects divided by the number of rejected effects gives an estimate of the FDR.

Properties

Adaptive and scalable

A multiplicity procedure that controls the FDR criterion is adaptive and scalable. This means that controlling the FDR can be very permissive (if the data justify it) or conservative (acting close to control of the FWER for sparse problems), depending on the number of hypotheses tested and the level of significance.^[3]

The FDR criterion adapts so that the same number of false discoveries (V) will have different implications, depending on the total number of discoveries (R). This contrasts with the familywise error rate criterion. For example, if inspecting 100 hypotheses (say, 100 genetic mutations or SNPs for association with some phenotype in some population):

If we make 4 discoveries (R), having 2 of them be false discoveries (V) is often very costly.
By contrast, if we make 50 discoveries (R), having 2 of them be false discoveries (V) is often not very costly.

The FDR criterion is scalable in that the same proportion of false discoveries out of the total number of discoveries (Q) remains sensible for different numbers of total discoveries (R). For example:

If we make 100 discoveries (R), having 5 of them be false discoveries ( $q=5\%$ ) may not be very costly.
Similarly, if we make 1000 discoveries (R), having 50 of them be false discoveries (as before, $q=5\%$ ) may still not be very costly.

The FDR criterion is also scalable in the sense that, whether one correction is made on the whole set of hypotheses or two corrections are made after splitting the set into two, the discoveries in the combined study are (about) the same as when analyzed separately. For this to hold, the sub-studies should be large, with some discoveries in each of them.

Dependency among the test statistics

Controlling the FDR using the linear step-up BH procedure, at level q, has several properties related to the dependency structure between the test statistics of the $m$ null hypotheses that are being corrected for. If the test statistics are:

Independent:^[13] $\mathrm{FDR} \le \frac{m_0}{m}q$
Independent and continuous:^[1] $\mathrm{FDR} = \frac{m_0}{m}q$
Positive dependent:^[13] $\mathrm{FDR} \le \frac{m_0}{m}q$
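In the general case:^[13] $\mathrm{FDR} \leq \frac{m_{0}}{m} q \left(1+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{m}\right) \approx \frac{m_{0}}{m} q \left(\ln(m)+\gamma+\frac{1}{2m}\right)$ , where $\gamma$ is the Euler–Mascheroni constant. Equivalently, running the BH procedure at level $q/\left(1+\frac{1}{2}+\cdots+\frac{1}{m}\right)$ guarantees $\mathrm{FDR} \le \frac{m_0}{m}q$ under any dependence structure.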

Proportion of true hypotheses

If all of the null hypotheses are true ( $m_0=m$ ), then controlling the FDR at level $q$ guarantees control over the FWER (this is also called "weak control of the FWER"): $\mathrm{FWER}=P\left( V \ge 1 \right) = E\left( \frac{V}{R} \right) = \mathrm{FDR} \le q$ , simply because the event of rejecting at least one true null hypothesis $\{V\geq 1\}$ is exactly the event $\{V/R=1\}$ , and the event $\{V=0\}$ is exactly the event $\{V/R=0\}$ (when $V=R=0$ , $V/R=0$ by definition).^[1] But if there are some true discoveries to be made ( $m_0<m$ ), then $\mathrm{FWER} \geq \mathrm{FDR}$ . In that case there will be room for improving detection power. It also means that any procedure that controls the FWER will also control the FDR.
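The identity between the FWER and the FDR when $m_0=m$ can be checked with a small Monte Carlo sketch. It assumes independent tests with uniformly distributed null p-values and applies the BH step-up rule; the function names, the seed and the choices $m=20$ , $q=0.05$ and 2000 replications are arbitrary values picked for this illustration.

```python
import random
from typing import List, Tuple


def bh_rejection_count(p_values: List[float], q: float) -> int:
    """Number of hypotheses rejected by the BH step-up rule at level q."""
    m = len(p_values)
    k = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank / m * q:
            k = rank
    return k


def simulate_global_null(m: int, q: float, n_rep: int, seed: int = 0) -> Tuple[float, float]:
    """Estimate (FWER, FDR) when every one of the m null hypotheses is true."""
    rng = random.Random(seed)
    runs_with_false_rejection = 0
    fdp_total = 0.0
    for _ in range(n_rep):
        # Under a true null, a continuous p-value is uniform on [0, 1].
        p_values = [rng.random() for _ in range(m)]
        r = bh_rejection_count(p_values, q)
        v = r  # all nulls are true, so every rejection is a false discovery
        if v >= 1:
            runs_with_false_rejection += 1
        fdp_total += v / r if r > 0 else 0.0  # Q is 0 when R = 0
    return runs_with_false_rejection / n_rep, fdp_total / n_rep


fwer_hat, fdr_hat = simulate_global_null(m=20, q=0.05, n_rep=2000)
```

Because $V/R$ is 1 in every replication with at least one rejection and 0 otherwise, the two estimates coincide exactly, and both should be close to $q$ under these assumptions.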

Related concepts

Related error rates

The discovery of the FDR was preceded and followed by many other types of error rates. These include:

$PCER$ (per-comparison error rate) is defined as: $\mathrm{PCER} = E \left[ \frac{V}{m} \right]$ . Testing each hypothesis individually at level $\alpha$ guarantees that $\mathrm{PCER} \le \alpha$ (this is testing without any correction for multiplicity)
$FWER$ (the familywise error rate) is defined as: $\mathrm{FWER} = P(V \ge 1)$ . There are numerous procedures that control the FWER.
$k{\text{-FWER}}$ (the generalized familywise error rate), suggested by Lehmann and Romano, van der Laan et al., is defined as: $k\text{-FWER} = P(V \ge k) \le q$ .
$k{\text{-FDR}}$ (also called the generalized FDR by Sarkar in 2007^[21]^[22]) is defined as: $k{\text{-FDR}}=E\left({\frac {V}{R}}I_{{(V>k)}}\right)\leq q$ .
$Q'$ is the "proportion of false discoveries among the discoveries", suggested by Soric in 1989,^[9] and is defined as: $Q' = \frac{E[V]}{R}$ . This is a mixture of expectations and realizations, and has the problem of control for $m_0=m$ .^[1]
$\mathrm{FDR}_{-1}$ (or Fdr) was used by Benjamini and Hochberg,^[3] and later called "Fdr" by Efron (2008) and earlier.^[23] It is defined as: $\mathrm{FDR}_{-1} = Fdr = \frac{E[V]}{E[R]}$ . This error rate cannot be strictly controlled because it is 1 when $m = m_0$ .
$\mathrm{FDR}_{+1}$ was used by Benjamini and Hochberg,^[3] and later called "pFDR" by Storey (2002).^[20] It is defined as: $\mathrm{FDR}_{+1} = pFDR = E \left[ \left. {\frac{V}{R}} \right| R>0 \right]$ . This error rate cannot be strictly controlled because it is 1 when $m = m_0$ .
False exceedance rate (the tail probability of FDP), defined as:^[24] $\mathrm{P} \left( \frac{V}{R} > q \right)$
$W{\text{-FDR}}$ (Weighted FDR). Associated with each hypothesis i is a weight $w_i \ge 0$ ; the weights capture importance/price. The W-FDR is defined as: $W{\text{-FDR}}=E\left({\frac {\sum w_{i}V_{i}}{\sum w_{i}R_{i}}}\right)$ .
$FDCR$ (False Discovery Cost Rate). Stemming from statistical process control: associated with each hypothesis i is a cost $c_i$ and with the intersection hypothesis $H_{00}$ a cost $c_{0}$ . The motivation is that stopping a production process may incur a fixed cost. It is defined as: $\mathrm{FDCR} = E\left( \frac{c_0 V_0 + \sum c_i V_i }{c_0 R_0 + \sum c_i R_i } \right)$
$PFER$ (per-family error rate) is defined as: $\mathrm{PFER} = E(V)$ .
$FNR$ (False non-discovery rates) by Sarkar; Genovese and Wasserman is defined as: $\mathrm{FNR} = E\left( \frac{T}{m - R} \right) = E\left( \frac{m - m_0 - (R - V)}{m - R} \right)$
$\mathrm{FDR}(z)$ is defined as: $\mathrm{FDR}(z) = \frac{p_0 F_0 (z)}{F(z)}$
$\mathrm{fdr}(z)$ (the local fdr) is defined as: $\mathrm{fdr}(z) = \frac{p_0 f_0 (z)}{f(z)}$
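The last two entries can be illustrated numerically under Efron's two-groups model,^[23] in which $p_0$ is the prior probability that a case is null, $f_0$ and $F_0$ are the null density and distribution function of the z-values, and $f$ and $F$ are those of the mixture. The sketch below assumes a standard normal null and, purely for illustration, a normal alternative centred at $-3$ with $p_0 = 0.9$ ; these numbers and the function names are hypothetical.

```python
import math

P0 = 0.9         # hypothetical prior probability that a case is null
ALT_MEAN = -3.0  # hypothetical mean of the non-null z-values


def normal_pdf(z: float, mean: float = 0.0) -> float:
    """Density of a unit-variance normal distribution."""
    return math.exp(-0.5 * (z - mean) ** 2) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float, mean: float = 0.0) -> float:
    """Distribution function of a unit-variance normal distribution."""
    return 0.5 * (1.0 + math.erf((z - mean) / math.sqrt(2.0)))


def tail_area_fdr(z: float) -> float:
    """p0 * F0(z) / F(z), with F the mixture distribution function."""
    F0 = normal_cdf(z)
    F = P0 * F0 + (1.0 - P0) * normal_cdf(z, ALT_MEAN)
    return P0 * F0 / F


def local_fdr(z: float) -> float:
    """p0 * f0(z) / f(z), with f the mixture density."""
    f0 = normal_pdf(z)
    f = P0 * f0 + (1.0 - P0) * normal_pdf(z, ALT_MEAN)
    return P0 * f0 / f


tail_at_minus_3 = tail_area_fdr(-3.0)
local_at_minus_3 = local_fdr(-3.0)
local_at_zero = local_fdr(0.0)
```

Under these assumptions the local fdr at $z=-3$ is larger than the tail-area value at the same point, because the tail-area quantity averages over all more extreme z-values, where the null is less plausible. Near $z=0$ the local fdr is close to 1, reflecting that such cases are almost certainly null.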

False coverage rate

The false coverage rate (FCR) is, in a sense, the FDR analog to the confidence interval. FCR indicates the average rate of false coverage, namely, not covering the true parameters, among the selected intervals. The FCR gives a simultaneous coverage at a $1-\alpha$ level for all of the parameters considered in the problem. Intervals with simultaneous coverage probability 1−q can control the FCR to be bounded by q. There are many FCR procedures such as: Bonferroni-Selected–Bonferroni-Adjusted, Adjusted BH-Selected CIs (Benjamini and Yekutieli (2005)),^[25] Bayes FCR (Yekutieli (2008)), and other Bayes methods.^[26]

Bayesian approaches

Connections have been made between the FDR and Bayesian approaches (including empirical Bayes methods),^[23]^[27]^[28] thresholding wavelet coefficients and model selection,^[29]^[30]^[31]^[32] and generalizing the confidence interval into the false coverage statement rate (FCR).^[25]

False positive rates in single tests of significance

Colquhoun (2014)^[33] used the term false discovery rate to mean the probability that a "significant" result was a false positive. This was part of an investigation of the question "how should one interpret the P value found in a single unbiased test of significance". In subsequent work,^[34]^[35] Colquhoun called the same thing the false positive rate, rather than the false discovery rate, in order to avoid confusion with the use of the latter term in connection with the problem of multiple comparisons. The methods for dealing with multiple comparisons described above aim to control the Type I error rate. They do nothing to help with the fact that P = 0.05 provides little evidence against the null hypothesis.^[35] The false positive rate is one minus the positive predictive value (PPV), but has the advantage of being more self-explanatory than PPV.

References

1. Benjamini, Yoav; Hochberg, Yosef (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing" (PDF). Journal of the Royal Statistical Society, Series B. 57 (1): 289–300. MR 1325392.
2. Shaffer, J. P. (1995). "Multiple hypothesis testing". Annual Review of Psychology. 46: 561–584.
3. Benjamini, Y. (2010). "Discovering the false discovery rate". Journal of the Royal Statistical Society: Series B (Statistical Methodology). 72 (4): 405–416. doi:10.1111/j.1467-9868.2010.00746.x.
4. Storey, John D.; Tibshirani, Robert (2003). "Statistical significance for genome-wide studies" (PDF). Proceedings of the National Academy of Sciences. 100 (16): 9440–9445. Bibcode:2003PNAS..100.9440S. PMC 170937. PMID 12883005. doi:10.1073/pnas.1530509100.
5. Ryan, T. P.; Woodall, W. H. (2005). "The most-cited statistical papers". Journal of Applied Statistics. 32 (5): 461. doi:10.1080/02664760500079373.
6. Holm, S. (1979). "A simple sequentially rejective multiple test procedure". Scandinavian Journal of Statistics. 6 (2): 65–70. JSTOR 4615733. MR 538597.
7. Schweder, T.; Spjøtvoll, E. (1982). "Plots of P-values to evaluate many tests simultaneously". Biometrika. 69 (3): 493. doi:10.1093/biomet/69.3.493.
8. Hochberg, Y.; Benjamini, Y. (1990). "More powerful procedures for multiple significance testing". Statistics in Medicine. 9 (7): 811–818. PMID 2218183. doi:10.1002/sim.4780090710.
9. Soric, Branko (June 1989). "Statistical "Discoveries" and Effect-Size Estimation". Journal of the American Statistical Association. 84 (406): 608–610. JSTOR 2289950. doi:10.1080/01621459.1989.10478811.
10. Simes, R. J. (1986). "An improved Bonferroni procedure for multiple tests of significance". Biometrika. 73 (3): 751–754. doi:10.1093/biomet/73.3.751.
11. Hommel, G. (1988). "A stagewise rejective multiple test procedure based on a modified Bonferroni test". Biometrika. 75 (2): 383. doi:10.1093/biomet/75.2.383.
12. Hochberg, Yosef (1988). "A Sharper Bonferroni Procedure for Multiple Tests of Significance" (PDF). Biometrika. 75 (4): 800–802. doi:10.1093/biomet/75.4.800.
13. Benjamini, Yoav; Yekutieli, Daniel (2001). "The control of the false discovery rate in multiple testing under dependency" (PDF). Annals of Statistics. 29 (4): 1165–1188. MR 1869245. doi:10.1214/aos/1013699998.
14. Storey, J. D.; Taylor, J. E.; Siegmund, D. (2004). "Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: A unified approach". Journal of the Royal Statistical Society: Series B (Statistical Methodology). 66: 187. doi:10.1111/j.1467-9868.2004.00439.x.
15. Benjamini, Y.; Krieger, A. M.; Yekutieli, D. (2006). "Adaptive linear step-up procedures that control the false discovery rate". Biometrika. 93 (3): 491. doi:10.1093/biomet/93.3.491.
16. Gavrilov, Y.; Benjamini, Y.; Sarkar, S. K. (2009). "An adaptive step-down procedure with proven FDR control under independence". The Annals of Statistics. 37 (2): 619. doi:10.1214/07-AOS586.
17. Blanchard, G.; Roquain, E. (2008). "Two simple sufficient conditions for FDR control". Electronic Journal of Statistics. 2: 963. doi:10.1214/08-EJS180.
18. Yekutieli, D.; Benjamini, Y. (1999). "Resampling based False Discovery Rate controlling procedure for dependent test statistics". J. Statist. Planng Inf. 82: 171–196. doi:10.1016/S0378-3758(99)00041-5.
19. van der Laan, M. J.; Dudoit, S. (2007). Multiple Testing Procedures with Applications to Genomics. New York: Springer.
20. Storey, John D. (2002). "A direct approach to false discovery rates" (PDF). Journal of the Royal Statistical Society, Series B. 64 (3): 479–498. doi:10.1111/1467-9868.00346.
21. Sarkar, Sanat K. (2007). "Stepup procedures controlling generalized FWER and generalized FDR". The Annals of Statistics: 2405–2420.
22. Sarkar, Sanat K.; Guo, Wenge (2009). "On a generalized false discovery rate". The Annals of Statistics: 1545–1565.
23. Efron, B. (2008). "Microarrays, empirical Bayes and the two groups model". Statistical Science. 23: 1–22. doi:10.1214/07-STS236.
24. Benjamini, Y. (2010). "Simultaneous and selective inference: Current successes and future challenges". Biometrical Journal. 52 (6): 708–721. PMID 21154895. doi:10.1002/bimj.200900299.
25. Benjamini, Y.; Yekutieli, D. (2005). "False discovery rate controlling confidence intervals for selected parameters". Journal of the American Statistical Association. 100 (469): 71–80. doi:10.1198/016214504000001907.
26. Zhao, Z.; Gene Hwang, J. T. (2012). "Empirical Bayes false coverage rate controlling confidence intervals". Journal of the Royal Statistical Society: Series B (Statistical Methodology). doi:10.1111/j.1467-9868.2012.01033.x.
27. Storey, John D. (2003). "The positive false discovery rate: A Bayesian interpretation and the q-value" (PDF). Annals of Statistics. 31 (6): 2013–2035. doi:10.1214/aos/1074290335.
28. Efron, Bradley (2010). Large-Scale Inference. Cambridge University Press. ISBN 978-0-521-19249-1.
29. Abramovich, F.; Benjamini, Y.; Donoho, D.; Johnstone, I. M. (2006). "Adapting to unknown sparsity by controlling the false discovery rate". Annals of Statistics. 34 (2): 584–653. Bibcode:2005math......5374A. arXiv:math/0505374. doi:10.1214/009053606000000074.
30. Donoho, D.; Jin, J. (2006). "Asymptotic minimaxity of false discovery rate thresholding for sparse exponential data". Annals of Statistics. 34 (6): 2980–3018. Bibcode:2006math......2311D. arXiv:math/0602311. doi:10.1214/009053606000000920.
31. Benjamini, Y.; Gavrilov, Y. (2009). "A simple forward selection procedure based on false discovery rate control". Annals of Applied Statistics. 3 (1): 179–198. Bibcode:2009arXiv0905.2819B. arXiv:0905.2819. doi:10.1214/08-AOAS194.
32. Donoho, D.; Jin, J. S. (2004). "Higher criticism for detecting sparse heterogeneous mixtures". Annals of Statistics. 32 (3): 962–994. Bibcode:2004math.....10072D. arXiv:math/0410072. doi:10.1214/009053604000000265.
33. Colquhoun, David (2014). "An investigation of the false discovery rate and the misinterpretation of p-values". Royal Society Open Science. 1: 140216. doi:10.1098/rsos.140216.
34. Colquhoun, David. "The problem with p-values". Aeon. Aeon Magazine. Retrieved 11 December 2016.
35. Colquhoun, David. "The Reproducibility Of Research And The Misinterpretation Of P Values". bioRxiv. Retrieved 5 June 2017.

External links

False Discovery Rate Analysis in R – Lists links with popular R packages
Large-scale Simultaneous Inference – Syllabus, notes, and homework from Efron's course at Stanford. Includes PDFs for each chapter of his book.
StatQuest: FDR and the Benjamini-Hochberg Method clearly explained on YouTube

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.