Point-biserial correlation coefficient

From Wikipedia, the free encyclopedia

The point-biserial correlation coefficient (rpb) is a correlation coefficient used when one variable (e.g. Y) is dichotomous. Y can be either 'naturally' dichotomous, like gender, or an artificially dichotomized variable. In most situations it is not advisable to artificially dichotomize a variable, because doing so discards information contained in the original measurements.

The point-biserial correlation is mathematically equivalent to the Pearson (product-moment) correlation: if we have one continuously measured variable X and a dichotomous variable Y, then rXY = rpb. This can be shown by assigning two distinct numerical values to the dichotomous variable. Which two values are chosen does not affect the magnitude of the correlation; only its sign depends on which category receives the larger value.
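As a small illustration of the statement above, the following sketch computes the Pearson correlation for the same dichotomy under two different numeric codings. The data, the labels "a" and "b", and the helper name pearson_r are made up for this example and do not come from any particular source.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: x is continuous, group is a two-category label.
x = [2.1, 3.4, 1.9, 4.8, 5.2, 3.9, 4.4]
group = ["a", "a", "a", "b", "b", "a", "b"]

# Two different numeric codings of the same dichotomy.
y_01 = [1 if g == "b" else 0 for g in group]
y_other = [10.0 if g == "b" else -3.0 for g in group]

r_01 = pearson_r(x, y_01)
r_other = pearson_r(x, y_other)

# Both codings give the same correlation (up to floating-point rounding).
print(r_01, r_other)
```

If the coding were reversed, so that "a" received the larger value, the correlation would keep its magnitude and change sign.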

To calculate rpb, assume that the dichotomous variable Y has the two values 0 and 1. If we divide the data set into two groups, group 1 which received the value "1" on Y and group 0 which received the value "0" on Y, then the point-biserial correlation coefficient is calculated as follows:

r_{pb} = \frac{M_1 - M_0}{s_x} \sqrt{ \frac{n_1 n_0}{n(n-1)}},

where M1 is the mean value on the continuous variable X for all data points in group 1 and M0 is the mean value on the continuous variable X for all data points in group 0. Further, n1 is the number of data points in group 1, n0 is the number of data points in group 0 and n is the total sample size. For the factor n(n − 1) to be correct, sx must be the sample standard deviation of X over all n data points, computed with n − 1 in the denominator. This is a computational formula derived from the formula for rXY in order to reduce the number of steps in a hand calculation. It is of much less importance these days, since computers are almost exclusively used for statistical data analysis.
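The computational formula can be sketched in a few lines of Python and checked against a direct Pearson calculation. This is only an illustration under stated assumptions: the function names and the data are hypothetical, Y is assumed to be coded 0/1, and s_x is taken to be the sample standard deviation with n − 1 in the denominator (statistics.stdev).

```python
from math import sqrt
from statistics import mean, stdev

def point_biserial(x, y):
    """r_pb from the computational formula; y must contain only 0 and 1."""
    n = len(x)
    group1 = [xi for xi, yi in zip(x, y) if yi == 1]
    group0 = [xi for xi, yi in zip(x, y) if yi == 0]
    n1, n0 = len(group1), len(group0)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups must contain at least one data point")
    s_x = stdev(x)  # sample standard deviation of all x (divisor n - 1)
    return (mean(group1) - mean(group0)) / s_x * sqrt(n1 * n0 / (n * (n - 1)))

def pearson_r(x, y):
    """Pearson product-moment correlation, computed from sums of squares."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: x is continuous, y is the 0/1 group indicator.
x = [1.0, 2.0, 3.0, 4.0, 2.5, 3.5]
y = [0, 0, 1, 1, 0, 1]

r_formula = point_biserial(x, y)
r_pearson = pearson_r(x, y)

# The two values agree up to floating-point rounding.
print(r_formula, r_pearson)
```

Replacing n * (n - 1) with n * n while keeping statistics.stdev would make the two results differ; the two only agree again if s_x is switched to statistics.pstdev.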

External links

A formula with n * n instead of n * (n − 1) in the denominator of the square root can be found widely on the internet as well as in the literature, for example in Cohen, J., Cohen, P., West, S.G., & Aiken, L.S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates. That version is incorrect when sx is the sample standard deviation (denominator n − 1); it reproduces rXY only if sx is computed with n in the denominator.

Glass and Hopkins (Statistical Methods in Education and Psychology, 3rd Edition) contains the correct formula.
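The role of n versus n − 1 can be traced with a short derivation from the Pearson formula, assuming Y is coded 0/1. The symbols SS_X (sum of squared deviations of X) and \sigma_x (standard deviation of X computed with denominator n) are notation introduced only for this sketch.

```latex
\bar{Y} = \frac{n_1}{n}, \qquad
\sum_i (X_i - \bar{X})(Y_i - \bar{Y}) = n_1 (M_1 - \bar{X}) = \frac{n_1 n_0}{n}\,(M_1 - M_0), \qquad
\sum_i (Y_i - \bar{Y})^2 = \frac{n_1 n_0}{n}

r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}
              {\sqrt{SS_X \, \sum_i (Y_i - \bar{Y})^2}}
       = \frac{M_1 - M_0}{\sqrt{SS_X}} \sqrt{\frac{n_1 n_0}{n}},
\qquad SS_X = \sum_i (X_i - \bar{X})^2

% Sample standard deviation: s_x^2 = SS_X / (n - 1)
r_{pb} = \frac{M_1 - M_0}{s_x} \sqrt{\frac{n_1 n_0}{n(n-1)}}

% Standard deviation with denominator n: \sigma_x^2 = SS_X / n
r_{pb} = \frac{M_1 - M_0}{\sigma_x} \sqrt{\frac{n_1 n_0}{n^2}}
```

Both final expressions equal rXY; they differ only in which standard deviation of X appears in the denominator.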
