Image:Anscombe.svg

From Wikipedia, the free encyclopedia

Anscombe.svg (SVG file, nominally 990 × 677 pixels, file size: 88 KB)

This is a file from the Wikimedia Commons. Its description page there is reproduced below.
Commons is a freely licensed media file repository.

Description

This graphic shows the four datasets defined by Francis Anscombe (Anscombe's quartet), which share several of the usual summary statistics (mean, variance, correlation and regression line) even though the datasets themselves are clearly different.

Property                                       Value
Mean of each x variable                        9.0
Variance of each x variable                    11.0
Mean of each y variable                        7.5
Variance of each y variable                    4.12
Correlation between x and y in each dataset    0.816
Regression line                                y = 3 + 0.5x
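
The shared statistics in the table above can be checked directly in R. The following is a minimal sketch, assuming the built-in anscombe data frame (columns x1 to x4 and y1 to y4); the name stats and the row labels are illustrative and are not part of the original program.

```r
data(anscombe)  # ships with R in the 'datasets' package

# One row per dataset: summary statistics of the pair (x_i, y_i), i = 1..4
stats <- t(sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- coef(lm(y ~ x))          # intercept and slope of the regression line
  c(mean_x = mean(x), var_x = var(x),
    mean_y = mean(y), var_y = var(y),
    cor_xy = cor(x, y),
    intercept = unname(fit[1]), slope = unname(fit[2]))
}))
rownames(stats) <- paste0("dataset", 1:4)

# Rounded for display; the four rows should be nearly identical
print(round(stats, 2))
```

The y statistics and the regression coefficients agree only to about two or three significant figures across the four datasets, which is why the table gives rounded values.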

The graphic was created by User:Schutz for Wikipedia on 13 June 2006 using R, the statistical computing environment. The program that generated it is given below; it is based on the example on the help page of the R dataset anscombe (accessible with the command help(anscombe)) and was slightly modified to improve the result. The graph was exported in PostScript format and converted to SVG with the pstoedit command; the layout was then adjusted slightly in Inkscape before upload.

References:

  • Anscombe, Francis J. (1973) Graphs in statistical analysis. American Statistician, 27, 17–21.
  • R Development Core Team (2006) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.R-project.org


 ## Write the figure to a PostScript file
 postscript("anscombe.ps")
 par(las=1)   # always draw axis labels horizontally

 ##-- some "magic" to do the 4 regressions in a loop:
 ff <- y ~ x
 for(i in 1:4) {
   ## replace both sides of the formula: y ~ x becomes y1 ~ x1, y2 ~ x2, ...
   ff[2:3] <- lapply(paste(c("y","x"), i, sep=""), as.name)
   ## or   ff[[2]] <- as.name(paste("y", i, sep=""))
   ##      ff[[3]] <- as.name(paste("x", i, sep=""))
   ## store each fit under the name lm.1, ..., lm.4
   assign(paste("lm.",i,sep=""), lm(ff, data=anscombe))
 }

 ## Now, do what you should have done in the first place: PLOTS
 op <- par(mfrow=c(2,2), mar=1.5+c(4,4,1,1), oma=c(0,0,0,0),
           lab=c(6,6,7), cex.lab=1.5, cex.axis=1.3, mgp=c(3,1,0))
 for(i in 1:4) {
   ff[2:3] <- lapply(paste(c("y","x"), i, sep=""), as.name)
   plot(ff, data=anscombe, col="red", pch=21, bg="orange", cex=2.5,
        xlim=c(3,19), ylim=c(3,13))
   abline(get(paste("lm.",i,sep="")), col="blue")   # fitted regression line
 }
 par(op)   # restore the previous graphical parameters

 dev.off()
  
This chart was created with R.
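
The formula "magic" used in the program above can be examined on its own. In R a formula such as y ~ x is a call object whose second and third elements are its left-hand and right-hand sides, so assigning to ff[2:3] swaps both variable names at once. The sketch below is a minimal illustration of that step; the vector formulas is an illustrative name that does not appear in the original program.

```r
ff <- y ~ x                      # template formula
formulas <- character(4)         # collects the text form of each formula

for (i in 1:4) {
  # Build the names "y<i>" and "x<i>", convert them to symbols,
  # and put them in place of the left- and right-hand sides of ff
  ff[2:3] <- lapply(paste(c("y", "x"), i, sep = ""), as.name)
  formulas[i] <- deparse(ff)
}

print(formulas)
# [1] "y1 ~ x1" "y2 ~ x2" "y3 ~ x3" "y4 ~ x4"
```

Reusing one formula object in this way lets the same lm() and plot() calls serve all four datasets without writing each formula out by hand.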

Licensing

The R project is licensed under the GPL [1]; since this image is a derivative work, it is also licensed under the GPL.

This work is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or any later version. This work is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See version 2 and version 3 of the GNU General Public License for more details.


File history


           Date/Time                 Dimensions           User      Comment
current    00:07, 15 January 2007    990×677 (88 KB)      Schutz
           00:05, 15 January 2007    990×677 (88 KB)      Schutz
           21:37, 13 June 2006       1,044×750 (94 KB)    Schutz
           18:27, 13 June 2006       1,125×875 (94 KB)    Schutz
The following pages on the English Wikipedia link to this file (pages on other projects are not listed):