Stata

From Wikipedia, the free encyclopedia

Developer: StataCorp
Latest release: 10.0 (summer 2007)
Operating systems: Windows, Mac OS X, Unix, Linux
Genre: statistical analysis
License: proprietary
Website: www.stata.com

Stata is a general-purpose statistical software package first released in 1985 by StataCorp (then known as Computing Resource Center). It is used by many businesses and academic institutions around the world. Most of its users work in research, especially in economics, sociology, political science, and epidemiology.

Stata's capabilities include:

  • Data management
  • Statistical analysis
  • Graphics
  • Simulations
  • Custom programming

The name "Stata" was formed by blending "statistics" and "data"; it is not an acronym.

User interface

Since version 8.0, Stata has included a graphical user interface whose menus and dialog boxes give access to nearly all built-in commands. Each dialog generates the equivalent command, which is always displayed; this eases the transition to the command-line interface and to the more flexible scripting language. The data set can be viewed or edited in spreadsheet format, but the spreadsheet window must be closed before other commands are executed.
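As a rough sketch of how the dialogs and the command line relate, the lines below open a dialog box from the command line and then issue the kind of command such a dialog would submit. The choice of the bundled auto data set and of the variables price and mpg is an assumption made for illustration; db and browse open windows, so they are only meaningful in an interactive session.

```stata
* Load an example data set shipped with Stata (illustrative choice)
sysuse auto, clear

* Open the dialog box for the summarize command
db summarize

* The kind of command a dialog submits; it is echoed in the Results window
summarize price mpg, detail

* View the data in spreadsheet format without allowing edits
browse
```

Copying the echoed commands into a do-file is one common way to turn a point-and-click session into a reproducible script.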

Data structure and storage

Stata can open only one data set at a time, and it holds the entire data set in memory (random-access or virtual), which limits its use with extremely large data sets. Efficient internal storage mitigates this to some extent: there are integer storage types that occupy only one or two bytes rather than four, and floating-point numbers default to single precision (4 bytes) rather than double precision (8 bytes).
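The storage types described above can be inspected and chosen explicitly. The following is a minimal sketch, assuming the bundled auto data set and an unchanged default type setting; the variable names expensive, ratio and ratio_d are hypothetical and introduced only for illustration.

```stata
* Load an example data set shipped with Stata (illustrative choice)
sysuse auto, clear

* Show each variable's storage type (byte, int, long, float, double, str#)
describe

* Request a one-byte integer for a 0/1 indicator
generate byte expensive = price > 10000 if !missing(price)

* With no type given, a new numeric variable is stored as float
generate ratio = price / weight

* Request double precision where the extra accuracy matters
generate double ratio_d = price / weight

* Demote variables to the smallest type that holds their values without loss
compress

describe expensive ratio ratio_d
```

Because compress never changes the stored values, it is a low-risk way to reduce the memory footprint of a data set before saving it.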

The data set is always rectangular: every variable holds the same number of observations (in more mathematical terms, all vectors have the same length), although some entries may be missing values.
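A short sketch of what the rectangular layout means in practice, again assuming the bundled auto data set: every variable spans all observations, and gaps appear as missing values rather than as shorter columns.

```stata
* Load an example data set shipped with Stata (illustrative choice)
sysuse auto, clear

* Total number of observations, shared by every variable
count

* Observations in which rep78 is recorded as missing
count if missing(rep78)

* The rows still exist; only the rep78 entry is empty
list make rep78 if missing(rep78)
```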

Extensibility

Stata is unusual among commercial statistics packages in allowing user-written commands, distributed as so-called ado-files, to be downloaded from the internet with little effort. Once installed, they are indistinguishable to the user from the built-in commands. In this respect, Stata combines the extensibility more often associated with open-source packages with features usually associated with commercial packages, such as software verification, technical support and professional documentation. Some user-written commands have later been adopted by StataCorp and, after appropriate checking, certification and documentation, have become part of a subsequent official release.
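The sketch below illustrates both sides of this mechanism: installing a user-written package and defining a trivial command of one's own. The package estout is used only as an example of something hosted on the SSC archive, and the command name hello is hypothetical. Installing requires internet access.

```stata
* Install a user-written package from the SSC archive (needs internet access)
ssc install estout

* Define a trivial command; saved as hello.ado on the ado-path, it would be
* found automatically in later sessions
capture program drop hello
program define hello
    version 10
    display "Hello from a user-written command"
end

* Call it exactly like a built-in command
hello
```

After installation or definition, such commands are invoked with the same syntax as official ones, which is what makes them hard to tell apart in everyday use.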

User community

Stata has an active email list (Statalist, over 1000 messages per month), to which StataCorp employees regularly contribute. Statalist is maintained by Marcello Pagano of the Harvard School of Public Health, not by StataCorp itself. Articles about the use of Stata and about new user-written commands are published in the quarterly peer-reviewed Stata Journal. User group meetings are held annually in the USA, the UK, Germany and Italy, and less frequently in several other countries.

Example Stata code

To perform logistic regression of y on x:

logistic y x

To display a scatter plot of y against x restricted to values of x below 10:

scatter y x if x < 10
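The two commands above use placeholder variables y and x. A runnable variant, assuming the bundled auto data set and the variables foreign, mpg and weight (choices made here purely for illustration), might look like this:

```stata
* Load an example data set shipped with Stata (illustrative choice)
sysuse auto, clear

* Logistic regression of a 0/1 outcome on two predictors
logistic foreign mpg weight

* Scatter plot restricted to cars weighing less than 4000 pounds
scatter mpg weight if weight < 4000
```

Stata treats numeric missing values as larger than any number, so a condition such as weight < 4000 also excludes observations in which weight is missing.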
