Paradigm | object-oriented |
---|---|
Appeared in | Development started in 1969 Publicly available in 1980 |
Designed by | Alan Kay, Dan Ingalls, Adele Goldberg |
Developer | Alan Kay, Dan Ingalls, Adele Goldberg, Ted Kaehler, Scott Wallace, and Xerox PARC |
Typing discipline | dynamic |
Major implementations | Squeak, VisualWorks |
Influenced by | Lisp, Simula |
Influenced | Objective-C, Self, Java, Dylan, AppleScript, Lisaac, NewtonScript, Python, Ruby, Scala, Perl 6 |
Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human-computer symbiosis."[1] It was designed and created in part for educational use, more so for constructionist learning, at Xerox PARC by Alan Kay, Dan Ingalls, Adele Goldberg, Ted Kaehler, Scott Wallace, and others during the 1970s, influenced by Lisp, Logo, Sketchpad and Simula.
The language was first generally released as Smalltalk-80 and has been widely used since. Smalltalk-like languages are in continuing active development, and have gathered loyal communities of users around them. ANSI Smalltalk was ratified in 1998 and represents the standard version of Smalltalk.
Contents |
There are a large number of Smalltalk and variants, as there are with many computer languages.[2] The unqualified word Smalltalk is often used to indicate the Smalltalk-80 language, the first version to be made publicly available and created in 1980.
Smalltalk was the product of research by a group of researchers led by Alan Kay at Xerox Palo Alto Research Center (PARC); Alan Kay designed most of the early Smalltalk versions, which Dan Ingalls implemented. The first version, known as Smalltalk-71, was created in a few mornings on a bet that a programming language based on the idea of message passing inspired by Simula could be implemented in "a page of code."[1] A later variant actually used for research work is now known as Smalltalk-72 and influenced the development of the Actor model. Its syntax and execution model were very different from modern Smalltalk variants.
After significant revisions which froze some aspects of execution semantics to gain performance (by adopting a Simula-like class inheritance model of execution), Smalltalk-76 was created. This system had a development environment featuring most of the tools now familiar including a class library code browser/editor. Smalltalk-80 added metaclasses, to help maintain the "everything is an object" (except variables) paradigm by associating properties and behavior with individual classes, and even primitives such as integer and boolean values (for example, to support different ways of creating instances).
Smalltalk-80 was the first language variant made available outside of PARC, first as Smalltalk-80 Version 1, given to a small number of companies (Hewlett-Packard, Apple Computer, Tektronix, and DEC) and universities (UC Berkeley) for "peer review" and implementation on their platforms. Later (in 1983) a general availability implementation, known as Smalltalk-80 Version 2, was released as an image (platform-independent file with object definitions) and a virtual machine specification. ANSI Smalltalk has been the standard language reference since 1998.[3]
Two of the currently popular Smalltalk implementation variants are descendants of those original Smalltalk-80 images. Squeak is an open source implementation derived from Smalltalk-80 Version 1 by way of Apple Smalltalk. VisualWorks is derived from Smalltalk-80 version 2 by way of Smalltalk-80 2.5 and ObjectWorks (both products of ParcPlace Systems, a Xerox PARC spin-off company formed to bring Smalltalk to the market). As an interesting link between generations, in 2002 Vassili Bykov implemented Hobbes, a virtual machine running Smalltalk-80 inside VisualWorks.[4] (Dan Ingalls later ported Hobbes to Squeak).
During the late 1980s to mid-1990s, Smalltalk environments — including support, training and add-ons — were sold by two competing organizations: ParcPlace Systems and Digitalk, both California based. ParcPlace Systems tended to focus on the Unix/Sun Microsystems market, while Digitalk emphasized Intel-based PCs that were running either Microsoft Windows or IBM's OS/2. Both companies, however, struggled to take Smalltalk mainstream due to Smalltalk's substantial memory footprint, limited run-time performance, and initially lack of supported connectivity to SQL-based relational database servers. While the high price point of the ParcPlace Smalltalk limited its market penetration to mid-sized and large commercial organizations, the Digitalk products initially tried to reach a wider audience with a lower price point. IBM having initially supported the Digitalk product entered the market with a Smalltalk product in 1995 called VisualAge/Smalltalk. Easel introduced Enfin at this time on Windows and OS/2. Enfin became much more popular in Europe, as IBM introduced it into IT shops prior to their development of IBM Smalltalk (later VisualAge). Enfin was later acquired by Cincom Systems, and is now sold under the name ObjectStudio, and is part of the Cincom Smalltalk product suite.
In 1995, ParcPlace and Digitalk merged into ParcPlace-Digitalk and then rebranded in 1997 as ObjectShare, located in Irvine, CA. ObjectShare (NASDAQ: OBJS) was traded publicly until 1999, when it was delisted and dissolved. The merged company never managed to find an effective response to Java in terms of market positions and by 1997 its owners were looking to sell the business. In 1999, Seagull Software acquired the Java development lab of ObjectShare (original Smalltalk/V, Visual Smalltalk development team), and still owns VisualSmalltalk, although worldwide distribution rights for the Smalltalk product remained with ObjectShare.[5] VisualWorks was sold to Cincom and is now part of Cincom Smalltalk. Cincom has backed Smalltalk quite strongly, putting out multiple new releases of VisualWorks and ObjectStudio each year since 1999.
Cincom, Gemstone and Object Arts, plus other vendors continue to sell Smalltalk environments, IBM has 'end of life'd VisualAge Smalltalk having in the late 1990s decided to back Java and it is, as of 2006, supported by Instantiations, Inc. which has renamed the product VASmalltalk and released a new version. The open Squeak implementation has an active community of developers, including many of the original Smalltalk community, and has recently been used to provide the Etoys environment on the OLPC project and the virtual worlds environment Croquet Project. GNU Smalltalk is a free software implementation of a derivative of Smalltalk-80 from the GNU project.
A significant development, that has spread across all current Smalltalk environments, is the adoption of the Seaside web framework that simplifies the building of complex web applications.
Smalltalk has influenced the wider world of computer programming in four main areas. It inspired the syntax and semantics of other computer programming languages. Secondly, it was a prototype for a model of computation known as message passing. Thirdly, its WIMP GUI inspired the windowing environments of personal computers in the late twentieth and early twenty-first centuries, so much so that the windows of the first Macintosh desktop look almost identical to the MVC windows of Smalltalk-80. Finally, the integrated development environment was the model for a generation of visual programming tools that look like Smalltalk's code browsers and debuggers.
Smalltalk laid the groundwork and proved all the principles that led to the development and commercial success of Java.
Python and Ruby have reimplemented some Smalltalk ideas with more C/Java-like syntax. The Smalltalk "metamodel" also serves as the inspiration for the object model design for Perl 6.
There is also a modular Smalltalk-like implementation designed for scripting called S# (S-Sharp). S# uses just-in-time compilation technology and supports an extended Smalltalk-like language written by David Simmons of Smallscript Corp.[6][7]
As in other object-oriented languages, the central concept in Smalltalk-80 (but not in Smalltalk-72) is that of an object. An object is always an instance of a class. Classes are "blueprints" that describe the properties and behavior of their instances. For example, a Window class would declare that windows have properties such as the label, the position and whether the window is visible or not. The class would also declare that instances support operations such as opening, closing, moving and hiding. Each particular Window object would have its own values of those properties, and each of them would be able to perform operations defined by its class.
A Smalltalk object can do exactly three things:
The state an object holds is always private to that object. Other objects can query or change that state only by sending requests (messages) to the object to do so. Any message can be sent to any object: when a message is received, the receiver determines whether that message is appropriate. (Alan Kay has commented that despite the attention given to objects, messaging is the most important concept in Smalltalk.)
Smalltalk is a 'pure' OO language, meaning that, unlike Java and C++, there is no difference between values which are objects and values which are primitive types. In Smalltalk, primitive values such as integers, booleans and characters are also objects, in the sense that they are instances of corresponding classes, and operations on them are invoked by sending messages. A programmer can change the classes that implement primitive values, so that new behavior can be defined for their instances--for example, to implement new control structures--or even so that their existing behavior will be changed. This fact is summarised in the commonly heard phrase "In Smalltalk everything is an object" (which would more accurately be expressed as "all values are objects", as variables aren't).
Since all values are objects, classes themselves are also objects. Each class is an instance of the metaclass of that class. Metaclasses in turn are also objects, and are all instances of a class called Metaclass. Code blocks are also objects.
Smalltalk-80 is a totally reflective system, implemented in Smalltalk-80 itself. Smalltalk-80 provides both structural and computational reflection. Smalltalk is a structurally reflective system whose structure is defined by Smalltalk-80 objects. The classes and methods that define the system are themselves objects and fully part of the system that they help define. The Smalltalk compiler compiles textual source code into method objects, typically instances of CompiledMethod
. These get added to classes by storing them in a class's method dictionary. The part of the class hierarchy that defines classes can add new classes to the system. The system is extended by running Smalltalk-80 code that creates or defines classes and methods. In this way a Smalltalk-80 system is a "living" system, carrying around the ability to extend itself at run time.
Since the classes are themselves objects, they can be asked questions such as "what methods do you implement?" or "what fields/slots/instance variables do you define?". So objects can easily be inspected, copied, (de)serialized and so on with generic code that applies to any object in the system.
Smalltalk-80 also provides computational reflection, the ability to observe the computational state of the system. In languages derived from the original Smalltalk-80 the current activation of a method is accessible as an object named via a keyword, thisContext
. By sending messages to thisContext
a method activation can ask questions like "who sent this message to me". These facilities make it possible to implement co-routines or Prolog-like back-tracking without modifying the virtual machine. One of the more interesting uses of this is in the Seaside web framework which relieves the programmer of dealing with the complexity of a Web Browser's back button by storing continuations for each edited page and switching between them as the user navigates a web site. Programming the web server using Seaside can then be done using a more conventional programming style.
When an object is sent a message that it does not implement, the virtual machine sends the object the doesNotUnderstand:
message with a reification of the message as an argument. The message (another object, an instance of Message
) contains the selector of the message and an Array
of its arguments. In an interactive Smalltalk system the default implementation of doesNotUnderstand:
is one that opens an error window reporting the error to the user. Through this and the reflective facilities the user can examine the context in which the error occurred, redefine the offending code, and continue, all within the system, using Smalltalk-80's reflective facilities.
Yet another important use of doesNotUnderstand:
is intercession. One can create a class that does not define any methods other than doesNotUnderstand:
and does not inherit from any other class. The instances of this class effectively understand no messages. So every time a message is sent to these instances they actually get sent doesNotUnderstand:
, hence they intercede in the message sending process. Such objects are called proxies. By implementing doesNotUnderstand:
appropriately, one can create distributed systems where proxies forward messages across a network to other Smalltalk systems (a facility common in systems like CORBA, COM+ and RMI but first pioneered in Smalltalk-80 in the 1980s), and persistent systems where changes in state are written to a database and the like. An example of this latter is Logic Arts' VOSS (Virtual Object Storage System) available for VA Smalltalk under dual open source and commercial licensing.
Smalltalk-80 syntax is rather minimalist, based on only a handful of declarations and reserved words. In fact, only six keywords are reserved in Smalltalk: true, false, nil, self, super and thisContext. (These are not actually keywords, but pseudo-variables. The true, false, and nil pseudo-variables are singleton instances. Smalltalk does not really define any keywords.) The only built-in language constructs are message sends, assignment, method return and literal syntax for some objects. From its origins as a teaching language, standard Smalltalk syntax uses punctuation in a manner more like English than mainstream coding languages. The remainder of the language, including control structures for conditional evaluation and iteration, is implemented on top of the built-in constructs by the standard Smalltalk class library. (For performance reasons implementations may recognize and treat as special some of those messages; however, this is only an optimization, not hardwired into the language syntax).
The following examples illustrate the most common objects which can be written as literal values in Smalltalk-80 methods.
Numbers. The following list illustrates some of the possibilities.
42 -42 123.45 1.2345e2 2r10010010 16rA000
The last two entries are a binary and a hexadecimal number, respectively. The number before the 'r' is the radix or base. The base does not have to be a power of two; for example 36rSMALLTALK is a valid number (for the curious, equal to 80738163270632 decimal).
Characters are written by preceding them with a dollar sign:
$A
Strings are sequences of characters enclosed in single quotes:
'Hello, world!'
To include a quote in a string, escape it using a second quote:
'I said, ''Hello, world!'' to them.'
Double quotes do not need escaping, since single quotes delimit a string:
'I said, "Hello, world!" to them.'
Two equal strings (strings are equal if they contain all the same characters) can be different objects residing in different places in memory. In addition to strings, Smalltalk has a class of character sequence objects called Symbol. Symbols are guaranteed to be unique--there can be no two equal symbols which are different objects. Because of that, symbols are very cheap to compare and are often used for language artifacts such as message selectors (see below).
Symbols are written as # followed by characters. For example:
#foo
Arrays:
#(1 2 3 4)
defines an array of four integers.
And last but not least, blocks (anonymous function literals)
[... Some smalltalk code...]
Blocks are explained in detail further in the text.
Many Smalltalk dialects implement additional syntaxes for other objects, but the ones above are the bread and butter supported by all.
The two kinds of variable commonly used in Smalltalk are instance variables and temporary variables. Other variables and related terminology depend on the particular implementation. For example, VisualWorks has class shared variables and namespace shared variables, while Squeak and many other implementations have class variables, pool variables and global variables.
Temporary variable declarations in Smalltalk are variables declared inside a method (see below). They are declared at the top of the method as names separated by spaces and enclosed by vertical bars. For example:
| index |
declares a temporary variable named index. Multiple variables may be declared within one set of bars:
| index vowels |
declares two variables: index and vowels.
A variable is assigned a value via the ':=' syntax. So:
vowels := 'aeiou'
Assigns the string 'aeiou' to the previously declared vowels variable. The string is an object (a sequence of characters between single quotes is the syntax for literal strings), created by the compiler at compile time.
In the original Parc Place image, the glyph of the underscore character (_) appeared as a left-facing arrow. Smalltalk originally accepted this left-arrow as the only assignment operator. Some modern code still contains what appear to be underscores acting as assignments, harking back to this original usage. Most modern Smalltalk implementations accept either the underscore or the colon-equals syntax.
The message is the most fundamental language construct in Smalltalk. Even control structures are implemented as message sends. Smalltalk adopts by default a synchronous, single dynamic message dispatch strategy (as contrasted to a synchronous, multiple dispatch strategy adopted by some other object-oriented languages).
The following example sends the message 'factorial' to number 42:
42 factorial
In this situation 42 is called the message receiver, while 'factorial' is the message selector. The receiver responds to the message by returning a value (presumably in this case a factorial of 42). Among other things, the result of the message can be assigned to a variable:
aRatherBigNumber := 42 factorial
"factorial" above is what is called a unary message because only one object, the receiver, is involved. Messages can carry additional objects as arguments, as follows:
2 raisedTo: 4
In this expression two objects are involved: 2 as the receiver and 4 as the message argument. The message result, or in Smalltalk parlance, the answer is supposed to be 16. Such messages are called keyword messages. A message can have more arguments, using the following syntax:
'hello world' indexOf: $o startingAt: 6
which answers the index of character 'o' in the receiver string, starting the search from index 6. The selector of this message is "indexOf:startingAt:", consisting of two pieces, or keywords.
Such interleaving of keywords and arguments greatly improves readability of code, since arguments are explained by their preceding keywords. For example, an expression to create a rectangle using a C++ or Java-like syntax might be written as:
new Rectangle(100, 200);
It's unclear which argument is which—is the argument order (width, height) or (height, width)? In Java, you have to look up the API online to find out that the class's argument-order is, in fact, (width, height). By contrast, in Smalltalk, this code would be written unambiguously as:
Rectangle width: 100 height: 200
The receiver in this case is "Rectangle", a class, and the answer will be a new instance of the class with the specified width and height.
Finally, most of the special (non-alphabetic) characters can be used as what are called binary messages. These allow mathematical and logical operators to be written in their traditional form:
3 + 4
which sends the message "+" to the receiver 3 with 4 passed as the argument (the answer of which will be 7). Similarly,
3 > 4
is the message ">" sent to 3 with argument 4 (the answer of which will be false).
Notice, that the Smalltalk-80 language itself does not imply the meaning of those operators. The outcome of the above is only defined by how the receiver of the message (in this case a Number instance) responds to messages "+" and ">".
A side effect of this mechanism is operator overloading. A message ">" can also be understood by other objects, allowing the use of expressions of the form "a > b" to compare them.
An expression can include multiple message sends. In this case expressions are parsed according to a simple order of precedence. Unary messages have the highest precedence, followed by binary messages, followed by keyword messages. For example:
3 factorial + 4 factorial between: 10 and: 100
is evaluated as follows:
The answer of the last message send is the result of the entire expression.
Parentheses can alter the order of evaluation when needed. For example,
(3 factorial + 4) factorial between: 10 and: 100
will change the meaning so that the expression first computes "3 factorial + 4" yielding 10. That 10 then receives the second "factorial" message, yielding 3628800. 3628800 then receives "between:and:", answering false.
Note that because the meaning of binary messages is not hardwired into Smalltalk-80 syntax, all of them are considered to have equal precedence and are evaluated simply from left to right. Because of this, the meaning of Smalltalk expressions using binary messages can be different from their "traditional" interpretation:
3 + 4 * 5
is evaluated as "(3 + 4) * 5", producing 35.
Unary messages can be chained by writing them one after another:
3 factorial factorial log
which sends "factorial" to 3, then "factorial" to the result (6), then "log" to the result (720), producing the result 2.85733.
A series of expressions can be written as in the following (hypothetical) example, each ending with a period. This example first creates a new instance of class Window, stores it in a variable, and then sends two messages to it.
| window | window := Window new. window label: 'Hello'. window open.
If a series of messages are sent to the same receiver as in the example above, they can also be written as a cascade with individual messages separated by semicolons:
(Window new) label: 'Hello'; open
This rewrite of the earlier example as a single expression avoids the need to store the new window in a temporary variable. According to the usual precedence rules, the unary message "new" is sent first, and then "label:" and "open" are sent to the answer of "new".
A block of code (an anonymous function) can be expressed as a literal value (which is an object, since all values are objects.) This is achieved with square brackets:
[ :params | <message-expressions> ]
Where :params is the list of parameters the code can take. This means that the Smalltalk code:
[:x | x + 1]
can be understood as:
f(x) = x + 1
(or, expressed using lambda calculus):
λx.(x+1)
and
[:x | x + 1] value: 3
can be evaluated as
f(3) = 3 + 1
The resulting block object is a closure. It can (at any time) access the variables of its enclosing lexical scopes. Blocks are first class objects. That is, references to blocks can be passed as arguments, returned as values, or stored as a state, just like any other objects. Blocks can be asked to execute their code by sending them a "value"-message (with one argument for each parameter in the block).
The literal representation of blocks was an innovation which allowed certain code to be significantly more readable; it allowed algorithms involving iteration to be coded in a clear and concise way. Code that would typically be written with loops in some languages can be written concisely in Smalltalk using blocks, sometimes in a single line.
positiveAmounts := allAmounts select: [:amt | amt isPositive]
Note that this is related to functional programming, wherein patterns of computation (here selection) are abstracted into higher-order functions. For example, the message select: on a Collection is equivalent to the higher-order function filter on an appropriate functor.
Control structures do not have special syntax in Smalltalk. They are instead implemented as messages sent to objects. For example, conditional execution is implemented by sending the message ifTrue: to a Boolean object, passing as an argument the block of code to be executed if and only if the Boolean receiver is true.
The following code demonstrates this:
result := a > b ifTrue:[ 'greater' ] ifFalse:[ 'less' ]
Blocks are also used to implement user-defined control structures, enumerators, visitors, pluggable behavior and many other patterns. For example:
| aString vowels | aString := 'This is a string'. vowels := aString select: [:aCharacter | aCharacter isVowel].
In the last line, the string is sent the message select: with an argument that is a code block literal. The code block literal will be used as a predicate function that should answer true if and only if an element of the String should be included in the Collection of characters that satisfy the test represented by the code block that is the argument to the "select:" message.
A String object responds to the "select:" message by iterating through its members (by sending itself the message "do:"), evaluating the selection block ("aBlock") once with each character it contains as the argument. When evaluated (by being sent the message "value: each"), the selection block (referenced by the parameter "aBlock", and defined by the block literal "[:aCharacter | aCharacter isVowel]"), answers a boolean, which is then sent "ifTrue:". If the boolean is the object true, the character is added to a string to be returned. Because the "select:" method is defined in the abstract class Collection, it can also be used like this:
| rectangles aPoint| rectangles := OrderedCollection with: (Rectangle left: 0 right: 10 top: 100 bottom: 200) with: (Rectangle left: 10 right: 10 top: 110 bottom: 210). aPoint := Point x: 20 y: 20. collisions := rectangles select: [:aRect | aRect containsPoint: aPoint].
This is a stock class definition:
Object subclass: #MessagePublisher instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: 'Smalltalk Examples'
Often, most of this definition will be filled in by the environment. Notice that this is actually a message to the "Object"-class to create a subclass called "MessagePublisher". In other words: classes are first-class objects in Smalltalk which can receive messages just like any other object and can be created dynamically at execution time.
When an object receives a message, a method matching the message name is invoked. The following code defines a method publish, and so defines what will happen when this object receives the 'publish' message.
publish Transcript show: 'Hello, World!'
The following method demonstrates receiving multiple arguments and returning a value:
quadMultiply: i1 and: i2 "This method multiplies the given numbers by each other and the result by 4." | mul | mul := i1 * i2. ^mul * 4
The method's name is #quadMultiply:and:. The documentation is represented by a string, making it accessible from the program. Return value is specified with the ^ operator.
Note that objects are responsible for determining dynamically at runtime which method to execute in response to a message--while in many languages this may be (sometimes, or even always) determined statically at compile time.
The following code:
MessagePublisher new
creates (and returns) a new instance of the MessagePublisher class. This is typically assigned to a variable:
publisher := MessagePublisher new
However, it is also possible to send a message to a temporary, anonymous object:
MessagePublisher new publish
In the following code, the message "show:" is sent to the object "Transcript" with the String literal 'Hello, world!' as its argument. Invocation of the "show:" method causes the characters of its argument (the String literal 'Hello, world!') to be displayed in the transcript ("terminal") window.
Transcript show: 'Hello, world!'.
Note that a Transcript window would need to be open in order to see the results of this example.
Most popular programming systems separate program code (in the form of class definitions, functions or procedures) from program state (such as objects or other forms of application data). They load the program code when an application is started, and any previous application state has to be recreated explicitly from configuration files or other data sources. Any settings the application programmer doesn't explicitly save, you have to set back up whenever you restart. A traditional application also throws away a lot of useful document information every time you save a file, quit and reload. You lose details such as undo history or cursor position. Image based systems don't force you to scrap all that just because you need to turn off your computer, or update the OS.
Many Smalltalk systems, however, do not differentiate between application data (objects) and code (classes). In fact, classes are objects themselves. Therefore most Smalltalk systems store the entire application state (including both Class and non-Class objects) in an image file. The image can then be loaded by the Smalltalk virtual machine to restore a Smalltalk-like system to a previous state. This was inspired by FLEX, a language created by Alan Kay and described in his M.Sc. thesis.
Other languages that model application code as a form of data, such as Lisp, often use image-based persistence as well.
Smalltalk images are similar to (restartable) core dumps and can provide the same functionality as core dumps, such as delayed or remote debugging with full access to the application state at the time of error.
Everything in Smalltalk-80 is available for modification from within a running program. This means that, for example, the IDE can be changed in a running system without restarting it. In some implementations, the syntax of the language or the garbage collection implementation can also be changed on the fly. Even the statement true become: false
is valid in Smalltalk, although executing it is not recommended.
Smalltalk programs are usually compiled to bytecode, which is then interpreted by a virtual machine or dynamically translated into machine-native code.