Intentional Programming
From Wikipedia, the free encyclopedia
In computer programming, intentional programming is a collection of concepts which enable software source code to reflect the precise information, called intention, which programmers had in mind when conceiving their work. By closely matching the level of abstraction at which the programmer was thinking, browsing and maintaining computer programs becomes easier.
The concept was introduced by long-time Microsoft employee Charles Simonyi, who led a team in Microsoft Research which developed an Integrated Development Environment called IP that demonstrates these concepts. For reasons that are unclear[1], Microsoft stopped working on intentional programming and ended development of IP in the early 2000s.
Contents |
[edit] Development cycle
As envisioned by Simonyi, developing a new application via the Intentional Programming paradigm proceeds as follows. A programmer first builds a toolbox specific to a given problem domain (such as life insurance). Domain experts, aided by the programmer, then describe the application's intended behavior in a WYSIWYG-like manner. Finally, an automated system uses the program description and the toolbox to generate the final program. Successive changes are done at the WYSIWYG level only, employing a system called the "domain workbench".[2]
[edit] Separating source code storage and presentation
Key to the benefits of IP (and a likely barrier to acceptance from the programming community) is that source code is not stored in text files, but in a proprietary binary file that bears a resemblance to XML. As with XML, there is no need for a specific parser for each piece of code that wishes to operate on the information that forms the program, lowering the barrier to writing analysis or restructuring tools. The video demonstrates how the system handles languages that use a text-based preprocessor by surveying the specific usages of macros in a body of code to invent a more hygienic abstraction.
Tight integration of the editor with the binary format brings some of the nicer features of database normalization to source code. Redundancy is eliminated by giving each definition a unique identity, and storing the name of variables and operators in exactly one place. This means it's also easier to intrinsically distinguish between declarations and references, and the environment shows declarations in boldface type. Whitespace is also not stored as part of the source code, and each programmer working on a project can choose an indentation display of the source that they like. More radical visualizations include showing statement lists as nested boxes, editing conditional expressions as logic gates, or re-rendering names in Chinese.
The project appears to standardize a kind of XML Schema for popular languages like C++ and Java, while letting users of the environment mix and match these with ideas from Eiffel and other languages. Often mentioned in the same context as language-oriented programming via domain-specific languages, and aspect-oriented programming, IP purports to provide some breakthroughs in generative programming. These techniques allow developers to extend the language environment to capture domain-specific constructs without investing in writing a full compiler and editor for any new languages.
[edit] Example
A program that writes out the numbers from 1 to 10, using a curly bracket syntax, might look like this:
for (int i = 1; i <= 10; i++) { System.out.println("the number is " + i); }
The code above contains one of the more common constructs of most computer languages, the bounded loop, in this case represented by the for
construct. The code, when compiled, linked and run, will loop 10 times, incrementing the value of i each time and then printing it out.
But this code does not capture the intentions of the programmer, namely to "print the numbers 1 to 10". In this simple case, a programmer asked to maintain the code could likely figure out what it is intended to do, but this is not always the case. Loops that extend across many lines can become very difficult to understand, notably if the original programmer uses unclear labels. Traditionally the only way to indicate the intention of the code was to add source code comments, but often comments are not added, or are unclear, or drift out of sync with the source code they originally described.
In intentional programming systems the above loop could be represented, at some level, as something as obvious as "print the numbers 1 to 10". The system would then use the intentions to generate source code, likely something very similar to the code above. The key difference is that the intentional programming systems maintain the semantic level, which the source code lacks, and which can dramatically ease readability in larger programs.
Although most languages contain mechanisms for capturing certain kinds of abstraction, IP, like the Lisp family of languages, allows for the addition of entirely new mechanisms. Thus, if a developer started with a language like C, they would be able to extend the language with features such as those in C++ without waiting for the compiler developers to add them. By analogy, many more powerful expression mechanisms could be used by programmers than mere classes and procedures.
[edit] Identity
IP focuses on the concept of identity. Since most programming languages represent the source code as plain text, objects are defined by names, and their uniqueness has to be inferred by the compiler. For example, the same symbolic name may be used to name different variables, procedures, or even types. In code that spans several pages - or, for globally visible names, multiple files - it can become very difficult to tell what symbol refers to what actual object. If a name is changed, the code where it is used must carefully be examined.
By contrast, in an IP system, all definitions not only assign symbolic names, but also unique private identifiers to objects. This means that in the IP development environment, every time you refer to a variable or procedure, it is not just a name - it is a link to the original entity.
The major advantage of this is that if you rename an entity, all of the references to it in your program are changed automatically. This also means that if you use the same name for unique definitions in different namespaces (such as ".to_string()
"), you will not rename the wrong references as is sometimes the case with search/replace in current editors. This feature also makes it easy to have multi-language versions of your program. You can have a set of English names for all your definitions as well as a set of Japanese names which can be swapped in at will.
Having a unique identity for every defined object in the program also makes it easy to perform automated refactoring tasks, as well as simplifying code check-ins in versioning systems. For example, in many current code collaboration systems (i.e. CVS), when two programmers commit changes that conflict (i.e. if one programmer renames a function while another changes one of the lines in that function), the versioning system will think that one programmer created a new function while another modified an old function. In an IP versioning system, it will know that one programmer merely changed a name while another changed the code.
[edit] Levels of detail
IP systems also offer several levels of detail, allowing the programmer to "zoom in" or out. In the example above, the programmer could zoom out to get a level that would say something like:
<<print the numbers 1 to 10>>
Thus IP systems are self-documenting to a large degree, allowing the programmer to keep a good high-level picture of the program as a whole.
[edit] Similar works
There are projects that exploit similar ideas to create code with higher level of abstraction. Among them are:
- Concept programming
- Language-oriented programming
- Semantic-oriented programming
- Literate programming
- MDA
- Software factory
- Metaprogramming
- LISP
[edit] See also
- Programming paradigms
- Code generation
- Object database
- Artefaktur (AAL)
- Abstract syntax tree (AST)
- Language syntax tree (LST)
- Semantic resolution tree (RST)
- Interpretation syntax tree (IST)
- Code generation syntax tree (CST)
[edit] References
- ^ "Simonyi itched to take his idea out of the lab and put it in front of customers, but that was awkward under the circumstances. He explains, "It was impractical, when Microsoft was making tremendous strides with .Net in the near term, to somehow send somebody out from the same organization who says, This is not how you should do things--what if you did things in this other, more disruptive way?""[1] (Quote from "Part II: Anything You Can Do, I Can Do Meta: Space tourist and billionaire programmer Charles Simonyi designed Microsoft Office. Now he wants to reprogram software.", Tuesday, January 09, 2007, Scott Rosenberg, Technology Review.)
- ^ Scott Rosenberg: "Anything You Can Do, I Can Do Meta." Technology Review, 8 January 2007
[edit] External links
- Intentional Software - Charles Simonyi's company
- The Death Of Computer Languages, The Birth of Intentional Programming, a technical report by Charles Simonyi (1995) (FTP links)
- Intentional Programming - Innovation in the Legacy Age, a talk by Charles Simonyi (1996)
- Lutz Roeder, Talks and Papers website
- Citations from CiteSeer
- Edge.org interview with Charles Simonyi (interviewer: John Brockman)
- "Anything You Can Do, I Can Do Meta" Tuesday, January 09, 2007, Scott Rosenberg, Technology Review
- Awaiting the Day When Everyone Writes Software, The New York Times, 28 January 2007
- Is programming a form of encryption?, by Charles Simonyi (2005)
- The information contents of programs, by Charles Simonyi (2005)
- Feature X Considered Harmful, by Charles Simonyi (2005)
- Notations and Programming Languages, by Charles Simonyi (2005)
- Personal Observations from a Developer, by Mark Edel (2005)