Euphoria (programming language)

From Wikipedia, the free encyclopedia

Euphoria
Developer: Robert Craig
Latest release: 3.0.2 / February 9, 2007
OS: Cross-platform DOS32, WIN32, Linux, FreeBSD
Use: Interpreted language
License: License for version 3.0.2
Website: www.rapideuphoria.com

Euphoria is an interpreted programming language conceived and created by Robert Craig of Rapid Deployment Software.

Introduction

Developed as a personal project to invent a programming language from scratch, Euphoria was first implemented by Robert Craig on an Atari Mega-ST. With the release of version 3.0.0 (October 17, 2006), it became open-source software.

It was developed with the following design goals in mind:

  1. Simplicity - To be easier to learn and use than BASIC, with more consistent high-level constructs. It uses a flat 32-bit memory model to avoid complicated memory management and size/addressing limits.
  2. Power - To provide low-level capabilities needed to access the OS and BIOS for professional development, but be more structured and less terse than a low-level language, making low-level programming less dangerous.
  3. Safety - Extensive debugging support and run-time error-handling; automatic subscript checking, type-checking, and memory handling.
  4. Flexibility - User-defined type support, with variables as loosely or strictly typed as desired. Object-oriented programming can be accomplished by defining objects as types (subsets of the sequence, which is a general-purpose collection).
  5. Ease of Development - Interpreted, with automatic memory management and garbage collection.
  6. Speed - To be fast enough to rival compiled languages for usefulness, despite being an interpreted language.
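
The user-defined type support mentioned under Flexibility can be sketched as follows. This is a minimal illustration, not code from the Euphoria distribution; the type name hour and the variable h are hypothetical.

```euphoria
-- Hypothetical user-defined type: integers in the range 0 to 23.
type hour(integer x)
    return x >= 0 and x <= 23
end type

hour h
h = 12        -- accepted: 12 satisfies the type
-- h = 24     -- would halt the program with a type-check error

-- A type can also be called like a function to test a value.
? hour(12)    -- prints 1 (true)
? hour(24)    -- prints 0 (false)
```

Declaring h as hour rather than integer makes the interpreter check the constraint on every assignment, which is how a variable can be typed as strictly as desired.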

The name "Euphoria" itself is an acronym for "End-User Programming with Hierarchical Objects for Robust Interpreted Applications", although there is some suspicion that this is in fact a backronym.

The first world-visible incarnation of the language was for the 32-bit DOS platform and was released in July of 1993. The original Atari version, to date, has not been released.

Current versions support 32-bit DOS, Windows, Linux, and FreeBSD. A translator converts Euphoria code into C for compilation to native machine code. A separate tool, known as the Binder, binds Euphoria source code to the interpreter to produce an executable without compiling the program to machine code.

With the release of version 2.5 the Euphoria interpreter was split into two sections: the front-end parser and the back-end interpreter. The front-end is now written in Euphoria instead of C and was released as open source. The front-end is also used with the Euphoria-to-C translator and the Binder.

Use

Euphoria has been used primarily by hobbyists for utility and computer game programming, but has proven useful for fairly diverse purposes. Its primary strength seems to be the ease of handling dynamic collections of data of various types, which is most useful in string processing and image processing, tasks that can be quite difficult in many languages. It has been used in artificial intelligence experiments, the study of mathematics, for teaching programming, and to implement fonts involving thousands of characters. Euphoria has also proven useful as a CGI programming language: the File Archive Search, for example, is written in Euphoria.

Euphoria source code can be "bound" to the Euphoria run-time code to produce a stand-alone program for distribution. The code may also be "shrouded" to prevent others from viewing, copying, or changing the source.

You can also use the Euphoria-to-C translator to convert your Euphoria source code into C source code and then compile it into machine language. Using this technique, you can create stand-alone programs as well as Windows DLL files.

Data types

Euphoria has just two basic data types:

atom 
These are numbers, implemented as either a 31-bit integer or a 64-bit IEEE floating-point value, depending on the current value. Euphoria dynamically changes the implementation to the most efficient one for the data item's current value.
sequence 
Vectors which can have zero or more elements; each element is either an atom or a sequence. The number of elements in a sequence is not fixed; the coder can add or remove elements as required during run-time. Euphoria automatically handles the allocation and deallocation of RAM, including garbage collection. Individual elements are referenced using an index value enclosed in square brackets. The first element in a sequence has an index of 1. Elements inside embedded sequences are referenced by additional bracketed index values, thus X[3][2] refers to the second element contained in the sequence that is the third element of X.

Additionally, Euphoria has two specialized data types:

integer 
A special form of atom, restricted to 31-bit integer values in the range -1073741824 to 1073741823. Integer data types are more efficient than the atom data types, but cannot contain the same range of values. Characters are stored as integers, e.g. coding the ASCII character 'A' is exactly the same as coding 65.
object 
A generic datatype that can contain any of the above, and can be changed during run-time. This means that if you have an object called X that is assigned the value 3.172, then later on you can assign it the value "ABC". Note that in fact, each element of a sequence is actually an object.
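
The four types described above can be illustrated with a short sketch. All variable names here are hypothetical and chosen only for the example; declarations are kept separate from assignments, as in the other code in this article.

```euphoria
-- Illustrative sketch only; variable names are hypothetical.
atom a
a = 3.172                 -- held as a floating-point value
a = 42                    -- the same atom can later hold an integer value

integer ch
ch = 'A'                  -- a character literal is the integer 65

sequence x
x = {10, 20, {30, 40, 50}}
x = append(x, 60)         -- sequences can grow at run-time

object o
o = 3.172                 -- an object can hold an atom ...
o = "ABC"                 -- ... and later a sequence

? ch                      -- prints 65
? x[3][2]                 -- second element of the third element: prints 40
? length(x)               -- prints 4
```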

There is no character string data type, as these are represented by a sequence of integer values. However, because literal strings are so commonly used in programming, Euphoria interprets double-quote enclosed characters as a sequence of integers. Thus

"ABC"

is seen as if the coder had written:

{'A', 'B', 'C'}

which is the same as:

{65,66,67}
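
The equivalence of the three forms can be checked with the built-in equal() function, as in this small sketch. The final line assumes an ASCII console.

```euphoria
-- Each comparison yields 1 (true): all three forms are the same sequence.
? equal("ABC", {'A', 'B', 'C'})
? equal("ABC", {65, 66, 67})

-- A sequence of character codes can be written out as text.
puts(1, {72, 105, '\n'})   -- writes "Hi" followed by a newline
```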

Hello World

 puts(1,"Hello World!\n")

Examples

Note: Code comments start with a double dash "--" and go through the end of line. There are no multi-line comments.

As brief examples, the following code

global function delete_item( object old, sequence group )
   integer pos
             -- Code begins --
   pos = find( old, group )
   if pos > 0 then
       group = group[1 .. pos-1] & group[pos+1 .. length( group )]
   end if
   return group
end function

looks for an old item in a group of items. If the item is found, the function removes it by concatenating all the elements before it with all the elements after it, and then returns the result. Note that sequence indexes are 1-based: the first element has an index of 1.

Simplicity is apparent in that the code clearly delineates its constructs with words. Instead of braces, semicolons, and question marks, you see phrases like 'if..then', 'end if', and 'end function'.

Flexibility is present; the item 'old' could be strings, numbers, images, or whole collections of data themselves. A different function for each data type isn't needed, nor does the programmer have to check the data types. This function will work with any sequence of data of any type, and requires no external libraries.

global function replace_item( object old, object new, sequence group )
   integer pos
             -- Code begins --
   pos = find( old, group )
   if pos > 0 then
       group[pos] = new
   end if
   return group
end function

Safety is present because no pointers are involved and subscripts are checked automatically. The function therefore cannot access memory out of bounds: it cannot read or write before the beginning or beyond the end of the sequence and so corrupt memory. There is no need to explicitly allocate or deallocate memory, and no chance of a leak.

The line

group = group[1 .. pos-1] & group[pos+1 .. length( group )]

shows some of the sequence handling facilities. A sequence can contain a collection of any types, and this can be sliced (to take a subset of the data in a sequence) and concatenated in expressions, with no need for special functions.

Version 2.5 introduced the '$' symbol, which inside a subscript or slice stands for the length of the sequence being indexed. The line above could therefore be written in 2.5 as follows:

group = group[1 .. pos-1] & group[pos+1 .. $]
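
As a sketch of how this fits together, here is a local copy of delete_item rewritten with '$' (so it assumes version 2.5 or later), applied to a sequence holding values of different types. The variable name mixed and its contents are hypothetical.

```euphoria
-- Local copy of delete_item using the '$' shorthand (requires 2.5 or later).
function delete_item(object old, sequence group)
    integer pos
    pos = find(old, group)
    if pos > 0 then
        group = group[1 .. pos-1] & group[pos+1 .. $]
    end if
    return group
end function

sequence mixed
mixed = {1, "two", 3.5, {4, 5}}

mixed = delete_item("two", mixed)    -- now {1, 3.5, {4, 5}}
mixed = delete_item({4, 5}, mixed)   -- now {1, 3.5}
mixed = delete_item(99, mixed)       -- 99 is not present: unchanged

? length(mixed)                      -- prints 2
```

Removing the last element relies on the slice group[pos+1 .. $] being an empty sequence when pos equals the length, which Euphoria permits.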

Parameter passing

Another feature is that all arguments to routines are always passed by value. There is no pass-by-reference facility. This is implemented efficiently because sequences automatically have copy-on-write semantics. In other words, when you pass a sequence to a routine, initially only a reference to it is passed, but at the point that the routine first modifies a sequence parameter, the sequence is copied and the routine updates the copy rather than the original.
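
The visible effect of pass-by-value can be sketched as follows. The routine and variable names are hypothetical, and the copy-on-write step happens inside the interpreter, so the code shows only its observable result.

```euphoria
-- Hypothetical routine that modifies its sequence parameter.
function set_first(sequence s, object value)
    s[1] = value      -- the parameter is copied here, on first modification
    return s
end function

sequence original, changed
original = {1, 2, 3}
changed = set_first(original, 99)

? original[1]         -- prints 1: the caller's sequence is untouched
? changed[1]          -- prints 99: only the returned copy was modified
```

Because there is no pass-by-reference, a routine that needs to update a caller's sequence returns the new value and the caller assigns it back, as delete_item and replace_item do above.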

Comparisons

External links

Free downloads of Euphoria for the various platforms, packages, Windows IDE, Windows API libraries, a GTK+ wrapper for Linux, graphics libraries (DOS, OpenGL, etc).

Commercial Use of Euphoria