Talk:Generator (computer science)

From Wikipedia, the free encyclopedia

Contents

[edit] Origin of the concept

The article claims that generator were invented in Sather, but they date back significantly further. Jeremy Hylton gives a short historical overview. I could track them down as far as:

I couldn't find enough information about Alphard and IPL-v to know how they compare to later conceptions of generators. --Piet Delport 13:55, 7 November 2005 (UTC)

Changed article to say CLU instead of Sather. CLU iterators are clearly generators. Not sure about Alphard. From Ms Liskov's article it sounds like Alphard didn't have yield. But regardless, the new page is at least marginally more accurate.  ;) Jorend 23:01, 5 January 2006 (UTC)

[edit] Generators in Icon

Icon has had generators since I believe it's initial public implementation. The 1st edition of the book "Icon Programming Language" by Griswold and Griswold, Prentice-Hall 1983, states in the preface.

"... It is worth noting here that a number of other programming languages have constructions called generators. CLU [5] is an example. In most languages, however, generators are confined to particular operations on certain kinds of values. In Icon, generators are completely general and may occur in any computation. Generators in themselves lend conciseness and expressive power. ... CheyenneWills 21:17, 12 December 2005 (UTC)

If i'm not mistaken, Icon was quite influential in terms of popularizing generators. If so, it definitely deserves a good mention in the article. Tangentially: i get the impression that Icon's generator support goes hand in hand with its backtracking support; how much does this influence the "flavor" of Icon's generator use, compared to other languages that don't natively support backtracking? --Piet Delport 13:35, 13 December 2005 (UTC)
I believe that you are correct. As for the influence of the "flavor", I believe that the use of generators in Icon can be broken down into 2 main areas, the first is the pure generation of "things", such as all the elements of a list
every write(!biglist)
The second is within the confines of backtracking where alternatives are required. (here the find function is a generator). In this situation, string pattern matching relies heavily on generators
if find("i",string) = (3 | 10) then ...
One of the other interesting areas is preserving the state of a generator by using Icon's coexpressions. The usual example is generating labels.
label := create "L" || seq()
...
newlabel := @label
nextlabel := @label
...
where newlabel would be assigned L1 and nextlabel would get L2 (assuming the first and second invocations of the coexpression). CheyenneWills 18:53, 13 December 2005 (UTC)

[edit] Other languages implementing generators

(excluding languages which can implement generator functionality (via more general support for continuations or coroutines), but don't really support them as a native idiom)

[edit] Generators versus continuations

I've backed out this recent edit that introduces the wording "[Generators can be thought of as containing] a continuation (a programming language structure which can be thought of as a "frozen stack frame")". It was well-intended (thanks!), but a bit misleading, as generators and continuations are quite different. Tim Peters wrote a most excellent impromptu essay on generators, coroutines and continuations, and how they compare, but i'll attempt to summarize here:

  • While a generator contains a single stack frame, a continuation contains a complete call stack.
  • Invoking a generator resumes it on top of its caller's stack, and suspending/returning from the generator resumes the caller again (working almost exactly like normal function calls, in both cases (which is why generators are often explained as "resumable functions")). Invoking a continuation, on the other hand, is like discarding the old call stack entirely, and replacing it with the one stored in the continuation (with no way to return to the "caller", unless its stack was previously saved as another continuation).

--Piet Delport 16:54, 12 February 2006 (UTC)

[edit] Poor choice of example code

The generator code is recursive!!! In order to use the example to confirm my understanding of how a generator works is correct, I must first correctly understand ... generators! Clearly, that's a problem!

Could someone please provide a non-recursive example of the code, preferablely with a comment that describes what the expected mathematical outcome should be (ie. which mathematical "sum" is being calculated?). In that way, the reader can confirm for themselves that their understanding of the code matches the author's intent; and that their understanding of generators is therefore correct.

Right now, I *think* I understand what the code does, but since it doesn't include comments and I don't have a python compiler in front of me, I can't be sure that I've got it right... —Preceding unsigned comment added by 216.254.142.195 (talkcontribs) 2006-06-29 19:13

I agree, the example is nearly uselessly opaque. I replaced it with an infinite incrementing integer sequence generator (it's not as scary as its name :), which hopefully neatly illustrates generators' laziness and statefulness. What do you think? --Piet Delport 14:13, 30 June 2006 (UTC)
That's much better! Thank you! :-)

Just to provide another example in Icon... (using the same "logic" as the python code)..

procedure countfrom(n)
repeat {
    suspend n
    n +:= 1
}
end

# Example use: printing out the integers from 10 to 20.
# Note that this iteration terminates normally, despite countfrom() being
# written as an infinite loop.

every i :=  countfrom(10) do {
    if i <= 20 then
        write(i)
    else
        break
}

Another popular example is a generator for the Fibonacci sequence

procedure fibseq()
    local i, j, n
    i := 1
    j := 1
    suspend (i |              # Generate the 1st value
             j |              # Generate the 2nd value
             |{               # Repeatedly generate the next values
                n := i + j    # This value is returned
                i := j        # i and j to the "next" value
                j := n                           
               } )
end   

...
every write(|fibseq() \20)         # Will generate the first 20 numbers of the fibonacci sequence

CheyenneWills 05:20, 28 November 2006 (UTC)

[edit] Generator in Ruby

Piet - why do you think that Ruby does not support generators (iterators) directly?

#!/usr/bin/ruby
#
def countdown(n)
  while n>=0
    yield n
    n-=1
  end
end

countdown(10) { |count|
  print "#{count}\n"
  print "Ignition!\n" if count==3
  print "Liftoff!\n"  if count==0
}

IMHO, unless I am required to "include" or "use" or "import" something to work with a functionality, it is supported by the language. I'd vote for including Ruby to the list of languages that support generators aka iterators. If you'd not have removed ruby twice from the language list I would simply insert it, but let's discuss this. --The emm 15:21, 29 August 2006 (UTC)

What's happening in the Ruby example is that you're creating and passing a block into countdown, which calls it for each value produced, and returns once at the end. Translated to Python (with some decorator syntax abuse), this would look more or less like:
from functools import partial

@partial(partial, partial)      # curry countdown
def countdown(n, block):
    while 0 <= n:
        block(n)
        n -= 1

@countdown(10)
def receive(count):
    print count
    if count == 3: print 'Ignition!'
    if count == 0: print 'Liftoff!'
By contrast, generators are called, and return/yield to their caller (which might be somewhere different each time), once for every value they produce. This means they can yield infinite sequences, and be composed together, passed around, and partially consumed, without worry. To try and illustrate:
def count():
    """Generate all the natural numbers."""
    n = 0
    while True:
        yield n
        n += 1

def divisible(seq, n):
    """Yield all items from seq that are divisible by n."""
    for x in seq:
        if x % n == 0:
            yield x

>>> evens = divisible(count(), 2)

>>> from itertools import islice
>>> list(islice(evens, 5))        # pull 5 items from evens
[0, 2, 4, 6, 8]

>>> for n in evens:
...     if n < 20:
...         print n,
...     else:
...         break
... 
10 12 14 16 18
>>> 
--Piet Delport 12:28, 30 August 2006 (UTC)
Iterators, generators and coroutines seem to be related but different. The coroutine article mentions: "Coroutines in which subsequent calls yield additional results are often known as generators." Is that the right way to define what a generator is? The article also has a link to a Japanese page about coroutines in Ruby. --TuukkaH 13:55, 30 August 2006 (UTC)
That's probably not entirely inaccurate, but it should be understood that, generally speaking, coroutines are a superset of generators. (A generator is more-or-less a suspendable/resumable stack frame; a coroutine is more-or-less a suspendable/resumable call stack.) --Piet Delport 19:21, 30 August 2006 (UTC)

[edit] Python primes generator contributed by banned user

It was contributed by User:PWnsivander the Great, who, while a nonobjectionable user, was blocked by User:Rspeer as a reincarnation of (unjustly, in the opinion of most of the board) banned user Mike Church. Shouldn't it, as the contribution of a banned user, be removed? Sentinel89 06:07, 13 March 2007 (UTC)

I would say no. The content should be judged on it's own merits regardless of any activity involving the user who initially contributed it. I've reviewed the code and see nothing wrong with it, and in fact believe it to be an great example for demonstrating Python generators (in terms of ease of understanding, not necessarily efficiency). My only hesitation would be the policy of WP:BAN, in that if these edits were made while the user was banned, then perhaps they could be reverted. But the policy doesn't say such edits have to be reverted, only that they may be. So I believe the content in this case is good enough to keep. Dmeranda 16:56, 16 March 2007 (UTC)

[edit] Fuzzy definition?

Coming from the STL, I find this definition quite fuzzy. Mixing generators and iterators seems strange, as I saw them serve different purposes:

  • generators are functions with no argument and returning an object, and are not referentially transparents
  • iterators are objects whose designate an element within a collection, and whose methods allow to access that element or designate another one, according to the access mode of the collection

Nowhere man 09:53, 24 April 2007 (UTC)

First of all, generators and iterators do have separate articles, specifically because they are not the same thing. However the external behavior of generators does share a lot in common with that of iterators, so it is worth discussing on this article. Note also that the C++/STL concept of an iterator is much richer and more specialized that the general concept of iterator; so the difference between the two will appear much more pronounced if you have a C++ perspective. It may make it easier to realize that a generator is basically an iterator over a collection; it's just that the collection happens to be algorithmically defined rather than having to physically exist in a memory-resident data structure. Also generators in as much as they act like iterators only support the most Allfundamental behavior; referencing the current element and getting to the next one (although the next function is implicit); it by no means acts like many rich C++-style iterators with all their additional behaviors. -- Dmeranda 16:25, 24 April 2007 (UTC)
OK, generators and iterators do have similarities, but the following text has no justification: "Generators are usually invoked inside loops. The first time that a generator invocation is reached in a loop, an iterator object is created that encapsulates the state of the generator routine at its beginning, with arguments bound to the corresponding parameters." Where does that come from? I never saw an iterator created when using a generator, be it in C++ or nay other language I use.
There are a lot of generators that are not meant to be used in loops, like random or prime numbers generators.
Also, generators' purpose is not to control loops. They are not to be mistaken as some sort of functional control structure! Nowhere man 18:50, 24 April 2007 (UTC)
Yes, that quoted sentence is perhaps misleading in that it implies a specific implementation of generators, using iterators, which doesn't necessarily have to be true. Note though as an example, in the Python language, invoking a generator function does in fact create a real iterator object (it has a next() method, and raises a StopIteration error when/if the generator terminates). But it's not fair to say that just because that's how Python does it that it is how all languages must do it. As far as other types of generators, yes there are undoubtedly a lot of them. Most fall into the more casual use of the Engligh word generator, as in something that produces something else. Many of these are listed on the Generator (disambiguation) article. It probably would be worthwhile to put a little reference to these other types someplace. In terms of the specific type of generator discussed in this article (which is iterator-like), use in loops is quite common. Although a prime number generator or random number generator could be implmented as an interator-style generator, they don't have to be as you pointed out. But that's not what this article is about. I guess we just need to tighten up some language. -- Dmeranda 19:43, 24 April 2007 (UTC)
Here's a python example showing how a generator really results is a true iterator (i.e., it obeys the standard Python iterator-protocol):
>>> def count_to_ten(n):    # Define a generator function
...     while n <= 10:
...         yield n
...         n = n + 1
...     return
... 
>>> gen = count_to_ten(7)   # Invoke the generator
>>> gen.next()              # Treat the invocation as an iterator object
7
>>> gen.next()
8
>>> gen.next()
9
>>> gen.next()
10
>>> gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Dmeranda 19:52, 24 April 2007 (UTC)

This article should not, as it is named, be about a specific subtype of generators, namely iterator-like generators. If you want to described them, and there is obviosuly much to say about them, then, fine, go ahead, but in a specific article. The article about generators in CS should about generators in CS. So what about refactoring this article in a general one (that I'm willing to write) and a specific one about iterator-like generators and their variants like list comprehensions? Nowhere man 20:25, 24 April 2007 (UTC)
Although I understand there is confusion that needs rectified, I disagree that this article's topic is too narrow in definition. Generator (computer science) generally agrees with most of the computer science literature in it's use of that term, as an iterator-like function. I've not seen any comp sci literature which defines what a generator (unqualified) is otherwise, except for the apparent convention of using the word in specific instances such as random number generator, code generator, name generator, character generator, etc. that are just a consequence of the common English meaning of the word, rather than a non-English teachnical meaning specific to the comp sci community. Many of those specific uses are listed on Generator (disambiguation) (although the disambig page seems to have been renamed to just Generator now). Would it suffice to just make a more verbose disambiguation statement at the top of the article, maybe something like:
This article is about a type of function in computer science that behaves like an iterator. For other uses of the term in computing in general see Generator.
You say that the article "generally agrees with most of the computer science literature in it's use of that term". Could you give references of actual CS publications going in this direction? The article in itself is in contradiction with this claim, as it takes PRNG as an example, whereas PRNG typically don't behave like iterators, but are simply non referentially transparent functions of no argument.
The first reference cited in the article says:
Generators first appeared in CLU (1975)[1]
Although, the given article, "Iteration Abstraction in Sather", doesn't deal with generators at all. It's very subjects is named iterators, and the term generator only appear a few times, in the last sections were the article makes comparisons with existing approaches of its subject. So long for the agreement of litterature in its use of that term...
Nowhere man 23:31, 24 April 2007 (UTC)
Would that work? We could of course try to invent a broader-scoped definition, but I'm hesitant to do so. -- Dmeranda 21:39, 24 April 2007 (UTC)

[edit] Needs verification

OK, I declared the article as needing verification from actual facts.

  • The CLU paper cited just talks about iterators, and compares them to generators.
  • Some contributor stated that the article "generally agrees with most of the computer science literature in it's use of that term", without being able to show any verifiable references.

I'll wait some time for people to be able to gather and show references to backup the article. If that doesn't happen, I'll edit it to describe the concept of generators in general. Nowhere man 12:54, 29 April 2007 (UTC)

I even found a contradictory source. In "Iterators: Signs of Weakness in Object-Oriented Languages" (1992), H. Baker states the following about iterators as found in CLU and C++:
Iterators are a 1980's version of the 1950's concept of generators. A generator is a subroutine which is called like a function, but returns a "new" value every time it is called (i.e., it is emphatically not a mathematical function). The "random number generator" is prototypic of a generator.
With this new information in hands, I will not wait to modify the article. Nowhere man 17:38, 29 April 2007 (UTC)

OK, I claimed the article was inconsistent with reachable publications, so with not a single attempt to counter that claim in nearly one year, I eventually rewrote the article. Nowhere man (talk) 21:18, 12 March 2008 (UTC)

I'm not sure that the claim was clear enough to provoke a counter.
Regarding Baker's paper, it's entirely consistent with the article. The sentence quoted is a simplified introduction, taken somewhat out of context; two paragraphs later, Baker unambiguously explains the difference between generators and functions/iterator classes:

One is tempted to think that every co-routine generator can be emulated by a function which has access to some persistent local state (e.g., an Algol-60 "own" variable), but this is not the case. The standard counter-example is the "samefringe" problem, which can be solved by two co-routined "fringe" generators, but cannot be solved by a function with only a bounded amount of persistent local state.

The paper as a whole is an argument against iterator classes, and for functional generators. Piet Delport 2008-03-18 06:30
How could I make the claim more clear: a banner, examination of cited papers and quotation with links to another contradictory paper! As for Baker's paper, the quote you give doesn't say that generators have any relationship with loops or control structures, and clearly state that a PRNG is a classical example of generator. So I maintain that the article is plain false. It still cites papers that are contradictory to itself. If you can find a reachable reference that clearly support the old version, there's room to improve it based on those references. Until then, I'll revert to my modification.
I gave plenty of time to counter my claim, so don't revert to the previous state of the article without a strong reference. just saying that noone saw the long-standing claim isn't enough. Nowhere man (talk) 21:24, 25 March 2008 (UTC)
I find the paper clear, unambiguous, and non-contradictory. It meets all the qualifications of a reliable source. If you disagree with the paper, Wikipedia is not the avenue to pursue it: Wikipedia's threshold for inclusion is verifiability, not truth, and not original research.
Regarding your individual points:
  • I don't know what you're trying to imply about the text i quoted; it doesn't directly say anything about generators and their relationship with loops / control structures, but that's the subject of the rest of the paper.
  • Baker's paper does not say that PRNGs are examples of generators, it says that PRNGs are prototypic of generators, which includes the sense of being precursory or inspirational to. (Indeed, PRNGs are generally well-implementable both as "plain" procedures and as generators in the sense of this article.) You could quibble over which sense of "prototypic" was meant in the introductory sentence, taken out of context, but any doubt is rendered moot by the body of the paper, which is quite precise, and never mentions PRNGs.
  • The article is not backed by the Baker citation alone, but by Kiselyov, Omohundro (Sather), Liskov (CLU), Shaw (Alphard) and the rest (Python, C#, JavaScript). There are no grounds for reverting the article without a convincing argument that these citations are all invalid.
If you wish to continue this further, the next dispute resolution step is probably to solicit a third opinion, or informal mediation. Piet Delport 2008-04-01 00:07

[edit] Suggestion for clarifications, definitions, etc.

The Icon programming language, while not the first language to implement generators, was one of the first to fully integrate the concept into the language. Within Icon, generators are not only used within the scope of loops, but are available throughout the language. In addition while in a certain sense one could say that a generator returns a list of values, a better term is that a generator returns a result sequence. Specifically, a generator could return an infinite number of values (e.g. a random number generator).

There was a lot of research and many papers written on generators. In doing a date sorted search of the string "generator" in the ACM portal, the earliest reference I found (outside random number generators, and a couple of references to code generators) was

Mary Shaw, William A. Wulf, Ralph L. London
August 1977             
Communications of the ACM,  Volume 20 Issue 8
Publisher: ACM Press
Full text available:    pdf(1.28 MB)
        
Additional Information: full citation, abstract, references, citings
The Alphard “form” provides the programmer with a great deal of control 
over the implementation of abstract data types. In this paper the 
abstraction techniques are extended from simple data representation and 
function definition to the iteration statement, the most important point 
of interaction between data and the control structure of the language 
itself. A means of specializing Alphard's loops to operate on abstract 
entities without explicit dependence on the representation ...

Keywords: abstraction and representation, abstraction data types, 
assertions, control specialization, correctness, generators, invariants, 
iteration statements, modular decomposition, program specifications, 
programming languages, programming methodology, proofs of correctness, 
types, verification 

Of interest, is that immediately following the above article in the same issue of the Communications of the ACM, was an article on CLU.

The next real reference to generators was the article written by the late Dr Ralph Griswold

Generators in Icon
Full text       pdf formatPdf (1.03 MB)
Source  ACM Transactions on Programming Languages and Systems (TOPLAS) archive
Volume 3 ,  Issue 2  (April 1981) table of contents
Pages: 144 - 161  
Year of Publication: 1981
ISSN:0164-0925
Authors         
Ralph E. Griswold        Department of Computer Science, The University of Arizona, Tucson, AZ
David R. Hanson          Department of Computer Science, The University of Arizona, Tucson, AZ
John T. Korb     Xerox Corporation, 3333 Coyote Hill Road, Palo Alto, CA
Publisher       
ACM Press   New York, NY, USA 

Quoting from the above

5. RELATED WORK
The concept of generators has appeared in various forms in many languages. An
early language that has generators is IPL-V [31], in which a generator is a
subroutine that calls a processing routine to operate on each object in a data
structure. Other languages, such as ALGOL 68 [38] and COBOL, use the term
generator to describe various language features, but this use is unrelated to the
Icon notion.

Languages that support coroutines frequently use the term generator to describe
a particular coroutine usage. SIMULA [1] is the oldest of such languages;
more recent examples include extensions to POP-2 [24] and SL5 [20] and
coroutine additions to PASCAL [27]. There also has been a substantial amount
of work on the "lazy evaluation" approach to the incremental generation of
sequences, usually in LISP or APL frameworks [7, 17, 22].

More recently, ALPHARD [43] and CLU [28] have been developed to support
data structure abstraction. In these languages generators are used to iterate over
the elements of programmer-defined data structures [35]. Included with the
definition of a data abstraction (or "cluster" in CLU) is a procedure for generating
the elements of the abstraction. In CLU, this procedure, called an iterator,
produces its values using the yield statement. When another element is required,
execution of the iterator continues where the yield statement left off. In each of
these languages, generators are only accessible in a specific context: a particular
type of for statement. This is significantly more restricted than in Icon, where
generators may appear anywhere.

Superficially, generators in ALPHARD and CLU appear to correspond to the
every expression in Icon. But while every has the syntactic appearance of a
standard control structure, the nonstandard goal-directed evaluation mechanism
makes its semantics more general than the corresponding ALPHARD and CLU
constructs. It is the restricted applicability and the lack of goal-directed evaluation
that differentiate these constructs from generators in Icon.

ACM Transactions on Programming Languages and Systems, Vol. 3, No. 2, April 1981.
158 R.E. Griswold, D. R. Hanson, and J. T. Korb

CheyenneWills 20:49, 25 June 2007 (UTC)

[edit] C++ generators

I don't think C++ properly supports generators. There is no way to have the compiler save the stack frame; as in the example, the object must remember its own state. Overloading the function call operator makes it look somewhat like a (hypothetical) C++ generator, but a constructor call is still needed. It's close, but this can done, with vary degrees of syntactical success, in most object-oriented languages. —Preceding unsigned comment added by 67.122.120.62 (talk) 10:49, 2 March 2008 (UTC)

Right, the section described an iterator class, not a generator. I removed it. Piet Delport 2008-03-18 06:35