Generators in Python
One of the nice things about Python is that it comes with in-built support for “generators”, functions that can suspend themselves and be resumed in the middle of processing. Here’s a generator that produces the fibonacci series:
A generator is instantiated every time you call the
Once you have a generator object, you can run it until the next time
it suspends itself by passing it to the
The very first time we call
fib, all of the code within it
runs and gets suspended at the
yield, returning the first value.
Every time we call
next on it after that, it resumes execution after
yield, which happens to be in a
while loop, so execution jumps
to the top of the loop.
y are modified and the function
yields again, suspending itself and returning the new value for
This can go on and on until the machine runs out of memory.
Another property of generators is that you can send values into them every time they are resumed. Here’s a generator that computes the sum of two numbers sent from the outside:
When you run this generator, execution first suspends before
assigned and then it suspends again before
y is assigned.
Think of the
send method as a lower level way of resuming a
next, with the added benefit that it allows you to
send values into the generator (whereas
next always sends
Above, I first call
None, which signals to the generator
that it can start executing. Then I send it the value that should be
x, followed by the value for
y. Once it hits the
return statement, the generator raises
StopIteration with the
Generators in Racket
Racket’s standard library also comes with built-in support for generators via the racket/generator module. But what if it didn’t? Could we implement Python-like generators in the language itself? The answer, of course, is “yes.”
Racket comes with an even more general mechanism for capturing and resuming computation, called continuations. Continuations let you capture the execution of a program at a particular point as a function. When applied, that function will cause the program flow to jump back to the point it refers to. Here’s an example:
call/cc captures the current continuation and passes it to another
function – in this case the lambda we created – and, as mentioned
above, when the continuation,
k, is applied the program will jump to
the spot that
call/cc was itself applied at, effectively “return"ing
from it with the value passed into the continuation.
So the above program:
- evaluates the
call/cc, reaching the line where
- jumps back to the point where it started evaluating the
- replaces the entire
call/ccblock with the value passed to
- evaluates the
3, and, finally,
(+ 1 2 3).
(displayln "after k is applied") expression is never evaluated
by this program.
Here’s another example, where we escape an infinite loop by way of a continuation:
- evaluate the
call/ccand capture the current continuation,
- recursively loop until
5, when we apply
- jump back to the point where
call/ccwas called and replace it with the value passed to
k, which is
Here’s the same program with
k renamed to
Does that look familiar?
To implement our generators, we’re going to use a more general type of continuations called “delimited continuations.” These types of continuations allow us to install “prompts”, continuation frames associated with specific prompt tags, and then later on in the execution of the program, abort to the nearest enclosing prompt with a particular tag without needing to have a reference to the captured continuation.
Here’s an example where I install a prompt and then abort to it:
abort-current-continuation is applied, execution jumps to the
abort handler (the third argument to
and then returns from the point where
is applied. So the above program prints
That shared prompt tag is how the abort function knows which continuation prompt it should jump to.
With this in mind, we can define a prompt tag for our continuations:
Followed by the
yield looks up the value of the current yielder and then applies its
arguments to that function. The default value of
raises an error so that
yield will fail if applied outside of a
Next, we implement the
generator function itself:
This function takes another function,
proc, as an argument and
returns a function that will create a generator “instance” (really,
it’s just another function!) when applied.
Let’s break down what happens inside the generator “factory” that is
created. First, the original function is assigned to
an inner function called
next is defined.
next installs a new continuation prompt with the
yield-tag. Inside the extent of the prompt,
next then sets
the current yielder to
yield, which we’ll talk about in a second,
and returns the value of applying
cont to its arguments.
yield function that
next references is defined. When
applied, it captures its own continuation (the continuation of the
place that it itself was applied at), it then updates
cont to point
to that continuation and, finally, it aborts to the nearest prompt
yield-tag (i.e. the place where the
next function was
applied), passing a function that returns
yield’s arguments to the
continuation prompt handler that
installed. Because we didn’t pass a handler function to
call-with-continuation-prompt, the default handler, which applies
whatever argument it’s given, is used.
next function is returned from the “factory.”
And that’s it! That’s pretty much all you need to do to get a working implementation of generators.
Here’s the definition of our
fib generator built using this function:
and here it is in use:
And here’s the
Pretty cool, eh?
Here’s a video of me stepping through the
fib example with a debugger:
P.S. This is actually pretty close to how the generator implementation in Racket itself works. Feel free to check that out, the main differences between the two stem from some additional error handling that the core implementation does.