# Homogeneous Computations: Thoughts on Generator Coroutines
Hello, everyone, and welcome back to Cameron’s Corner! This week, I have a treat. We received a fantastic question in our Discord server (which you should join if you haven’t yet) about the usage of a generator coroutine in Python. Specifically, the question sought to disambiguate a call to `__next__` from a call to `.send(None)` on a generator instance.
Before I get started, I want to remind you about the seminar coming up tomorrow, September 7th, titled “How Do I Write ‘Constructively’ Correct Code with Metaclasses & Decorators?” Join James Powell as he delves into the powerful concept of leveraging Python’s object construction mechanism to enforce code correctness. Discover how metaclasses, decorators, and other language features can be used to validate and coerce input data, define selective object hierarchies, and implement abstract base classes.
Back to the question at hand. While there are workarounds to this problem, none of them feels particularly optimal, and all of them involve contorting Python to your will. This led James to create a thoughtful response: laying out the approach we take with generators, linking it to forms of encoding (in-band vs. out-of-band), and demonstrating how we can use this thinking to guide our application of generator coroutines. This week, I wanted to share some of that Q&A with you all.
## The Question
I have a specific doubt about coroutines: how do I stop someone from doing a `coro.send(None)`? I want my coro to behave differently during a `next(coro)` as opposed to a `coro.send(something)`.

For more context, I have a wrapper for a time tracker:
```python
def track2(iterable, *, total_steps=None):
    total = len(iterable) if total_steps is None else total_steps
    timer = Timer(total)
    for item in iterable:
        info = yield item
        timer(info=info)
        if info is not None:
            yield  # ensure the send operation stops and waits for the actual __next__ call
```
and, during looping, I want to log something if they send something meaningful; otherwise, just go to the next step:
```python
for i in (tracker := track2([1, 2, 3, 4, 5])):
    time.sleep(0.1)
    info = f'My Info: {i}'
    tracker.send(info)  # THIS MUST BE OPTIONAL, i.e., if .send is not called, go on to the next iteration
```
But, in the above code, if someone sends `None` to `tracker`, that will basically waste the next iteration, and it fails silently.
Any insight will be appreciated.
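For context on why the two calls are hard to tell apart: once a generator has started, `gen.send(None)` is specified to behave exactly like `next(gen)`, so from inside the generator the two are indistinguishable. A minimal demonstration (my own, not from the question):

```python
def gen():
    received = yield 'first'
    yield received          # echoes whatever arrived at the first yield

g = gen()
next(g)                     # advance to the first yield
print(g.send(None))         # None: received is None, same as a plain next()

g2 = gen()
next(g2)
print(next(g2))             # None: observably identical to send(None)
```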
## Answer
When we talk about generators and coroutines, we often describe them as the consequence of adding structure to a computation.
### Background
#### In-band vs. out-of-band encoding
Consider the challenge of representing (or “encoding”) three entities that we want to automate over: “Alice,” “Bob,” and “Charlie.” We can represent these three entities in a variety of ways.
We could encode these data as a delimiter-separated `str`:
```python
# encode the entities
entities = 'Alice,Bob,Charlie'

# automate (i.e., iterate) over the entities
for ent in entities.split(','):
    print(f'{ent = }')
```

```
ent = 'Alice'
ent = 'Bob'
ent = 'Charlie'
```
Or, we could encode these data using a `list`:
```python
# encode the entities
entities = ['Alice', 'Bob', 'Charlie']

# automate (i.e., iterate) over the entities
for ent in entities:
    print(f'{ent = }')
```

```
ent = 'Alice'
ent = 'Bob'
ent = 'Charlie'
```
The above choices differ formally: one uses `str` and the other `list`. We are interested in understanding the deepest difference between these two choices: what differences are present beyond the superficial choice of Python data type? In the scope of the choices we can make, consider that choosing between `str` and `list` for representing this data is, indeed, quite superficial.
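That superficiality has one practical consequence worth noting: the `str` choice encodes structure in-band (the delimiter shares the channel with the data), while the `list` keeps the structure out-of-band. A small sketch of my own, using a made-up entity name that happens to contain the delimiter:

```python
# in-band: the delimiter lives in the same channel as the data,
# so data that contains the delimiter corrupts the structure
entities = 'Alice,Bob,Charlie, Jr.'          # intended: three entities
print(entities.split(','))                   # four parts: structure is corrupted

# out-of-band: the structure lives in the container, not in the data
entities = ['Alice', 'Bob', 'Charlie, Jr.']
print(len(entities))                         # 3: the embedded comma is inert
```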
### What does this have to do with generators?
Well, generators are a way for us to add an out-of-band structuring to a computation.
For example, the following function computes three results…
```python
def f(data):
    x = data + 1
    y = data * 2
    z = data ** 3
    return x, y, z

a, b, c = f(123)
```
and the following generator computes the same three results, but it allows us to delineate each ‘step’ of the computation…
```python
def g(data):
    yield (x := data + 1)
    yield (y := data * 2)
    yield (z := data ** 3)

gi = g(123)
a = next(gi)
b = next(gi)
c = next(gi)
```
Note that, in the above, each “part” or “step” of the computation corresponds to one line of the source code. But, if we look at this function with the `dis` module, we can clearly see an alternate interpretation: each part of the computation corresponds to one Python bytecode.
```python
from dis import dis

def f(data):
    x = data + 1
    y = data * 2
    z = data ** 3
    return x, y, z

dis(f)
```
```
  4           0 LOAD_FAST                0 (data)
              2 LOAD_CONST               1 (1)
              4 BINARY_ADD
              6 STORE_FAST               1 (x)

  5           8 LOAD_FAST                0 (data)
             10 LOAD_CONST               2 (2)
             12 BINARY_MULTIPLY
             14 STORE_FAST               2 (y)

  6          16 LOAD_FAST                0 (data)
             18 LOAD_CONST               3 (3)
             20 BINARY_POWER
             22 STORE_FAST               3 (z)

  7          24 LOAD_FAST                1 (x)
             26 LOAD_FAST                2 (y)
             28 LOAD_FAST               3 (z)
             30 BUILD_TUPLE              3
             32 RETURN_VALUE
```
In fact, for the purposes of the Python interpreter’s Global Interpreter Lock (“GIL”), each (atomic) “step” of this computation is, indeed, one bytecode: the GIL is a coarse-grained lock that locks the interpreter at the granularity of the bytecode loop so that only one thread can execute at a time.

(Note that one line of Python source code typically requires multiple bytecodes, and, in the Python threading model, threads are preempted at the bytecode level. In other words, the Python threading model does not guarantee that the execution of a line of Python source code is atomic.)
In practice, executing one line of Python source code may involve the execution of multiple Python bytecodes that are not visible or available for programmers to use. I am not aware of any guarantees that the Python core developers provide to end users on how Python bytecodes are generated or how they map to source text. As a result, the authors of code-transformation or deep metaprogramming frameworks may choose to perform transformations at the abstract syntax tree (“AST”) level (supported by the standard library’s `ast` module) rather than risk chasing after version-to-version changes that occur at the bytecode level.
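As a rough illustration of why the AST is the friendlier target (my own sketch, not from the original post): the `ast` module exposes a documented, grammar-level view of the same line of source, with node names that track the language grammar rather than the interpreter’s instruction set.

```python
import ast

# parse a single line of source into its syntax tree
tree = ast.parse('x = data + 1')
node = tree.body[0]

# the node types mirror the grammar (Assign, BinOp, Name, Constant),
# not the interpreter's instruction set
print(type(node).__name__)         # Assign
print(type(node.value).__name__)   # BinOp
```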
## Lazy Out-of-band Computation
Let’s pick up where we left off…
A generator (or generator coroutine) is how we take a computation and break it down into parts so that we can do something useful with this decomposition.
```python
%%timeit -n1 -r 1
from time import sleep
from random import Random

def compute(x):
    ''' does something slowly '''
    sleep(.1)
    return x ** 3

def process(dataset):
    rv = []
    for x in dataset:
        rv.append(compute(x))
    return rv

if __name__ == '__main__':
    rnd = Random(0)
    dataset = [rnd.randint(-100, +100) for _ in range(10)]

    # find the first three positive values
    results = []
    for x in process(dataset):
        if x >= 0:
            results.append(x)
            if len(results) == 3:
                break
    print(f'{results = }')
```

```
results = [830584, 343, 27000]
1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
```
And, because the `process` computation is one indivisible “part,” we are forced to “eagerly” compute values for the entire dataset, even though we only want a small subset of these values. As a result, we waste significant memory and computational time.
However, if we add some structuring to this computation…
```python
%%timeit -n 1 -r 1
from time import sleep
from random import Random

def compute(x):
    ''' does something slowly '''
    sleep(.1)
    return x ** 3

def process(dataset):
    for x in dataset:
        yield compute(x)

if __name__ == '__main__':
    rnd = Random(0)
    dataset = [rnd.randint(-100, +100) for _ in range(10)]

    # find the first three positive values
    results = []
    for x in process(dataset):
        if x >= 0:
            results.append(x)
            if len(results) == 3:
                break
    print(f'{results = }')
```

```
results = [830584, 343, 27000]
601 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
```
then we can use this structuring for a purpose. Instead of “eagerly” computing all values for the entire dataset, we can “lazily” compute only the values we desire. As a result, we use only the exact amount of memory and exact amount of computational time necessary.
In the above example, we identify that each iteration through `for x in dataset` constitutes one “step” of the computation, and we indicate this with the `yield` keyword.

We could consider the `yield` keyword in the body of a Python generator to be the equivalent of the `,` in a `list`. It unambiguously delineates where each computation begins and ends.
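To carry the analogy through (a small check of my own): the three `yield`s in `g` delimit exactly the three values that the commas delimit in `f`’s return statement.

```python
def f(data):
    return data + 1, data * 2, data ** 3   # commas delimit the parts

def g(data):
    yield data + 1                          # each yield delimits a part
    yield data * 2
    yield data ** 3

# same three results, whether delimited by commas or by yields
assert [*f(123)] == [*g(123)] == [124, 246, 1860867]
```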
The advantages of structuring a computation go far beyond reducing resource use, which is why generators and generator coroutines are such a fundamental approach with such extreme power.
If we can accept that a generator is an out-of-band structure for a computation serving some useful purpose for the end user (i.e., computation = [step, step, step]), then we can switch our attention to describing the structuring provided.
A `list` represents a linear structuring with nesting. In other words, we can proceed through the components of a `list` only in a sequential, forwards or backwards ordering.
```python
xs = ['a', 'b', 'c']

for x in xs:
    print(f'{x = }')

for x in reversed(xs):
    print(f'{x = }')
```

```
x = 'a'
x = 'b'
x = 'c'
x = 'c'
x = 'b'
x = 'a'
```
Note that the use of `sorted` does not constitute another kind of ordering, because `sorted` returns a new `list`, over which we iterate sequentially in a forward order. Similarly, the Python `list` provides us with fast (constant-time) random access, from which we may construct something that appears to be a non-sequential ordering. For example…
```python
from random import Random

def randomed(xs, *, random_state=None):
    rnd = Random() if random_state is None else random_state
    idxs = [*range(len(xs))]
    rnd.shuffle(idxs)
    return [xs[idx] for idx in idxs]

xs = ['a', 'b', 'c']
for x in randomed(xs, random_state=Random(0)):
    print(f'{x = }')
```

```
x = 'a'
x = 'c'
x = 'b'
```
However, it should be clear that there is a level of indirection here, and an alternate implementation of `list` (e.g., a linked-list implementation) would not provide us this capability while providing very similar functionality.
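Here is a minimal sketch of that alternate implementation (my own illustration, not a drop-in `list` replacement): a singly linked list iterates forward just fine, but indexing must walk the chain, so the shuffle-by-index trick above loses its constant-time foundation.

```python
class Node:
    def __init__(self, value, next=None):
        self.value, self.next = value, next

def iterate(head):
    # sequential, forward-only traversal: the cheap ordering
    while head is not None:
        yield head.value
        head = head.next

def at(head, idx):
    # "random" access must walk the chain: linear, not constant, time
    for _ in range(idx):
        head = head.next
    return head.value

head = Node('a', Node('b', Node('c')))
print([*iterate(head)])   # ['a', 'b', 'c']
print(at(head, 2))        # 'c'
```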
A `numpy.ndarray` is a container that provides non-linear orderings.
```python
from numpy.random import default_rng

rng = default_rng(0)
xs = rng.integers(-10, +10, size=(3, 3))

for x in xs:    # iterate over rows
    print(f'{x = }')

for x in xs.T:  # iterate over columns
    print(f'{x = }')
```

```
x = array([7, 2, 0])
x = array([ -5,  -4, -10])
x = array([ -9, -10,  -7])
x = array([ 7, -5, -9])
x = array([  2,  -4, -10])
x = array([  0, -10,  -7])
```
Note that, since we can specify whether the `numpy.ndarray` is stored in Fortran-style column-major (“colexicographical”) order or in C-style row-major (“lexicographical”) order, the consideration that memory addresses are fundamentally linear is irrelevant. Neither ordering is guaranteed to be more “native,” more efficient, or “closer to the machine.”
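A pure-Python sketch of the point (my own illustration, using a hypothetical 2×3 grid): both layouts flatten the same grid into a linear buffer; they differ only in the stride arithmetic that maps a (row, column) pair onto a linear address.

```python
rows, cols = 2, 3
grid = [[10, 11, 12],
        [20, 21, 22]]

# C-style, row-major: address = r * cols + c
c_buf = [grid[r][c] for r in range(rows) for c in range(cols)]

# Fortran-style, column-major: address = c * rows + r
f_buf = [grid[r][c] for c in range(cols) for r in range(rows)]

print(c_buf)  # [10, 11, 12, 20, 21, 22]
print(f_buf)  # [10, 20, 11, 21, 12, 22]

# both layouts recover the same element with the appropriate strides
assert c_buf[1 * cols + 2] == f_buf[2 * rows + 1] == grid[1][2]
```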
A generator (or generator coroutine) provides an ordering over the “steps” of a computation, but, unlike a `list`, it allows only forward iteration. It is not meaningful to iterate in a backward direction over the steps of a computation; you cannot `reversed(…)` a generator.
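We can check this directly: `reversed` needs a sequence (or an object implementing `__reversed__`), and a generator supplies neither.

```python
def g(data):
    yield data + 1
    yield data * 2
    yield data ** 3

gi = g(123)
print(next(gi))       # 124: forward iteration works

try:
    reversed(g(123))  # but there is no backward ordering of the steps
except TypeError:
    print('not reversible')
```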
If we consider only linearly ordered structures—and consider only forward iterations of such structures—we can discover another important distinction in our container types.
## Wrap Up
That’s all we have time for this week! I would encourage you all to think deeply about this perspective. It is extremely beneficial when thinking about computation in ANY programming language because most languages have abstractions centered around each of these ideas. Next week, we’ll discuss homogeneous and heterogeneous orderings and how they add yet another piece of the puzzle.