Homogeneous Computations: Thoughts on Generator Coroutines#

Hello, everyone, and welcome back to Cameron’s Corner! This week, I have a treat. We received a fantastic question in our Discord Server (which you should join if you haven’t yet) about the use of a generator coroutine in Python. Specifically, the question sought to disambiguate calls to __next__ and .send(None) on a generator instance.

Before I get started, I want to remind you about the seminar coming up tomorrow, September 7th, titled “How Do I Write ‘Constructively’ Correct Code with Metaclasses & Decorators?” Join James Powell as he delves into the powerful concept of leveraging Python’s object construction mechanism to enforce code correctness. Discover how metaclasses, decorators, and other language features can be used to validate and coerce input data, define selective object hierarchies, and implement abstract base classes.

Back to the question at hand. While there are workarounds to this problem, none of them feels optimal, and all involve contorting Python to your will. This led James to craft a thoughtful response: laying out the approach we take with generators, linking it to forms of encoding (in-band vs. out-of-band), and demonstrating how this thinking can guide our application of generator coroutines. This week, I wanted to share some of that Q&A with you all.

The Question:#

I have a specific question about coroutines.

How do I stop someone from doing a coro.send(None)? I want my coro to behave differently during a next(coro) as opposed to a coro.send(something).

For more context, I have a wrapper for a time tracker:

def track2(iterable, *, total_steps=None):
    total = len(iterable) if total_steps is None else total_steps
    timer = Timer(total)
    for item in iterable:
        info = yield item
        timer(info=info)
        if info is not None:
            yield  # ensure the send operation stops here and waits for the actual __next__ call

and, during looping, I want to log something if they send something meaningful, otherwise just go to the next step:

for i in (tracker:=track2([1,2,3,4,5])):
    time.sleep(0.1)
    info = f'My Info: {i}'
    tracker.send(info) # THIS MUST BE OPTIONAL, i.e., if .send is not called go for the next iteration

But, in the above code, if someone sends a None to tracker, that will basically waste the next iteration, and it just fails silently.

Any insight will be appreciated.
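Before we get to the answer, it is worth pinning down the crux of the question. Per the Python data model (and PEP 342), calling next(gen) is specified to be equivalent to gen.send(None), so from inside the generator the two calls are indistinguishable. A minimal sketch:

```python
def g():
    received = yield 'ready'
    yield received

# resuming via .send(None)…
gi = g()
next(gi)              # advance to the first yield
print(gi.send(None))  # prints: None

# …is indistinguishable from resuming via next()
gi = g()
next(gi)
print(next(gi))       # also prints: None
```

This is why the question has no direct answer at the generator protocol level: there is no hook that fires for next but not for .send(None).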

Answer#

When we talk about generators and coroutines, we often describe them as the consequence of adding structure to a computation.

Background#

In-band vs. Out-of-band Encoding#

Consider the challenge of representing (or “encoding”) three entities that we want to automate over: “Alice,” “Bob,” and “Charlie.” We can represent these three entities in a variety of ways.

We could encode these data as a delimiter-separated str:

# encode the entities
entities = 'Alice,Bob,Charlie'

# automate (i.e., iterate) over the entities
for ent in entities.split(','):
    print(f'{ent = }')
ent = 'Alice'
ent = 'Bob'
ent = 'Charlie'

Or, we could encode these data using a list:

# encode the entities
entities = ['Alice', 'Bob', 'Charlie']

# automate (i.e., iterate) over the entities
for ent in entities:
    print(f'{ent = }')
ent = 'Alice'
ent = 'Bob'
ent = 'Charlie'

The above choices differ formally: one uses a str and the other a list. But we are interested in the deeper difference between these two choices: what distinguishes them beyond the superficial choice of Python data type?

Indeed, in the scope of the choices we could make, choosing between str and list to represent this data is quite superficial.
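The meaningful difference is where the structure lives. The str encodes the boundaries between entities in-band: the ',' delimiter sits inside the data, alongside the values. The list encodes them out-of-band: the container itself carries the structure. A small sketch of why that matters:

```python
# In-band: the ',' that separates entities lives inside the data itself,
# so a value that happens to contain a ',' corrupts the structure.
encoded = 'Alice,Bob,Charlie, Jr.'  # intended as three entities
print(encoded.split(','))           # four pieces: the delimiter collided

# Out-of-band: the list carries the structure separately from the values,
# so the values may contain anything.
entities = ['Alice', 'Bob', 'Charlie, Jr.']
print(len(entities))                # three entities, as intended
```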

What does this have to do with generators?#

Well, generators are a way for us to add an out-of-band structuring to a computation.

For example, the following function computes three results…

def f(data):
    x = data + 1
    y = data * 2
    z = data ** 3
    return x, y, z

a, b, c = f(123)

and the following generator computes the same three results, but it allows us to delineate each ‘step’ of the computation…

def g(data):
    yield (x := data + 1)
    yield (y := data * 2)
    yield (z := data ** 3)

gi = g(123)
a = next(gi)
b = next(gi)
c = next(gi)

Note that, in the above, each “part” or “step” of the computation corresponds to one line of the source code. But, if we look at this function with the dis module, we can clearly see an alternate interpretation—each part of the computation corresponds to one Python bytecode.

from dis import dis

def f(data):
    x = data + 1
    y = data * 2
    z = data ** 3
    return x, y, z

dis(f)
  4           0 LOAD_FAST                0 (data)
              2 LOAD_CONST               1 (1)
              4 BINARY_ADD
              6 STORE_FAST               1 (x)

  5           8 LOAD_FAST                0 (data)
             10 LOAD_CONST               2 (2)
             12 BINARY_MULTIPLY
             14 STORE_FAST               2 (y)

  6          16 LOAD_FAST                0 (data)
             18 LOAD_CONST               3 (3)
             20 BINARY_POWER
             22 STORE_FAST               3 (z)

  7          24 LOAD_FAST                1 (x)
             26 LOAD_FAST                2 (y)
             28 LOAD_FAST                3 (z)
             30 BUILD_TUPLE              3
             32 RETURN_VALUE

In fact, for the purposes of the Python interpreter’s Global Interpreter Lock (“GIL”), each (atomic) “step” of this computation is a single bytecode: the GIL is a coarse-grained lock held around the bytecode evaluation loop, so that only one thread can execute Python bytecode at a time.

(Note that one line of Python source code readily requires multiple bytecodes, and, in the Python threading model, threads are preempted at the bytecode level. In other words, in the Python threading model, threads do not guarantee that the execution of a line of Python source code is atomic.)

In practice, executing one line of Python source code may involve the execution of multiple Python bytecodes that are not visible or available for programmers to use. I am not aware of any guarantees that the Python core developers provide to end-users on how Python bytecodes are generated or how they map to source text. As a result, the authors of code transformation or deep metaprogramming frameworks may choose to perform transformations at the abstract syntax tree (“AST”) level (supported by the standard library’s ast module) rather than risk chasing after version-to-version changes that occur at the bytecode level.
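As a small illustration of why the AST is a steadier target than bytecode, the standard library’s ast module exposes a structured view that maps directly onto the source text, independent of how any particular interpreter version compiles it:

```python
import ast

# One line of source text parses to exactly one Assign node,
# no matter how many bytecodes it compiles to.
tree = ast.parse('x = data + 1')
stmt = tree.body[0]
print(type(stmt).__name__)   # prints: Assign
print(ast.dump(stmt.value))  # the `data + 1` expression as a BinOp node
```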

Lazy Out-of-band Computation#


Let’s pick up where we left off…

A generator (or generator coroutine) is how we take a computation and break it down into parts so that we can do something useful with this decomposition.

%%timeit -n 1 -r 1

from time import sleep
from random import Random

def compute(x):
    ''' does something slowly '''
    sleep(.1)
    return x ** 3

def process(dataset):
    rv = []
    for x in dataset:
        rv.append(compute(x))
    return rv

if __name__ == '__main__':
    rnd = Random(0)
    dataset = [rnd.randint(-100, +100) for _ in range(10)]

    # find the first three positive values
    results = []
    for x in process(dataset):
        if x >= 0:
            results.append(x)
        if len(results) == 3:
            break
    print(f'{results = }')
results = [830584, 343, 27000]
1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

And, because the process computation is one indivisible “part,” we are forced to “eagerly” compute values for the entire dataset, even though we only want a small subset of these values. As a result, we waste significant memory and computational time.

However, if we add some structuring to this computation…

%%timeit -n 1 -r 1
from time import sleep
from random import Random

def compute(x):
    ''' does something slowly '''
    sleep(.1)
    return x ** 3

def process(dataset):
    for x in dataset:
        yield compute(x)

if __name__ == '__main__':
    rnd = Random(0)
    dataset = [rnd.randint(-100, +100) for _ in range(10)]

    # find the first three positive values
    results = []
    for x in process(dataset):
        if x >= 0:
            results.append(x)
        if len(results) == 3:
            break
    print(f'{results = }')
results = [830584, 343, 27000]
601 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

then we can use this structuring for a purpose. Instead of “eagerly” computing all values for the entire dataset, we can “lazily” compute only the values we desire. As a result, we use only the exact amount of memory and exact amount of computational time necessary.

In the above example, we identify that each iteration through for x in dataset constitutes one “step” of the computation, and we indicate this with the yield keyword.

We could consider the yield keyword in the body of a Python generator to be the equivalent of the , in a list: it unambiguously delineates where each step of the computation begins and ends.
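The analogy can be made quite literal; a minimal sketch:

```python
# The commas delimit the elements of a list…
xs = [123 + 1, 123 * 2, 123 ** 3]

# …just as the yields delimit the steps of a generator.
def g(data):
    yield data + 1
    yield data * 2
    yield data ** 3

print([*g(123)] == xs)  # prints: True
```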

The advantages of structuring a computation go far beyond reducing resource use, which is why generators and generator coroutines are such a fundamental approach with such extreme power.

If we can accept that a generator is an out-of-band structure for a computation serving some useful purpose for the end-user (i.e., computation = [step, step, step]), then we can switch our attention to describing the structuring provided.

A list represents a linear structuring (with support for nesting). In other words, we can proceed through the components of a list only sequentially, in a forwards or backwards ordering.

xs = ['a', 'b', 'c']

for x in xs:
    print(f'{x = }')

for x in reversed(xs):
    print(f'{x = }')
x = 'a'
x = 'b'
x = 'c'
x = 'c'
x = 'b'
x = 'a'

Note that the use of sorted does not constitute another kind of ordering because sorted returns a new list, over which we iterate sequentially in a forward order. Similarly, the Python list provides us with fast (constant-time) random access, from which we may construct something that appears to be a non-sequential ordering. For example…

from random import Random

def randomed(xs, *, random_state=None):
    rnd = Random() if random_state is None else random_state
    idxs = [*range(len(xs))]
    rnd.shuffle(idxs)
    return [xs[idx] for idx in idxs]

xs = ['a', 'b', 'c']

for x in randomed(xs, random_state=Random(0)):
    print(f'{x = }')
x = 'a'
x = 'c'
x = 'b'

However, it should be clear that there is a level of indirection here, and an alternate implementation of list (e.g., a linked list implementation) would not provide us this capability while providing very similar functionality.
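For illustration, here is a minimal (hypothetical) singly linked list: it provides very similar functionality to list, but reaching index idx requires walking idx links, so indexing is linear-time rather than constant-time:

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Node:
    value: Any
    next: 'Optional[Node]' = None

def nth(head: Node, idx: int) -> Any:
    # reaching index idx means following idx links: O(n), not O(1)
    node = head
    for _ in range(idx):
        node = node.next
    return node.value

xs = Node('a', Node('b', Node('c')))
print(nth(xs, 2))  # prints: c
```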

A numpy.ndarray is a container that provides non-linear orderings.

from numpy.random import default_rng

rng = default_rng(0)

xs = rng.integers(-10, +10, size=(3, 3))

for x in xs: # iterate over rows
    print(f'{x = }')

for x in xs.T: # iterate over columns
    print(f'{x = }')
x = array([7, 2, 0])
x = array([ -5,  -4, -10])
x = array([ -9, -10,  -7])
x = array([ 7, -5, -9])
x = array([  2,  -4, -10])
x = array([  0, -10,  -7])

Note that, since we can specify whether the numpy.ndarray is stored in Fortran-style column-major (‘colexicographical’) order or in C-style row-major (‘lexicographical’) order, the consideration that memory addresses are fundamentally linear is irrelevant. Neither ordering is guaranteed to be more “native,” more efficient, or “closer to the machine.”
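As a quick illustration, numpy lets us traverse the same array in either order, irrespective of how it happens to be laid out in memory:

```python
import numpy as np

xs = np.arange(6).reshape(2, 3)

# row-major ('C'-style) traversal
print(xs.ravel(order='C'))  # [0 1 2 3 4 5]

# column-major ('F'-style) traversal
print(xs.ravel(order='F'))  # [0 3 1 4 2 5]
```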

A generator (or generator coroutine) provides an ordering over the “steps” of a computation, but, unlike list, it allows only for forward iteration. It is not meaningful to iterate in a backward direction over the steps of a computation; you cannot reversed(…) a generator.
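Concretely, a generator supports next but not reversed; a small sketch:

```python
def g():
    yield 1
    yield 2

gi = g()
print(next(gi))   # forward iteration works: prints 1

try:
    reversed(gi)  # generators provide no __reversed__ (nor __len__ plus indexing)
except TypeError:
    print('cannot reverse a generator')
```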

If we consider only linearly ordered structures—and consider only forward iterations of such structures—we can discover another important distinction in our container types.

Wrap Up#

That’s all we have time for this week! I would encourage you all to think deeply about this perspective. It is extremely beneficial when thinking about computation in ANY programming language because most languages have abstractions centered around each of these ideas. Next week, we’ll discuss homogeneous and heterogeneous orderings and how they add yet another piece of the puzzle.