Commutativity in Python: When a == b ≠ b == a

This week, I encountered a fun bug while working on Narwhals. Specifically, for the Modin backend, I was unable to change the type of a Series to a categorical type. What's interesting is that the following conversion should work:

import modin.pandas as mpd
# import modin; modin.__version__ → 0.32.0

s = mpd.Series(['a', 'b', 'c'])
s.astype('category')                                       # Works fine
s.astype(mpd.CategoricalDtype(categories=['a', 'b', 'c'])) # Error!

You would expect this to work in a drop-in replacement for pandas, given that the same code runs fine there:

import pandas as pd
# pd.__version__ → 2.2.3

s = pd.Series(['a', 'b', 'c'])
s.astype('category')                                      # works!
s.astype(pd.CategoricalDtype(categories=['a', 'b', 'c'])) # also works!

0    a
1    b
2    c
dtype: category
Categories (3, object): ['a', 'b', 'c']

So why does the pandas version work while the Modin version does not? The hint is in the title of this post: it has to do with equality checks. Here's a quick sample:

# pd.__version__ → 2.2.3

dtype = pd.CategoricalDtype(categories=['a', 'b', 'c'])
df = (
    pd.DataFrame({
        'col1': [*'abc'],
        'col2': [*'xyz'],
        'col3': [*range(3)],
    })
    .astype({
        'col1': dtype,
        'col2': 'category',
    })
)

df

	col1	col2	col3
0	a	x	0
1	b	y	1
2	c	z	2

df.dtypes == dtype

col1     True
col2    False
col3    False
dtype: bool

dtype == df.dtypes # why is this different than the above?!

False

You can see that something odd is happening with the ==… it's not commutative! Before we go into the specific issue, let's dive into Python's equality == operator and how it works because it goes much deeper than you would think.

In mathematics, we learn that equality is commutative: for any numbers $a$ and $b$, the equation $a = b$ is the same as $b = a$. However, in Python, things aren’t always so straightforward. While many built-in types behave as expected when comparing equality (i.e., a == b is the same as b == a), Python's behavior with user-defined objects can be more complex. This is because Python allows us to define custom equality behavior through the special method __eq__.

The Equality Checks You Know

Take a look at how Python handles equality for simple built-in types.

x, y = 10, 3

print(
    f'{x == y = }',
    f'{y == x = }',
    sep='\n',
)

x == y = False
y == x = False

Here, when we compare the integers x and y using ==, Python correctly evaluates both x == y and y == x as False. This is consistent with the commutative property of equality.

Now, let’s see what happens when we use equality with strings.

x, y = 'abc', 'def'

print(
    f'{x == y = }',
    f'{y == x = }',
    sep='\n',
)

x == y = False
y == x = False

As we expect, x == y and y == x both return False, demonstrating that string comparison behaves commutatively as well.

But what happens when we start using custom objects and define our own equality behavior? Let's look at a class that implements __eq__ to control how equality is determined.

from dataclasses import dataclass

@dataclass
class T:
    value: int

    def __eq__(self, other):
        return self.value == other.value

print(
    f'{T(10) == T(10) = }',
    f'{T(20) == T(10) = }',
    f'{T(10) == T(20) = }',
    sep='\n',
)

T(10) == T(10) = True
T(20) == T(10) = False
T(10) == T(20) = False

In this case, we've defined a custom __eq__ method for the class T, which compares the value attribute of instances. Notice that T(10) == T(10) returns True, and both T(20) == T(10) and T(10) == T(20) return False. Here, equality behaves as expected: it's commutative, meaning we can reorder our operands and obtain a consistent result.

Under the Hood

However, things get more interesting when we change the behavior of __eq__ in a subclass. Let’s override __eq__ in a derived class to see if equality can still be commutative.

from functools import wraps

def log(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print(f'{func.__qualname__}{args!r}')
        return func(*args, **kwargs)
    return wrapper

@dataclass
class Base:
    @log
    def __eq__(self, other):
        return False

@dataclass
class Derived(Base):
    @log
    def __eq__(self, other):
        return True

class Unrelated:
    @log
    def __eq__(self, other):
        return True

base      = Base()
derived   = Derived()
unrelated = Unrelated()

Here, the base class Base has an __eq__ method that always returns False. The Derived class, however, overrides __eq__ to always return True. And finally, we have a completely Unrelated class whose __eq__ method always returns True. Let’s see how this affects equality comparisons.

Base == Unrelated

base == unrelated

Base.__eq__(Base(), <__main__.Unrelated object at 0x772bd7555e20>)

False

unrelated == base

Unrelated.__eq__(<__main__.Unrelated object at 0x772bd7555e20>, Base())

True

Well, we just broke the mathematical commutative property. By defining custom __eq__ methods we can easily break this logic when comparing two custom objects. And you can see that we end up calling left.__eq__(right) each time we use the equality operator (==).

But why doesn't this error happen more often in our code? It turns out that there are a few cases where Python will skip the pattern of left.__eq__(right) in favor of right.__eq__(left)!

Base == Built-In

Some (but not all) built-in data structures can signal back to Python to use the reflected call right.__eq__(left). The standard integer is one of them.

base == 1   # left.__eq__(right)

Base.__eq__(Base(), 1)

False

1 == base   # right.__eq__(left) ??

Base.__eq__(Base(), 1)

False

But how does Python know to do this? What signal is the integer sending to flip the operands? Let's see if we can pick this apart by directly invoking the __eq__ method instead of using the equality operator.

(1).__eq__(base)

NotImplemented

Interestingly, the above returns NotImplemented. This value signals to Python that left.__eq__(right) cannot be performed and that it should attempt right.__eq__(left) instead. This helps the comparison 1 == base be commutative and align with the expected (mathematical) behavior.
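To make the signal concrete, here is a minimal sketch using two hypothetical classes (Celsius and Fahrenheit are invented for this illustration and come from no library). Celsius only knows how to compare itself with other Celsius objects and returns NotImplemented for anything else, which gives Fahrenheit the chance to answer.

```python
class Celsius:
    """Hypothetical class: only understands other Celsius objects."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __eq__(self, other):
        if not isinstance(other, Celsius):
            return NotImplemented  # defer to the other operand
        return self.degrees == other.degrees


class Fahrenheit:
    """Hypothetical class: understands both Fahrenheit and Celsius."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __eq__(self, other):
        if isinstance(other, Fahrenheit):
            return self.degrees == other.degrees
        if isinstance(other, Celsius):
            return self.degrees == other.degrees * 9 / 5 + 32
        return NotImplemented


boiling_c = Celsius(100)
boiling_f = Fahrenheit(212)

left_first = boiling_c == boiling_f    # Celsius defers, Fahrenheit answers
right_first = boiling_f == boiling_c   # Fahrenheit answers directly
direct_call = boiling_c.__eq__(boiling_f)

print(left_first, right_first, direct_call)
```

Because Celsius defers instead of guessing, both operand orders end up running the same Fahrenheit logic and agree with each other.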

Base == Derived

Finally, let's see what happens when we compare two objects that are related to one another via inheritance:

derived == base # left.__eq__(right)

Derived.__eq__(Derived(), Base())

True

base == derived # right.__eq__(left) ??

Derived.__eq__(Derived(), Base())

True

Here, derived == base calls Derived.__eq__, which always returns True, so the result is True even though Base.__eq__ would have returned False. The surprise is the second comparison: base == derived also calls Derived.__eq__, with the operands swapped. This shows how much a custom __eq__ implementation can affect equality, and why an override needs to be written carefully.

Interestingly, Python doesn't start with the __eq__ method of the object on the left-hand side! This is because the type of the right operand is a subclass of the type of the left operand. The subclass may have overridden __eq__ with more specific logic, so Python gives that implementation priority.
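The change in call order can be observed without any logging decorator. The sketch below uses three hypothetical classes (Parent, Child, Stranger) whose __eq__ methods record their own name and then return NotImplemented, so that Python keeps going and we can see every attempt.

```python
calls = []

class Parent:
    def __eq__(self, other):
        calls.append('Parent.__eq__')
        return NotImplemented

class Child(Parent):
    def __eq__(self, other):
        calls.append('Child.__eq__')
        return NotImplemented

class Stranger:
    def __eq__(self, other):
        calls.append('Stranger.__eq__')
        return NotImplemented

# Right operand is a subclass of the left operand's type: it goes first.
_ = Parent() == Child()
subclass_order = calls.copy()

calls.clear()

# Right operand is unrelated: the left operand goes first.
_ = Parent() == Stranger()
unrelated_order = calls.copy()

print(subclass_order)
print(unrelated_order)
```

Only the subclass relationship changes the order; an unrelated right operand still waits for the left operand to decline.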

But what happens if both sides of the comparator return NotImplemented for the given comparison?

@dataclass
class T:
    @log
    def __eq__(self, other):
        return NotImplemented

@dataclass
class G:
    @log
    def __eq__(self, other):
        return NotImplemented

T() == G() # you can see that we attempt both calls!

T.__eq__(T(), G())
G.__eq__(G(), T())

False

If neither direction returns a value, then Python falls back to an identity check (left is right), which is False for two distinct objects. If two different objects are unable to be compared, they are treated as not equal to one another.
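The fallback can be observed directly. In this minimal sketch (Shy is a hypothetical class that declines every comparison), an object still compares equal to itself because the last resort for == is an identity check, while two distinct instances compare as unequal.

```python
class Shy:
    """Hypothetical class that never answers an equality check itself."""
    def __eq__(self, other):
        return NotImplemented

a, b = Shy(), Shy()

same_object = a == a         # identity fallback: a is a
different_objects = a == b   # identity fallback: a is not b

print(same_object, different_objects)
```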

Comparison Summary

So now we’ve seen that for the equation $a = b$ Python will do its best to maintain the mathematical commutative property but also support flexible user-defined behavior. We use NotImplemented to signal to Python that the one-sided comparison doesn’t work and to try the other side. In a logical manner, we can say that for an equality operation like a == b...

1. Try a.__eq__(b).
2. If step 1 returns NotImplemented, try b.__eq__(a).
3. If step 2 also returns NotImplemented, fall back to an identity check (a is b), which is False for two distinct objects.

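This listing shows a very important thing: while one cannot guarantee whether left.__eq__(right) or right.__eq__(left) will produce the answer if one does not control the objects on both sides of the operator, we DO know that left.__eq__(right) will be tried first. The one exception is the inheritance case shown earlier: when the right operand's type is a subclass of the left operand's type, right.__eq__(left) gets the first attempt.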
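The lookup order can be sketched in plain Python. This is an approximation for illustration only, not CPython's actual implementation; simulate_eq and AlwaysTrue are hypothetical names introduced here. It includes the subclass priority seen in the Base == Derived example and assumes an identity check as the final fallback.

```python
def simulate_eq(left, right):
    """Approximate what `left == right` does (a sketch, not CPython's code)."""
    left_type, right_type = type(left), type(right)
    tried_reflected = False

    # A proper subclass on the right gets the first chance to answer.
    if left_type is not right_type and issubclass(right_type, left_type):
        tried_reflected = True
        result = right_type.__eq__(right, left)
        if result is not NotImplemented:
            return result

    # The usual first step: ask the left operand.
    result = left_type.__eq__(left, right)
    if result is not NotImplemented:
        return result

    # The left operand declined, so ask the right operand (once at most).
    if not tried_reflected:
        result = right_type.__eq__(right, left)
        if result is not NotImplemented:
            return result

    # Nobody answered: fall back to identity.
    return left is right


class AlwaysTrue:
    """Hypothetical class that claims to equal everything."""
    def __eq__(self, other):
        return True


samples = [1, 1.0, 'a', None, AlwaysTrue(), object()]
agrees = all(
    simulate_eq(x, y) == (x == y)
    for x in samples
    for y in samples
)
print(agrees)
```

Checking the sketch against the real operator on a handful of mixed-type pairs is a cheap way to confirm that the order above matches what the interpreter does for these cases.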

Real Application Problem

So what does this have to do with Modin and pandas? This entire blog post was written because of a curious bug in Modin, which arose from the comparison precedence described above and the way it can break commutativity.

As it turns out, one needs to be extra careful when writing these equality methods, especially when they can be compared against any arbitrary object that your user passes in. In this case, the guidance is to always put the object whose __eq__ method you have written on the left side of the comparison so that the logic you control is executed first, only falling back to the other object's __eq__ method if yours returns NotImplemented.

To see where this can be problematic, consider array broadcasting, which you may already be familiar with:

import pandas as pd

s = pd.Series([1, 2, 3])
s == 'a'

0    False
1    False
2    False
dtype: bool

'a' == s

0    False
1    False
2    False
dtype: bool

This works great when comparing against built-in types—and we all know why this works now. But what about all of the other objects in pandas?

What if some of our columns had a data type that is a specific instance of pandas.CategoricalDtype, and we wanted to identify each of those columns?

import pandas as pd

dtype = pd.CategoricalDtype(categories=['a', 'b', 'c'])
df = pd.DataFrame({
    'col1': [*'abac'],
    'col2': [*'xyxz'],
    'col3': [1,2,3,4]
}).astype({'col1': dtype, 'col2': 'category'})

display(
    df,
    df.dtypes
)

	col1	col2	col3
0	a	x	1
1	b	y	2
2	a	x	3
3	c	z	4

col1    category
col2    category
col3       int64
dtype: object

To find each of the columns that have a 'category' dtype, we can do the following:

df.dtypes == 'category'

col1     True
col2     True
col3    False
dtype: bool

This is commutative because 'category' is a string, and str.__eq__ returns NotImplemented when the other operand is not a string, which hands the comparison over to the Series.

'category' == df.dtypes

col1     True
col2     True
col3    False
dtype: bool

But what if we wanted to find not ANY categorical datatype, but the specific one we defined in the dtype variable?

df.dtypes == dtype

col1     True
col2    False
col3    False
dtype: bool

The above clearly works, but can we flip the operands?

dtype == df.dtypes

False

Unfortunately not. We’ve broken the commutative property here and have stumbled onto some possibly surprising behavior (at least it surprised me). The single False tells us that the dtype's own __eq__ answered the comparison itself instead of returning NotImplemented, so the Series never got the chance to broadcast. This means that if you ever perform this type of comparison, then the ordering of your operands is incredibly important. And, if you have an object whose logic should be prioritized, then you will need to be careful that your object is always on the left-hand side of the == operator.
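The same asymmetry can be reproduced without pandas. The sketch below is a toy model, not the pandas implementation: MiniSeries, StrictDtype, and PoliteDtype are hypothetical classes written for this illustration. StrictDtype answers False for anything it does not recognize, while PoliteDtype returns NotImplemented and lets the container broadcast.

```python
class MiniSeries:
    """Hypothetical stand-in for a Series: == broadcasts over its values."""
    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        return [value == other for value in self.values]


class StrictDtype:
    """Hypothetical dtype that answers False for unknown operands."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if isinstance(other, StrictDtype):
            return self.name == other.name
        return False  # the right operand never gets a chance


class PoliteDtype:
    """Hypothetical dtype that defers on unknown operands."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if isinstance(other, PoliteDtype):
            return self.name == other.name
        return NotImplemented  # let the right operand try


strict = StrictDtype('cat')
strict_dtypes = MiniSeries([strict, StrictDtype('int')])
strict_series_left = strict_dtypes == strict   # broadcasts
strict_dtype_left = strict == strict_dtypes    # collapses to one bool

polite = PoliteDtype('cat')
polite_dtypes = MiniSeries([polite, PoliteDtype('int')])
polite_series_left = polite_dtypes == polite   # broadcasts
polite_dtype_left = polite == polite_dtypes    # also broadcasts

print(strict_series_left, strict_dtype_left)
print(polite_series_left, polite_dtype_left)
```

In this toy model, the only difference between the two dtype classes is what they return for an operand they do not understand, and that alone decides whether swapping the operands changes the result.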

Wrap-Up: Equality in Python Isn't Always Equal

At first glance, == in Python might seem straightforward: just a simple check to see if two things are "equivalent." But as we've seen, Python's data model adds nuance to this operation. The __eq__ method allows objects to define their own rules for equality, and when the first method it tries returns NotImplemented, Python tries the other direction before giving up.

So, the next time you see a == b, remember: Python isn't just checking if they're equal; it’s letting a, b, or both weigh in on the decision. And sometimes, what looks like a simple comparison is actually a small negotiation between two types that don’t always agree with one another.

Have you seen other strange bugs in Python? Join the DUTC Discord server and chat with experts and network with other Python users.