# Gantt Charts in Matplotlib

Hey everyone! Welcome to this week’s entry into Cameron’s Corner. This week, I’ve been busy teaching courses, working on some exciting TOPS updates, and helping James prep for a FREE popup seminar coming up on August 10th, “Solving Uno… the Right Way!” I can’t wait for you to see what he has in store.

For today’s post, I wanted to share a fun consulting project I’m working on which involves visualizing binary signals (on/off states) across multiple devices. These types of data are often visualized using stateful lines where they rapidly increase to a value of 1 to indicate an “on” state or drop to 0 to indicate an “off” state. However, for the volume of data that we are working with, the vertical lines become nearly impossible to track because there is no ramp-up in our signal.

For our purposes, we decided to move forward with a Gantt chart, where we use a colored rectangle to indicate the “on” state and a lack of color to indicate an “off” state.

## Data Creation

But, before we can get into the visualization, let’s create some data to play around with. In our data set, we have multiple signals we’re tracking (`'signal_id'`), and on top of that, we wanted to track multiple related—but separate—sources of those signals (`'buffer_id'`).

```python
from numpy.random import default_rng
from pandas import DataFrame

rng = default_rng(0)  # seeded so the data is reproducible

df = DataFrame({
    'signal_id': rng.choice(['A', 'B'], size=(n := 500)),
    'buffer_id': rng.choice([*range(7)], size=n),
    # cumulative start times, clipped so none fall below zero
    'start': (start := rng.uniform(-300, 1_000, size=n).cumsum().clip(0)),
    'stop': start + rng.uniform(100, 1_500, size=n),
}).eval('delta = stop - start')  # bar width: how long each signal stays on

df.head()
```
| | signal_id | buffer_id | start | stop | delta |
|---|---|---|---|---|---|
| 0 | B | 0 | 0.000000 | 118.210743 | 118.210743 |
| 1 | B | 2 | 643.517529 | 1902.385616 | 1258.868088 |
| 2 | B | 3 | 1567.735531 | 2362.476165 | 794.740634 |
| 3 | A | 2 | 1608.158751 | 2318.444043 | 710.285292 |
| 4 | A | 4 | 1323.890592 | 2266.403194 | 942.512602 |

## First Pass Gantt Chart

Let’s explore how we can create a Gantt chart in Matplotlib. The most direct way is to use the `Axes.broken_barh` method. It differs from `Axes.barh` primarily because it can draw numerous rectangles more efficiently and does not lock them to the left/bottom spine (see `Artist.sticky_edges`, which is used on the `Rectangle` instances returned from `Axes.bar`).

The interface that `Axes.broken_barh` exposes is quite similar to `Axes.barh`, except that we pass `xranges`, a sequence of `(xmin, xwidth)` pairs, and `yrange`, a single `(ymin, yheight)` pair shared by every bar in the call.
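To make the shape of those arguments concrete, here is a minimal sketch. The `intervals` list is made up for illustration, and the `Agg` backend is only selected so the snippet runs outside a notebook. It converts start/stop pairs into the start/width pairs that `broken_barh` expects and draws them in one call.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend; drop this line inside a notebook
from matplotlib.pyplot import subplots

# Hypothetical (start, stop) intervals for a single lane of bars.
intervals = [(0, 2), (3, 4.5), (6, 9)]

# broken_barh wants (xmin, xwidth) pairs, not (start, stop) pairs.
xranges = [(start, stop - start) for start, stop in intervals]

fig, ax = subplots()
# yrange is one (ymin, yheight) pair applied to every bar in this call.
bars = ax.broken_barh(xranges=xranges, yrange=(-0.4, 0.8))

# All rectangles live in a single collection, one path per bar.
n_bars = len(bars.get_paths())
```

This is the same conversion the post's `delta = stop - start` column performs, which is why the later calls pass `df[['start', 'delta']]` rather than `df[['start', 'stop']]`.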

```python
%matplotlib inline
```

```python
from matplotlib.pyplot import rc

rc('figure', facecolor='white')
rc('font', size=16)
rc('axes.spines', top=False, right=False, left=False)
```

```python
from matplotlib.pyplot import subplots

fig, ax = subplots(figsize=(16, 4))
ax.broken_barh(xranges=df[['start', 'delta']].to_numpy(), yrange=(0, 1))
ax.margins(0)
ax.yaxis.set_tick_params(left=False, labelleft=False)
ax.xaxis.set_major_formatter(lambda x, pos: f'{x/1000:g}')
ax.set_xlabel('Elapsed Time (ms)');
```

You can see that, while this chart highlights whenever the signals are on/off, it’s missing much of the context that we’re interested in: which signal, and where did it originate?

## Juxtapose the Signals

Let’s see if we can accomplish this with juxtaposition. By that, I mean that I’ll create two separate charts (one for each unique signal ID) and, at the same time, splay out each `'buffer_id'` along the y-axis to pull these pieces apart.

```python
from matplotlib.pyplot import subplots, rc, get_cmap

colors = get_cmap('Set1').colors
rc('font', size=16)

# one chart per unique signal, stacked vertically
fig, axes = subplots(
    df['signal_id'].nunique(), 1, figsize=(16, 8), sharey=True, sharex=True
)

for c, ax, (sig, group) in zip(colors, axes.flat, df.groupby('signal_id')):
    # one lane per buffer, centered on its integer y position
    for i, (buffer, buffer_group) in enumerate(group.groupby('buffer_id')):
        ax.broken_barh(
            xranges=buffer_group[['start', 'delta']].to_numpy(),
            yrange=(i-.4, .8), facecolor=c
        )

    ax.set_yticks(sorted(df['buffer_id'].unique()))
    ax.set_yticklabels(sorted(df['buffer_id'].unique()))
    ax.set_title(f'Signal {sig}', loc='left', size='large')
    ax.spines['left'].set_visible(False)
    ax.xaxis.set_tick_params(labelbottom=True)
    ax.set_ylabel('Buffer ID')

# the x-axis is shared, so format and label it once on the bottom chart
ax.xaxis.set_major_formatter(lambda x, pos: f'{x/1000:g}')
ax.set_xlabel('Elapsed Time (ms)')

ax.margins(0)

fig.tight_layout();
```

This is looking quite nice! But, by relying on juxtaposition, we can’t easily compare “Signal A” to “Signal B” within the same `'buffer_id'`. In this case, we can use a different approach, superimposition, to better facilitate that comparison.

## Superimpose the Signals

Creating a superimposed chart will require more care than the previous approach. In Matplotlib, we need to manually track the positions of each `PolyCollection`. We want two sets of Gantt bars for each `'buffer_id'` (one for “Signal A” and another for “Signal B”). From there, we’ll clean up some of the aesthetics and add an inline legend so that we know which bars/colors relate to which signal.

```python
from matplotlib.pyplot import subplots, rc, get_cmap
from matplotlib.ticker import MultipleLocator

colors = get_cmap('Set1').colors
rc('font', size=16)

fig, ax = subplots(figsize=(16, 8))

for i, (buffer, group) in enumerate(df.groupby('buffer_id')):
    for color, (sig, sig_group) in zip(colors, group.groupby('signal_id')):
        # each buffer row holds two bars: A starts at the tick, B ends at it
        height = .3
        if sig == 'A':
            offset = 0
        elif sig == 'B':
            offset = -height

        ax.broken_barh(
            xranges=sig_group[['start', 'delta']].to_numpy(), yrange=(i+offset, height),
            color=color, label=sig, lw=0
        )

        # inline legend: label each signal once, beside the first buffer row
        if i == 0:
            ax.annotate(
                f'Signal {sig}',
                xy=(1, i + offset + (height / 2)), xycoords=ax.get_yaxis_transform(),
                xytext=(5, 0), textcoords='offset points',
                size='large', ha='left', va='center',
                color=color
            )

ax.set_yticks(sorted(df['buffer_id'].unique()))
ax.set_yticklabels(sorted(df['buffer_id'].unique()))

ax.set_title('Visualization of Signal Overlap', loc='left', size='x-large', pad=15)
ax.set_ylabel('Buffer ID')
ax.spines['left'].set_visible(False)

# hide the major grid (drawn in the background color) and separate
# the buffer rows with minor gridlines at the half positions
ax.yaxis.set_tick_params(left=False, which='both')
ax.yaxis.set_minor_locator(MultipleLocator(.5))
ax.yaxis.grid(color=ax.get_facecolor(), which='major')
ax.yaxis.grid(which='minor')
ax.margins(0)

ax.xaxis.set_major_formatter(lambda x, pos: f'{x/1000:g}')
ax.set_xlabel('Elapsed Time (ms)')

ax.invert_yaxis();
```
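The `if`/`elif` offsets in the loop above work because there are exactly two signals. If more signals were ever added, the same position tracking could be computed instead of hard-coded. The helper below is a hypothetical sketch (`stacked_offsets` is not part of Matplotlib or of the code above): it centers a stack of equally tall bars on each row's tick position.

```python
import math

def stacked_offsets(labels, height):
    """Map each label to the offset of its bar's lower edge.

    The stack of len(labels) bars, each `height` tall, is centered on the
    row's tick position (offset 0). Hypothetical helper for illustration.
    """
    n = len(labels)
    return {label: (k - n / 2) * height for k, label in enumerate(labels)}

# With two signals this reproduces the hard-coded values: B -> -height, A -> 0.
offsets = stacked_offsets(['B', 'A'], height=.3)

# With three signals the outermost edges stay symmetric around the tick.
three = stacked_offsets(['A', 'B', 'C'], height=.2)
lowest_edge = min(three.values())
highest_edge = max(three.values()) + .2
```

Inside the loop, `yrange=(i + offsets[sig], height)` would then replace the branch on `sig`.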

And there we have it: a superimposed Gantt chart to explore our binary signals. One thing still missing is a specific message to communicate. Although we’ve created a fairly nice exploratory chart, I would need to supplement it with additional visuals if I wanted to truly communicate something about how much overlap occurred between the signals.
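As a starting point for that kind of follow-up, the amount of overlap can be computed directly from the start/stop pairs before deciding how to draw it. The sketch below is only one possible approach, and the function names and sample intervals are made up for illustration. Because intervals from the same signal can overlap each other in this data, each set is merged first and the two merged sets are then intersected.

```python
def merge(intervals):
    """Combine overlapping (start, stop) pairs into disjoint, sorted intervals."""
    merged = []
    for start, stop in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # extends (or sits inside) the previous interval
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            merged.append([start, stop])
    return merged

def overlap_duration(a, b):
    """Total time during which both interval sets are 'on' simultaneously."""
    a, b = merge(a), merge(b)
    i = j = 0
    total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total

# Hypothetical intervals for two signals within one buffer.
signal_a = [(0, 10), (5, 15), (30, 40)]
signal_b = [(12, 35)]
shared = overlap_duration(signal_a, signal_b)
```

With the post's data, the inputs would come from something like `group[['start', 'stop']].to_numpy().tolist()` for each signal within a `'buffer_id'`.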

## Wrap Up

Thanks for checking out my blog post this week! Gantt charts are a great way to communicate a binary signal in a fairly dense format.

And, don’t forget to check out James’ FREE seminar, “Solving Uno… the Right Way!” I’ll see you there!