Exploring Weather with Bokeh!

Hey everyone! This probably comes as a surprise, but I’m on another data-viz kick! This week, I wanted to share with you a way to interact with a few years of daily timeseries data.

We’ll be revisiting a fun dataset: daily temperature readings from New York City! The historical dataset spans decades, but for our purposes I wanted to limit it to five years’ worth and visualize the daily maximum and minimum temperatures, while letting you interactively zoom in and out on any set of dates.

Let’s start by loading that data:

from pandas import read_parquet

df = (
    read_parquet(
        'data/NYC_weather.parquet',
        columns=['date', 'measurement', 'value', 'm_flag', 'q_flag'],
    )
    .loc[lambda df: 
         ~df['q_flag'].isin(['I', 'W', 'X'])
         & df['m_flag'].isna()
         & df['measurement'].isin(['TMAX', 'TMIN'])
    ]
    .pivot(index='date', columns='measurement', values='value')
    .eval('''
        TMAX = 9/5 * (TMAX/10) + 32
        TMIN = 9/5 * (TMIN/10) + 32
    ''')
    .rename(columns={               # units post-conversion
        'TMAX': 'temperature_max',  # Fahrenheit
        'TMIN': 'temperature_min',  # Fahrenheit
    })
    .rename_axis(columns=None)
    .sort_index()
).loc['1995':'1999']  # label slices include both endpoints: five full years

df.head()
            temperature_max  temperature_min
date
1995-01-01            53.06            37.94
1995-01-02            46.94            26.96
1995-01-03            33.08            23.00
1995-01-04            33.98            19.94
1995-01-05            26.96            15.98
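
If the reshaping and unit conversion above feel a bit dense, here’s a minimal sketch of the same pivot-and-convert steps on a tiny made-up frame. The raw values are hypothetical (chosen so the results line up with the first two rows shown above), and the helper name `tenths_celsius_to_fahrenheit` is just for illustration. It assumes, as the `TMAX/10` in the `.eval` call suggests, that the raw readings are stored in tenths of a degree Celsius.

```python
from pandas import DataFrame, to_datetime

# Hypothetical long-format records: one row per (date, measurement)
raw = DataFrame({
    'date': to_datetime(
        ['1995-01-01', '1995-01-01', '1995-01-02', '1995-01-02']
    ),
    'measurement': ['TMAX', 'TMIN', 'TMAX', 'TMIN'],
    'value': [117, 33, 83, -28],  # assumed: tenths of a degree Celsius
})


def tenths_celsius_to_fahrenheit(value):
    """Convert tenths of a degree Celsius to degrees Fahrenheit."""
    return 9 / 5 * (value / 10) + 32


wide = (
    raw
    # long -> wide: one row per date, one column per measurement
    .pivot(index='date', columns='measurement', values='value')
    # every column is a temperature, so convert the whole frame at once
    .pipe(tenths_celsius_to_fahrenheit)
    .rename_axis(columns=None)
    .round(2)
)

print(wide)
```

The sketch swaps `.eval` for `.pipe` only because every column gets the same conversion; the arithmetic is identical.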

A pretty straightforward dataset: just two columns and our datetime index. Let’s get to plotting it with Bokeh!

from bokeh.io import output_notebook
output_notebook()

We’ll need a few components here:

  • Figure: core Bokeh plotting

  • Data abstraction: passing data to/from a Bokeh visualization

  • RangeTool: widget to introduce interactivity

We’ll need two figures: one that shows a zoomed-in view (a single year’s worth of data) and one that shows the entire five years’ worth of data for context.

Then we’ll use a RangeTool to link those two charts so that you can interact with the contextual chart and see the changes on the zoomed-in chart.

from bokeh.plotting import figure, ColumnDataSource
from bokeh.layouts import column
from bokeh.io import show
from bokeh.models import RangeTool, AdaptiveTicker
from pandas import DateOffset

cds = ColumnDataSource(df)

zoom_p = figure(
    width=500, height=250, 
    x_axis_type='datetime', 
    y_range=[0, 110],
    # show a range of 1 year by default
    x_range=[df.index.min(), df.index.min() + DateOffset(years=1, days=-1)],
    toolbar_location=None,
)

overall_p = figure(
    height=zoom_p.height // 4, width=zoom_p.width, 
    y_range=zoom_p.y_range, 
    x_range=[df.index.min(), df.index.max()],
    x_axis_type="datetime", toolbar_location=None,
)

# both figures share the same data and glyphs; only their ranges differ
for p in [zoom_p, overall_p]:
    p.vbar(
        x='date', bottom='temperature_min', top='temperature_max',
        source=cds,
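        # Bokeh measures datetime axes in milliseconds; one day is
        # 24 * 60 * 60 * 1000 ms, so the factor of 900 makes each bar
        # 90% of a day wide, leaving a small gap between neighbors
        width=24 * 60 * 60 * 900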
    )

# Fewer ticks for the overall figure on the y-axis (since it is short)
overall_p.yaxis.ticker = AdaptiveTicker(desired_num_ticks=2, num_minor_ticks=0)
rangetool = RangeTool(x_range=zoom_p.x_range)
overall_p.add_tools(rangetool)

show(
    column(zoom_p, overall_p)
)

Not too bad at all! A little bit of Bokeh really does go a long way. Try dragging the overlaid box around and watch the ranges on the zoomed-in figure change.
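
Two bits of date arithmetic in the plotting code are easy to gloss over: the default one-year window and the bar width in milliseconds. Here’s a small sketch that checks both using only pandas. The variable names are illustrative, and the start date is assumed to be the first date in the table above.

```python
from pandas import DateOffset, Timedelta, Timestamp

# Assumed start of the data (the first date shown in the table above)
start = Timestamp('1995-01-01')

# One year forward, then one day back: the last day of that same year
window_end = start + DateOffset(years=1, days=-1)

# Datetime axes in Bokeh are measured in milliseconds since the epoch
one_day_ms = Timedelta(days=1) // Timedelta(milliseconds=1)

# 90% of a day, the same number as 24 * 60 * 60 * 900
bar_width_ms = one_day_ms * 9 // 10

print(window_end.date(), one_day_ms, bar_width_ms)
```

Spelling the width out as a fraction of `one_day_ms` makes it a little more obvious that the bars are meant to be slightly narrower than a day.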

That’s all for this week, hope you enjoyed this fun Bokeh demonstration! See you again soon!