Working With Bokeh Models
Hey all! This week, I want to talk a bit about one of my favorite web-friendly data visualization tools: Bokeh. I’ll be delivering a FREE seminar on Bokeh on Friday, May 26th, and you won’t want to miss it! Register here!
Bokeh is a very powerful library that boasts tight coupling between Python and JavaScript to create interactive web-browser-based data visualizations.
While Bokeh does not have a high-level API like plotly.express (a similar tool), there are many other tools that build upon Bokeh. Because of its low-level nature, I enjoy using it: it gives me incredible control over the web-based data visualizations I want to share!
Creating Some Data
Let’s use a simple dataset for this example—we’ll only be looking at continuous data for a scatter plot. The idea here isn’t to show you the breadth of different plots you can make with Bokeh—you can find that in the documentation—but to highlight how to work with the Bokeh object models.
from numpy import linspace
from numpy.random import default_rng
rng = default_rng(0)
x = linspace(-5, 5, 80)
y = x + rng.normal(0, 1, size=len(x))
Updating Bokeh Styles
While there is a styles-like API for using your own themes (similar to writing your own rcParams in Matplotlib), I wanted to show you how to update features of your Bokeh plots in-line. The primary entry point for most Bokeh plots will be the high-level (yes, high-level) bokeh.plotting.figure function. This function returns a Bokeh figure that has been instantiated and has had many other models attached to it for your convenience. If you instead create a bokeh.models.Plot, you would need to create your own x & y axis objects, title, and much more.
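To give a feel for what figure saves you, here is a minimal sketch of assembling a plot by hand from bokeh.models.Plot. The toy data, the title text, and the variable names are my own placeholders, and I'm assuming a Bokeh 3.x install (where Plot takes width and height). It is not a full replacement for figure, since it skips grids, tools, and legends.

```python
from bokeh.models import (
    ColumnDataSource, DataRange1d, LinearAxis, Plot, Scatter, Title,
)

# Placeholder data, just enough to have something to draw
source = ColumnDataSource(data={"x": [1, 2, 3], "y": [2, 4, 6]})

# A bare Plot: no axes, grids, or tools are attached for us
plot = Plot(
    x_range=DataRange1d(), y_range=DataRange1d(),
    width=500, height=400, toolbar_location=None,
    title=Title(text="Built by hand"),
)

# Each axis is its own model that we attach to a side of the plot
plot.add_layout(LinearAxis(), "below")
plot.add_layout(LinearAxis(), "left")

# add_glyph wraps the glyph and source in a renderer and returns it
renderer = plot.add_glyph(source, Scatter(x="x", y="y", size=8))
```

Every one of those pieces is something figure would have wired up automatically.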
The figure function takes many arguments for us to customize our plots, but we can also change these options post-hoc by reaching down into the underlying models that make up the figure!
from bokeh.plotting import figure, show
# Create figure w/ x/y-axis, plotting space,
# and many other layouts (title, legend, layouts...)
p = figure(
    title='Linear Relationship Between x & y',
    width=500, height=400, toolbar_location=None
)
# Add a renderer to plot
# each renderer owns a glyph & shares a ColumnDataSource
cir_renderer = p.circle(x=x, y=y)
# change properties of each Axis
p.xaxis.major_label_text_font_size = '14pt'
p.yaxis.major_label_text_font_size = '14pt'
p.title.text_font_size = '18pt'
# renderers track data fed from a source
print(
    cir_renderer.glyph, # the shape that is displayed at each coordinate pair
    cir_renderer.data_source,
    cir_renderer.data_source.data['x'][:3], # underlying data
    sep='\n'
)
Circle(id='p1051', ...)
ColumnDataSource(id='p1048', ...)
[-5. -4.87341772 -4.74683544]
The ColumnDataSource is a powerful feature of Bokeh as it enables us to easily share data across multiple renderers. Renderers are in charge of knowing what glyphs to draw and where.
In this case, by using the figure.circle method, we are instantiating a new renderer that has access to the Circle glyph it needs to draw and the .data_source (ColumnDataSource) to know where to draw it.
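As a small illustration of that sharing, here is a sketch where two renderers on one figure read from a single ColumnDataSource. The toy data and the names (source, fig, line_r, pts_r) are made up for this example and are separate from the plot we are building in this post.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# One source, created explicitly instead of letting figure build it
source = ColumnDataSource(data={"x": [0, 1, 2], "y": [0, 1, 4]})

fig = figure(width=300, height=300)

# Two renderers, two different glyphs, one shared source
line_r = fig.line(x="x", y="y", source=source)
pts_r = fig.scatter(x="x", y="y", source=source, size=10)

# Both renderers point at the very same object
print(line_r.data_source is pts_r.data_source)
print(type(line_r.glyph).__name__, type(pts_r.glyph).__name__)
```

Because both renderers hold a reference to the same source, any column you add or change there is available to each of them.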
Let’s go ahead and take a look at our plot!
show(p)
Let’s update our previous plot to also include a color. In this case, I’ll use the Set1 color palette and have the first 40 data points be red and the latter 40 be blue. To map a color onto my glyphs, I’ll need to first update my ColumnDataSource to contain the color mapping information, then update the field mappings from my glyph to point to the column data source to find the color information. Typically, this would all be done in the original call to figure.circle, but this demonstrates the composability that the Bokeh API has for managing all of these different models.
from numpy import repeat
from bokeh.models import ColumnDataSource
from bokeh.palettes import Set1
colors = Set1[3][:2]
# dynamically add data to an existing source
cir_renderer.data_source.add(repeat(colors, 40), 'color')
# update existing glyph to exhibit color
cir_renderer.glyph.fill_color = 'color'
cir_renderer.glyph.line_color = 'color'
show(p)
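For comparison, here is a rough sketch of the more typical one-step route, where the color column lives in the source from the start and is passed in the glyph call. I'm building a fresh source and figure (source, p_alt, r_alt are names I made up for this sketch), and I'm using figure.scatter rather than figure.circle, so treat it as an approximation of the plot above and not an exact reproduction.

```python
from numpy import linspace, repeat
from numpy.random import default_rng
from bokeh.models import ColumnDataSource
from bokeh.palettes import Set1
from bokeh.plotting import figure

rng = default_rng(0)
x = linspace(-5, 5, 80)

# Put x, y, and the per-point color into one source up front
source = ColumnDataSource(data={
    "x": x,
    "y": x + rng.normal(0, 1, size=len(x)),
    "color": repeat(Set1[3][:2], 40),  # first 40 red, last 40 blue
})

p_alt = figure(width=500, height=400)

# color="color" refers to the column name, not a literal color
r_alt = p_alt.scatter(x="x", y="y", color="color", source=source)
```

The end result should look much the same. The step-by-step version above just makes the individual models easier to see.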
Linking Another Figure
Let’s create another figure with slightly different y-values (though the same x-values). Instead of creating a new ColumnDataSource, either manually or in the call to figure.circle, I actually want the renderers in the original figure and the new figure to share a source.
By sharing a ColumnDataSource, we gain access to some powerful features like linked brushing! So, we can select data points on one figure and also see that selection on the other figure!
You can also see a few ways that we can dynamically add tools to our figures and how to lay out multiple plots.
from bokeh.models import MultiLine
from bokeh.layouts import gridplot
from bokeh.models.tools import BoxSelectTool, LassoSelectTool, Toolbar
cir_renderer.data_source.add(x + rng.normal(0, 3, size=len(x)), 'z')
p.height = 200
p2 = figure(width=p.width, height=200)
p2.circle(x='x', y='z', source=cir_renderer.data_source, color='color')
# Sharing y-range is as easy as sharing the same instance
p.y_range = p2.y_range
# Uniformize tools
p.tools = p2.tools = [BoxSelectTool(), LassoSelectTool()]
# Can use selection tools (lasso/box) for linked brushing
show(
    gridplot(
        [[p], [p2]],
        merge_tools=True,
        toolbar_location='above',
    )
)
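If you're curious why linked brushing falls out of sharing a source, here is a small sketch: the selection is stored on the ColumnDataSource itself, so every renderer reading from that source sees the same selected indices. The data and names (shared, top, bottom) are invented for this example, and I'm setting the selection from Python purely for illustration. In the browser, the box and lasso tools do this for you.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

shared = ColumnDataSource(data={
    "x": [0, 1, 2, 3],
    "y": [1, 3, 2, 4],
    "z": [4, 2, 3, 1],
})

tools = "box_select,lasso_select"
top = figure(width=300, height=200, tools=tools)
bottom = figure(width=300, height=200, tools=tools)

# Same source, different y columns
r_top = top.scatter(x="x", y="y", source=shared)
r_bottom = bottom.scatter(x="x", y="z", source=shared)

# Mark rows 0 and 2 as selected on the source
shared.selected.indices = [0, 2]

# Both renderers report the same selection
print(r_top.data_source.selected.indices)
print(r_bottom.data_source.selected.indices)
```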
Wrap-Up
Thanks for tuning in to this quick Bokeh primer, and remember to attend my upcoming session “You Need to Try Bokeh” this Friday, May 26th!
Until next time!