Working With Bokeh Models
Hey all! This week, I want to talk a bit about one of my favorite web-friendly data visualization tools: Bokeh. I'll be delivering a FREE seminar on Bokeh on Friday, May 26th, and you won't want to miss it! Register here!
Bokeh is a very powerful library that tightly couples Python and JavaScript to create interactive, browser-based data visualizations.
While Bokeh does not have a high-level API like plotly.express (a similar tool), many other tools build upon Bokeh. I enjoy using it precisely because of its low-level nature, which gives me incredible control over the web-based data visualizations I want to share!
Creating Some Data
Let’s use a simple dataset for this example—we'll only be looking at continuous data for a scatter plot. The idea here isn't to show you the breadth of different plots you can make with Bokeh—you can find that in the documentation—but to highlight how to work with the Bokeh object models.
from numpy import linspace
from numpy.random import default_rng
rng = default_rng(0)
x = linspace(-5, 5, 80)
y = x + rng.normal(0, 1, size=len(x))
Updating Bokeh Styles
While there is a styles-like API for using your own themes (similar to writing your own rcParams in Matplotlib), I wanted to show you how to update features of your Bokeh plots in-line. The primary entry point for most Bokeh plots will be the high-level (yes, high-level) bokeh.plotting.figure function. This function returns a Bokeh figure that has been instantiated and has had many other models attached to it for your convenience. If you instead create a bokeh.models.Plot, you would need to create your own x & y axis objects, title, and much more.
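To get a feel for how much figure does for you, here is a rough sketch of building a comparable plot from a bare bokeh.models.Plot. The data values, the title text, and the names plot and manual_renderer are made up for illustration, and I'm using the Scatter glyph here as a stand-in for whatever glyph you'd actually want.

```python
from bokeh.models import (
    ColumnDataSource, LinearAxis, Plot, Range1d, Scatter, Title
)

# Illustrative data only, not the dataset used in this post
source = ColumnDataSource(data={'x': [-1, 0, 1], 'y': [-1.2, 0.1, 0.9]})

# A bare Plot has no axes or glyphs; we attach each one ourselves
plot = Plot(
    title=Title(text='Built by hand'),
    x_range=Range1d(-2, 2), y_range=Range1d(-2, 2),
    width=500, height=400, toolbar_location=None,
)

# Axes are layout objects that must be placed explicitly
plot.add_layout(LinearAxis(), 'below')
plot.add_layout(LinearAxis(), 'left')

# add_glyph pairs a data source with a glyph and returns the renderer
manual_renderer = plot.add_glyph(source, Scatter(x='x', y='y', size=8))
```

Every one of those steps (and a few more, like grids and tools) is handled for you when you call figure instead.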
The figure function takes many arguments for us to customize our plots, but we can also change these options post-hoc by reaching down into the underlying models that make up the figure!
from bokeh.plotting import figure, show
# Create figure w/ x/y-axis, plotting space,
#   and many other layouts (title, legend, layouts...)
p = figure(
    title='Linear Relationship Between x & y',
    width=500, height=400, toolbar_location=None
)
# Add a renderer to plot
#   each renderer owns a glyph & shares a ColumnDataSource
cir_renderer = p.circle(x=x, y=y)
# change properties of each Axis
p.xaxis.major_label_text_font_size = '14pt'
p.yaxis.major_label_text_font_size = '14pt'
p.title.text_font_size = '18pt'
# renderers track data fed from a source
print(
    cir_renderer.glyph, # the shape that is displayed at each coordinate pair
    cir_renderer.data_source,
    cir_renderer.data_source.data['x'][:3], # underlying data
    sep='\n'
)
Circle(id='p1051', ...)
ColumnDataSource(id='p1048', ...)
[-5.         -4.87341772 -4.74683544]
The ColumnDataSource is a powerful feature of Bokeh as it enables us to easily share data across multiple renderers. Renderers are in charge of knowing what glyphs to draw and where.
In this case, by using the figure.circle method, we are instantiating a new renderer that has access to the Circle glyph it needs to draw and the .data_source (ColumnDataSource) to know where to draw it.
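As a small sketch of that sharing, here are two renderers on one figure that both read from a single ColumnDataSource. The toy data and the names fig, line_renderer, and marker_renderer are just for illustration, and I'm passing the source explicitly rather than letting Bokeh create one for me.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Illustrative data only
source = ColumnDataSource(data={'x': [0, 1, 2], 'y': [0, 1, 4]})

fig = figure(width=300, height=300)

# Each glyph method returns a new renderer with its own glyph...
line_renderer = fig.line(x='x', y='y', source=source)
marker_renderer = fig.scatter(x='x', y='y', source=source)

# ...but both renderers point at the very same data source object
print(line_renderer.glyph)
print(marker_renderer.glyph)
print(line_renderer.data_source is marker_renderer.data_source)
```

The glyphs differ, but the last line prints True: there is only one copy of the data, and both renderers look up their coordinates in it by column name.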
Let’s go ahead and take a look at our plot!
show(p)
Let’s update our previous plot to also include color. In this case, I’ll use the Set1 color palette and have the first 40 data points be red and the latter 40 be blue. To map a color onto my glyphs, I'll first need to add a column of color values to my ColumnDataSource, then point the glyph's color properties at that column by name.
Typically, this would all be done in the original call to figure.circle, but this demonstrates the composability that the Bokeh API has for managing all of these different models.
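For comparison, here is roughly what that all-at-once version might look like. This is a standalone sketch: it rebuilds the same data, uses figure.scatter as a stand-in for figure.circle, and the names fig, source, and renderer are my own for this example.

```python
from numpy import linspace, repeat
from numpy.random import default_rng
from bokeh.models import ColumnDataSource
from bokeh.palettes import Set1
from bokeh.plotting import figure

rng = default_rng(0)
x = linspace(-5, 5, 80)
y = x + rng.normal(0, 1, size=len(x))

# First two colors of the 3-color Set1 palette
first_color, second_color = Set1[3][:2]

# Put coordinates and colors into the source up front
source = ColumnDataSource(data={
    'x': x,
    'y': y,
    'color': repeat([first_color, second_color], 40),
})

fig = figure(width=500, height=400, toolbar_location=None)

# Column names (not arrays) tell the glyph which fields to read
renderer = fig.scatter(x='x', y='y', color='color', source=source)
```

The code that follows takes the post-hoc route instead, updating the source and glyph that already exist.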
from numpy import repeat
from bokeh.models import ColumnDataSource
from bokeh.palettes import Set1
colors = Set1[3][:2]
# dynamically add data to an existing source
cir_renderer.data_source.add(repeat(colors, 40), 'color')
# update existing glyph to exhibit color
cir_renderer.glyph.fill_color = 'color'
cir_renderer.glyph.line_color = 'color'
show(p)
Linking Another Figure
Let’s create another figure with slightly different y-values (though the same x-values). Instead of creating a new ColumnDataSource, either manually or in the call to figure.circle, I actually want the renderers in the original figure and the new figure to share a source.
By sharing a ColumnDataSource, we gain access to some powerful features like linked brushing! We can select data points on one figure and see that same selection on the other figure.
You can also see a few ways that we can dynamically add tools to our figures and how to lay out multiple plots.
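Why does sharing a source link the selections? As far as I understand it, the selection is stored on the source itself rather than on any one figure. Here's a tiny sketch of that idea with toy data, where I set the selection in Python to mimic what a selection tool would do in the browser. The names top, bottom, r_top, and r_bottom are just for this example.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Illustrative data only
source = ColumnDataSource(data={
    'x': [0, 1, 2, 3],
    'y': [0, 1, 4, 9],
    'z': [3, 1, 4, 1],
})

top = figure(width=300, height=200, tools='box_select,lasso_select')
bottom = figure(width=300, height=200, tools='box_select,lasso_select')

# Two figures, two renderers, one shared source
r_top = top.scatter(x='x', y='y', source=source)
r_bottom = bottom.scatter(x='x', y='z', source=source)

# Stand-in for a user dragging a box over rows 1 and 2
source.selected.indices = [1, 2]

# Both renderers see the same selection because it lives on the source
print(r_top.data_source.selected.indices)
print(r_bottom.data_source.selected.indices)
```

Nothing was clicked here; assigning the indices just stands in for what the box or lasso tool does in the browser. The full two-figure layout follows.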
from bokeh.models import MultiLine
from bokeh.layouts import gridplot
from bokeh.models.tools import BoxSelectTool, LassoSelectTool, Toolbar
cir_renderer.data_source.add(x + rng.normal(0, 3, size=len(x)), 'z')
p.height = 200
p2 = figure(width=p.width, height=200)
p2.circle(x='x', y='z', source=cir_renderer.data_source, color='color')
# Sharing y-range is as easy as sharing the same instance
p.y_range = p2.y_range
# Uniformize tools
p.tools = p2.tools = [BoxSelectTool(), LassoSelectTool()]
# Can use selection tools (lasso/box) for linked brushing
show(
    gridplot(
        [[p], [p2]],
        merge_tools=True,
        toolbar_location='above',
    )
)
Wrap-Up
Thanks for tuning in to this quick Bokeh primer, and remember to attend my upcoming session “You Need to Try Bokeh” this Friday, May 26th!
Until next time!