Interactive Visualizations With Bokeh - Part 1

In this post, we will walk through some plotting examples (scatter, circle, line, bar, and more) with the Bokeh Python library.

What is Bokeh?

Bokeh is a Python library for creating interactive visualizations for modern web browsers. It helps you build beautiful graphics, ranging from simple plots to complex dashboards with streaming datasets. With Bokeh, you can create JavaScript-powered visualizations without writing any JavaScript yourself.

Installing Bokeh

Bokeh is officially supported and tested on Python 3.7 and above (CPython). We can install Bokeh with either conda or pip:

Installing with pip

pip install bokeh

Installing with conda

conda install bokeh

Import libraries

In [1]:
from bokeh.io import output_notebook, show
from bokeh.plotting import figure

When using the bokeh.plotting interface, there are a few common imports:

1. Use the figure function to create new plot objects to work with.
2. Call the functions output_file or output_notebook (possibly in combination) to tell Bokeh how to display or save output.
3. Execute show and save to display or save plots and layouts.
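
The notebook cells in this post only use output_notebook and show. As a minimal sketch of the other combination mentioned above, the following writes a plot to a standalone HTML file with output_file and save. The file name points.html and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

from bokeh.io import output_file, save
from bokeh.plotting import figure

# A small line plot to write to disk.
p = figure(width=400, height=400, title="Saved plot")
p.line([1, 2, 3, 4, 5], [9, 7, 2, 4, 5])

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "points.html")

# Tell Bokeh where standalone HTML output should go ...
output_file(out_path, title="Saved plot")

# ... then write the file without opening a browser tab.
saved_path = save(p)
print(saved_path)
```

Calling show(p) instead of save(p) would write the same file and also open it in a browser.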

Call output_notebook() function

In [2]:
output_notebook()
Loading BokehJS ...

Because we are working in a Jupyter notebook, we call output_notebook() first, as in the cell above. We only need to call this once, and all subsequent calls to show() will display inline in the notebook.

Plot points

Declare data

In [3]:
x = [1, 2, 3, 4, 5]
y = [9, 7, 2, 4, 5]

Create a figure

The figure function is at the core of the bokeh.plotting interface. This function creates a Figure model that includes methods for adding different kinds of glyphs to a plot. This function also takes care of composing the various elements of your visualization, such as axes, grids, and default tools.

figure(**kwargs: Any) → bokeh.plotting.figure.Figure

Create a new Figure for plotting.

All other keyword arguments are passed to Figure.

Returns

    Figure
In [4]:
p = figure(plot_width = 400, plot_height = 400, title = "Points")
help(p.circle)

Call a glyph method such as p.circle on the figure

In [5]:
p.circle(x, y)

# show the figure
show(p)

Add some attributes to the circle.

In [6]:
p = figure(plot_width = 400, plot_height = 400, title = "Points")
p.circle(x, y, size = 15, line_color = "red", fill_color = "blue", fill_alpha = 0.5, line_width = 2)
show(p)

We can set properties directly on glyph objects. Glyph objects are found on GlyphRenderer objects, which are returned by the Plot.add_glyph and bokeh.plotting glyph methods like circle, rect, etc.

In [7]:
p = figure(plot_width = 400, plot_height = 400)

r = p.circle(x, y)

r.glyph.size = 30
r.glyph.fill_alpha = 0.2
r.glyph.fill_color = "red"
r.glyph.line_color = "firebrick"
r.glyph.line_dash = 'dashed'
r.glyph.line_width = 2

show(p)

Accepted values for line_dash

line_dash

a line style to use

    'solid'

    'dashed'

    'dotted'

    'dotdash'

    'dashdot'

    an array of integer pixel distances that describe the on-off pattern of dashing to use

    a string of spaced integers matching the regular expression "^(\d+(\s+\d+)*)?$" that describe the on-off pattern of dashing to use
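
To make the string form concrete, here is a small sketch in plain Python that checks a string against the regular expression quoted above and splits it into pixel lengths. The helper parse_dash_string is hypothetical and not part of Bokeh. It covers only the spaced-integer form, not the named styles such as 'dashed'.

```python
import re

# The pattern quoted in the list above: zero or more whitespace-separated integers.
DASH_PATTERN = re.compile(r"^(\d+(\s+\d+)*)?$")


def parse_dash_string(spec):
    """Return the on-off pixel lengths, or None if spec does not match the pattern.

    Hypothetical helper for illustration only; Bokeh does its own validation.
    """
    if DASH_PATTERN.match(spec) is None:
        return None
    return [int(token) for token in spec.split()]


print(parse_dash_string("15 1"))    # [15, 1]
print(parse_dash_string("4  4 2"))  # [4, 4, 2]
print(parse_dash_string("dashed"))  # None (named styles are not spaced integers)
print(parse_dash_string(""))        # [] (the empty string also matches)
```
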
In [8]:
p = figure(plot_width = 400, plot_height = 400)

r = p.circle(x, y)

r.glyph.size = 30
r.glyph.fill_alpha = 0.2
r.glyph.line_color = "firebrick"
r.glyph.line_dash = [15, 1]
r.glyph.line_width = 2

show(p)

Plot a scatter square

In [9]:
p = figure(plot_width = 400, plot_height = 400, title = "Scatter square")

# add a square renderer with per-point sizes, a color, and an alpha
p.square(x, y, size = [10, 15, 20, 25, 30], color = "green", alpha = 0.6)

show(p)

Plot a cross circle

In [10]:
p = figure(plot_width=400, plot_height=400)
p.circle_cross(x, y, size = [10, 15, 20, 25, 30], line_color = "navy", fill_color = "orange", alpha = 0.6)

show(p)

Line Plot

In [11]:
p = figure(plot_width=400, plot_height=400, title="Line plot")
p.line(x, y)

show(p)

ColumnDataSource

There are various ways to provide data to Bokeh, from passing data values directly to glyph methods (as in the examples above) to creating a ColumnDataSource.

To create a basic ColumnDataSource object, you need a Python dictionary to pass to the object’s data parameter:

I. Bokeh uses the dictionary’s keys as column names.

II. The dictionary’s values are used as the data values for your ColumnDataSource.
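
As a minimal sketch of the two points above, the following builds a ColumnDataSource from a dictionary and reads the column names and values back. The column names x, y1 and y2 are chosen here for illustration.

```python
from bokeh.models import ColumnDataSource

# Keys become column names; values become the column data.
data = {
    "x": [1, 2, 3, 4, 5],
    "y1": [9, 7, 2, 4, 5],
    "y2": [1, 4, 2, 2, 3],
}
source = ColumnDataSource(data=data)

print(sorted(source.column_names))
print(list(source.data["y1"]))
```

Glyph methods can then refer to columns by name, for example x='x' together with source=source.
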
In [12]:
from bokeh.models import ColumnDataSource

Stacked lines

In [13]:
x = [1, 2, 3, 4, 5]
y1 = [9, 7, 2, 4, 5]
y2 = [1, 4, 2, 2, 3]

source = ColumnDataSource(data=dict(
    x = x,
    y1 = y1,
    y2 = y2,
))
p = figure(width=400, height=400)

p.vline_stack(['y1', 'y2'], x='x', source=source)

show(p)
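
To see which y values the stacked lines end up at, the stacking can be reproduced in plain Python. This sketch assumes the usual stacking behaviour: the first line follows y1 and the second follows y1 + y2. It does not call Bokeh.

```python
from itertools import accumulate

y1 = [9, 7, 2, 4, 5]
y2 = [1, 4, 2, 2, 3]

# Each layer is drawn on top of the running sum of the layers before it.
layers = [y1, y2]
stacked = list(
    accumulate(layers, lambda lower, layer: [a + b for a, b in zip(lower, layer)])
)

print(stacked[0])  # [9, 7, 2, 4, 5]
print(stacked[1])  # [10, 11, 4, 6, 8]
```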

Plots with Multiple Glyphs

In [14]:
p = figure(plot_width = 400, plot_height = 400)

# add both a line and circles on the same plot
p.line(x, y, line_width = 2)
p.circle(x, y, fill_color = "red", size = 10)

show(p)
In [15]:
p = figure(plot_width = 400, plot_height = 400)

# add two lines to the same plot, one marked with circles and one with squares
p.line(x, y, line_width = 2)
p.circle(x, y, fill_color = "red", size = 20)

p.line(x, y2, line_width = 2)
p.square(x, y2, fill_color = "green", alpha = 0.6, size = 20)

show(p)

Step plot

In [16]:
p = figure(plot_width = 400, plot_height = 400)

p.step(x, y, line_color = 'red', line_width = 2)

show(p)

Bar charts

To create a plot with a categorical range, we pass the ordered list of categorical values to figure, e.g. x_range=['a', 'b', 'c']. In the plot below, we pass the list of students as x_range, and we can see those reflected on the x-axis.

The vbar glyph method takes an x location for the center of the bar, a top and bottom (which defaults to 0), and a width. When we use a categorical range as we do here, each category implicitly has a width of 1, so setting width to a value below 1 (0.5 in the example below) makes the bars shrink away from each other. (Another option would be to add some padding to the range.)

In [17]:
students = ['John', 'Maria', 'Martha', 'Mary', 'Johny']
marks = [50, 30, 90, 60, 40]

p = figure(x_range = students, plot_height = 400, title = "Students Marks")

p.vbar(x = students, top = marks, width = 0.5, color = "#fc8d59")

show(p)
In [18]:
from bokeh.palettes import Spectral6

Many palettes are available in bokeh.palettes. I am using Spectral6 in this example. We can list the available palette names with the dir() function, after importing the module itself:

import bokeh.palettes
dir(bokeh.palettes)
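
As a quick sketch of what a palette actually is, the following inspects Spectral6 and looks the same palette up through all_palettes, which groups palettes by name and size. The printed hex values depend on the palette definition, so no expected output is shown.

```python
from bokeh.palettes import Spectral6, all_palettes

# A palette is just a sequence of hex color strings.
print(len(Spectral6))
print(list(Spectral6))

# all_palettes maps a palette family name to a dict keyed by palette size.
spectral_sizes = sorted(all_palettes["Spectral"])
print(spectral_sizes)

# The six-color entry of the "Spectral" family is the same palette as Spectral6.
same_palette = list(all_palettes["Spectral"][6]) == list(Spectral6)
print(same_palette)
```

Because Spectral6 has six colors, it lines up with a six-row data source such as the six students in the next cell.
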
In [19]:
students = ['John', 'Maria', 'Martha', 'Mary', 'Johny', 'Priyanka']
marks = [50, 30, 90, 60, 40, 80]

source = ColumnDataSource(data = dict(students = students, marks = marks, color = Spectral6))

p = figure(x_range = students, plot_height = 400, title = "Students")
p.vbar(x = 'students', top = 'marks', width = 0.4, color = 'color', legend_field = "students", source = source)

p.xgrid.grid_line_color = None
p.legend.orientation = "horizontal"
p.legend.location = "top_center"

show(p)

Grouped Bar Charts

In [20]:
from bokeh.models import FactorRange
In [21]:
students = ['John', 'Maria', 'Martha', 'Mary', 'Johny', 'Priyanka']
subjects = ['Maths', 'Physics', 'Chemistry', 'Language']

data = {'students' : students,
        'Maths'   : [55, 35, 85, 99, 45, 65],
        'Physics'   : [65, 45, 80, 76, 65, 90],
        'Chemistry'   : [50, 65, 90, 85, 70, 85],
        'Language'  : [87, 78, 85, 65, 60, 76]
       }

x = [ (student, subject) for student in students for subject in subjects ]
print("X: ", x)

counts = sum(zip(data['Maths'], data['Physics'], data['Chemistry'], data['Language']), ()) # like an hstack
print("Counts: ", counts)

source = ColumnDataSource(data = dict(x = x, counts = counts))

p = figure(x_range = FactorRange(*x), plot_height = 250, title = "Students Marks by Subject")

p.vbar(x = 'x', top = 'counts', width = 0.9, source = source)

p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xaxis.major_label_orientation = 1
p.xgrid.grid_line_color = None

show(p)
X:  [('John', 'Maths'), ('John', 'Physics'), ('John', 'Chemistry'), ('John', 'Language'), ('Maria', 'Maths'), ('Maria', 'Physics'), ('Maria', 'Chemistry'), ('Maria', 'Language'), ('Martha', 'Maths'), ('Martha', 'Physics'), ('Martha', 'Chemistry'), ('Martha', 'Language'), ('Mary', 'Maths'), ('Mary', 'Physics'), ('Mary', 'Chemistry'), ('Mary', 'Language'), ('Johny', 'Maths'), ('Johny', 'Physics'), ('Johny', 'Chemistry'), ('Johny', 'Language'), ('Priyanka', 'Maths'), ('Priyanka', 'Physics'), ('Priyanka', 'Chemistry'), ('Priyanka', 'Language')]
Counts:  (55, 65, 50, 87, 35, 45, 65, 78, 85, 80, 90, 85, 99, 76, 85, 65, 45, 65, 70, 60, 65, 90, 85, 76)
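
The sum(zip(...), ()) idiom above interleaves the per-subject lists so that the marks come out in the same (student, subject) order as x. The sketch below repeats that step in plain Python and shows itertools.chain.from_iterable as an equivalent that some readers find easier to follow. The variable names are chosen here for illustration.

```python
from itertools import chain

students = ['John', 'Maria', 'Martha', 'Mary', 'Johny', 'Priyanka']
subjects = ['Maths', 'Physics', 'Chemistry', 'Language']
data = {
    'Maths': [55, 35, 85, 99, 45, 65],
    'Physics': [65, 45, 80, 76, 65, 90],
    'Chemistry': [50, 65, 90, 85, 70, 85],
    'Language': [87, 78, 85, 65, 60, 76],
}

# One (student, subject) factor per bar, student-major order.
x = [(student, subject) for student in students for subject in subjects]

# zip(...) yields one tuple of marks per student; both lines below flatten them.
per_student = list(zip(*(data[subject] for subject in subjects)))
counts_sum = sum(per_student, ())
counts_chain = tuple(chain.from_iterable(per_student))

print(counts_sum == counts_chain)  # True
print(counts_sum[:4])              # (55, 65, 50, 87)

# Pairing factors with counts confirms that the two sequences line up.
marks_by_factor = dict(zip(x, counts_sum))
print(marks_by_factor[('Mary', 'Maths')])  # 99
```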

Creating layouts

In [22]:
from bokeh.layouts import column

Column layout

To display plots or widgets vertically, use the column() function.

In [23]:
x = [1, 2, 3, 4, 5]
y1 = [9, 7, 2, 4, 5]
y2 = [1, 4, 2, 2, 3]
In [24]:
p1 = figure(plot_width = 250, plot_height = 250)
p1.circle(x, y1, fill_color = "green", size = 20)

p2 = figure(plot_width = 250, plot_height = 250)
p2.line(x, y2, line_width = 2)

show(column(p1, p2))

Row layout

In [25]:
from bokeh.layouts import row

To display plots or widgets horizontally, use the row() function.

In [26]:
p1 = figure(plot_width = 250, plot_height = 250)
p1.circle(x, y1, fill_color = "red", size = 20)

p2 = figure(plot_width = 250, plot_height = 250)
p2.square(x, y2, size = 20, color = "green")

show(row(p1, p2))
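
Rows and columns can also be nested. As a minimal sketch, the following places two plots side by side with a third plot underneath. It uses scatter with a marker argument and width/height, and the show call is left commented out so the snippet also runs outside a notebook.

```python
from bokeh.layouts import column, row
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]
y1 = [9, 7, 2, 4, 5]
y2 = [1, 4, 2, 2, 3]

p1 = figure(width=250, height=250)
p1.scatter(x, y1, size=20, fill_color="red")

p2 = figure(width=250, height=250)
p2.scatter(x, y2, size=20, marker="square", color="green")

p3 = figure(width=250, height=250)
p3.line(x, y2, line_width=2)

# Top: p1 and p2 next to each other. Bottom: p3 on its own.
layout = column(row(p1, p2), p3)

# show(layout)  # uncomment in a notebook after output_notebook()
```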

Grid layout for plots

In [27]:
from bokeh.layouts import gridplot

Use the gridplot() function to arrange Bokeh plots in a grid. This function also merges all plot tools into a single toolbar. Each plot in the grid then has the same active tool.

You can leave grid cells blank by passing None to them instead of a plot object.

In [28]:
x2 = [7, 8, 1, 5, 9]
y3 = [4, 6, 7, 1, 5]
In [29]:
p1 = figure()
p1.circle(x, y1, fill_color = "red", size = 20)

p2 = figure()
p2.square(x, y2, size = 20, color = "green")

p3 = figure()
p3.line(x, y3)

p4 = figure()
p4.circle_cross(x2, y1, size = 20, color = "orange")

grid = gridplot([[p1, p2], [p3, p4]], plot_width = 250, plot_height = 250)
show(grid)
In [30]:
p1 = figure()
p1.circle(x, y1, fill_color = "red", size = 20)

p2 = figure()
p2.square(x, y2, size = 20, color = "green")

p3 = figure()
p3.line(x, y3)

p4 = figure()
p4.circle_cross(x2, y1, size = 20, color = "orange")

grid = gridplot([[p1, p2], [None, p4]], plot_width = 250, plot_height = 250)
show(grid)

Plot Bokeh sample data

In [31]:
from bokeh.sampledata.iris import flowers
from bokeh.transform import factor_cmap, factor_mark
In [32]:
SPECIES = ['setosa', 'versicolor', 'virginica']
MARKERS = ['hex', 'circle_x', 'triangle']

p = figure(title = "Iris Morphology")
p.xaxis.axis_label = 'Petal Length'
p.yaxis.axis_label = 'Sepal Width'

p.scatter("petal_length", "sepal_width", source = flowers, legend_field = "species", fill_alpha = 0.4, size = 12,
          marker = factor_mark('species', MARKERS, SPECIES),
          color = factor_cmap('species', 'Category10_3', SPECIES))

show(p)

Creating a ColumnDataSource from a Pandas DataFrame

In [33]:
source = ColumnDataSource(flowers)

p = figure(plot_width = 600, plot_height = 600)
p.scatter("petal_length", "sepal_width", source = source, legend_field = "species", fill_alpha = 0.4, size = 12,
          marker = factor_mark('species', MARKERS, SPECIES),
          color = factor_cmap('species', 'Category10_3', SPECIES))

show(p)

Filtering data

In [34]:
from bokeh.models import ColumnDataSource, CDSView, IndexFilter, GroupFilter
from bokeh.layouts import gridplot

IndexFilter

The IndexFilter is the simplest filter type. It has an indices property, which is a list of integers that are the indices of the data you want to include in your plot.

In [35]:
source = ColumnDataSource(data=dict(x = [1, 2, 3, 4, 5], y = [1, 2, 3, 4, 5]))
view = CDSView(source = source, filters = [IndexFilter([0, 2, 4])])

tools = ["box_select", "hover", "reset"]
p = figure(height = 300, width = 300, tools = tools)
p.square(x = "x", y = "y", size = 10, hover_color = "red", source = source)

p_filtered = figure(height = 300, width = 300, tools = tools)
p_filtered.square(x = "x", y = "y", size = 10, hover_color = "red", source = source, view = view)

show(gridplot([[p, p_filtered]]))
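
To make it concrete which rows the filtered plot draws, the same selection can be written in plain Python. This is only an illustration of the idea, not how Bokeh implements the filter.

```python
# Same columns and indices as in the IndexFilter cell above.
data = {"x": [1, 2, 3, 4, 5], "y": [1, 2, 3, 4, 5]}
indices = [0, 2, 4]

# Keep only the rows at the listed positions, column by column.
filtered = {name: [column[i] for i in indices] for name, column in data.items()}

print(filtered)  # {'x': [1, 3, 5], 'y': [1, 3, 5]}
```

The underlying source is left unchanged; only the view decides which rows a glyph renders.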

GroupFilter

The GroupFilter is a filter for categorical data. With this filter, you can select rows from a dataset that are members of a specific category.

The GroupFilter has two properties:

column_name: the name of the column in the ColumnDataSource to apply the filter to

group: the name of the category to select for
In [36]:
source = ColumnDataSource(flowers)
view1 = CDSView(source=source, filters=[GroupFilter(column_name='species', group='versicolor')])

plot_size_and_tools = {'height': 300, 'width': 300,
                        'tools':['box_select', 'reset', 'help']}

p1 = figure(title="Full data set", **plot_size_and_tools)
p1.circle(x='petal_length', y='petal_width', source=source, color='black')

p2 = figure(title="Versicolor only", x_range=p1.x_range, y_range=p1.y_range, **plot_size_and_tools)
p2.circle(x='petal_length', y='petal_width', source=source, view=view1, color='red')

show(gridplot([[p1, p2]]))
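
Conceptually, a group filter selects the row indices whose value in column_name equals group. The sketch below shows that selection in plain Python. The rows are toy values made up for illustration, not taken from the iris data set, and group_indices is a hypothetical helper rather than a Bokeh function.

```python
# Toy rows for illustration only; these are not values from the iris data set.
data = {
    "species": ["setosa", "versicolor", "virginica", "versicolor", "setosa"],
    "petal_length": [1.0, 4.0, 6.0, 4.5, 1.5],
}


def group_indices(columns, column_name, group):
    """Return the positions of the rows whose column_name value equals group."""
    return [i for i, value in enumerate(columns[column_name]) if value == group]


selected = group_indices(data, "species", "versicolor")
print(selected)  # [1, 3]

# The selected positions can then be used to pick values from any other column.
petal_lengths = [data["petal_length"][i] for i in selected]
print(petal_lengths)  # [4.0, 4.5]
```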

Plot Network Graph

The easiest way to plot network graphs with Bokeh is to use the from_networkx function. This function accepts any NetworkX graph and returns a Bokeh GraphRenderer that can be added to a plot. The GraphRenderer has node_renderer and edge_renderer properties that contain the Bokeh renderers that draw the nodes and edges, respectively.

The example below shows a Bokeh plot of nx.desargues_graph(), setting some of the node and edge properties.

In [37]:
import networkx as nx
from bokeh.models import Range1d, Plot
from bokeh.plotting import from_networkx
In [38]:
G = nx.desargues_graph()    # always 20 nodes

# We could use figure here but don't want all the axes and titles
plot = Plot(x_range=Range1d(-2, 2), y_range=Range1d(-2, 2))

# Create a Bokeh graph from the NetworkX input using nx.spring_layout
graph = from_networkx(G, nx.spring_layout, scale=1.8, center=(0,0))
plot.renderers.append(graph)

# Set some of the default node glyph (Circle) properties
graph.node_renderer.glyph.update(size=20, fill_color="orange")

# Set some edge properties too
graph.edge_renderer.glyph.line_dash = [2,2]

show(plot)