Seaborn Introduction

In [1]:
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt
In [2]:
def listAttr(obj, search = None):
    # Public (non-underscore) attribute names of obj
    names = [item for item in dir(obj) if not item.startswith("_")]
    if not search:
        return names
    # Case-insensitive substring match: lowercase both sides
    search = search.lower()
    return [item for item in names if search in item.lower()]
In [4]:
listAttr(sns, "load_dataset")
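
The helper above filters the names returned by dir(). As a minimal sketch of the same idea that runs without seaborn installed, the version below is applied to the standard-library math module; the name list_attr and the choice of math are assumptions made only for this illustration.

```python
import math


def list_attr(obj, search=None):
    """Return public attribute names of obj, optionally filtered by a
    case-insensitive substring (hypothetical helper for illustration)."""
    names = [name for name in dir(obj) if not name.startswith("_")]
    if not search:
        return names
    needle = search.lower()
    return [name for name in names if needle in name.lower()]


# An upper-case search string still matches lower-case attribute names.
sqrt_like = list_attr(math, "SQRT")
print(sqrt_like)
```

Lowercasing both the search string and each attribute name is what makes the match case-insensitive.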

Load an example dataset from the online repository

Load dataset

In [6]:
tips = sns.load_dataset('tips')
tips
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
...         ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

244 rows × 7 columns
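
Because load_dataset reads from the online repository, it needs network access the first time it is called. For offline experimentation, one option is to rebuild a few rows by hand. The sketch below copies only the first five rows printed above; tips_head is a name made up for this example, and the text columns are left as plain strings rather than the categorical type the real dataset uses.

```python
import pandas as pd

# First five rows of the tips table, copied from the output above.
tips_head = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
        "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
        "sex": ["Female", "Male", "Male", "Male", "Female"],
        "smoker": ["No"] * 5,
        "day": ["Sun"] * 5,
        "time": ["Dinner"] * 5,
        "size": [2, 3, 3, 2, 4],
    }
)
print(tips_head.shape)
```

Five rows are enough to try out column names and call signatures, but far too few to reproduce the plots in this notebook.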

In [8]:
ax = sns.scatterplot(x = 'total_bill', y = 'tip', data = tips)
In [9]:
ax = sns.barplot(x="total_bill", y="tip", data=tips)
In [11]:
ax = sns.scatterplot(x="total_bill", y="tip", hue="day", data=tips)
In [12]:
ax = sns.scatterplot(x="total_bill", y="tip", hue="day", style="time", data=tips)

Use lmplot() to enhance a scatterplot with a linear regression model (and its uncertainty):

In [13]:
sns.lmplot(x="total_bill", y="tip", data=tips)
<seaborn.axisgrid.FacetGrid at 0x18fc29ba5e0>
In [14]:
sns.lmplot(x = "total_bill", y = "tip", data = tips, hue = "time")
<seaborn.axisgrid.FacetGrid at 0x18fc2a75b20>
In [15]:
sns.lmplot(x = "total_bill", y = "tip", data = tips, hue="day")
<seaborn.axisgrid.FacetGrid at 0x18fc2a96820>
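
Each lmplot() call above draws a fitted straight line on top of the scatter. As a rough sketch of what a least-squares line fit computes, the example below uses numpy.polyfit on the first five (total_bill, tip) pairs from the table shown earlier. It is an illustration of the idea only, not a description of seaborn's internals, and five points are far too few for a meaningful model.

```python
import numpy as np

# First five (total_bill, tip) pairs from the tips table shown earlier.
total_bill = np.array([16.99, 10.34, 21.01, 23.68, 24.59])
tip = np.array([1.01, 1.66, 3.50, 3.31, 3.61])

# A degree-1 polynomial fit is an ordinary least-squares straight line.
slope, intercept = np.polyfit(total_bill, tip, deg=1)

# Residuals: the vertical distance from each point to the fitted line.
predicted = slope * total_bill + intercept
residuals = tip - predicted

print(f"tip ~ {slope:.3f} * total_bill + {intercept:.3f}")
```

With an intercept in the model, the residuals of a least-squares fit sum to zero, which makes a quick sanity check.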

Specialized categorical plots

In [16]:
sns.catplot(x="day", y="total_bill", hue="smoker", kind="swarm", data=tips);
In [17]:
tips.query("size != 3")
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
5         25.29  4.71    Male     No   Sun  Dinner     4
6          8.77  2.00    Male     No   Sun  Dinner     2
...         ...   ...     ...    ...   ...     ...   ...
237       32.83  1.17    Male    Yes   Sat  Dinner     2
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

206 rows × 7 columns
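
The query() call above keeps every row whose size is not 3. A small sketch of the equivalent boolean-mask form follows, using the total_bill and size values of the first five rows of the table; toy is a name made up for this example.

```python
import pandas as pd

# total_bill and size for the first five rows of the tips table.
toy = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
        "size": [2, 3, 3, 2, 4],
    }
)

# String expression evaluated against the column names.
via_query = toy.query("size != 3")

# Equivalent boolean mask. Bracket access is needed here because
# toy.size is the DataFrame attribute holding the element count.
via_mask = toy[toy["size"] != 3]

print(via_query.equals(via_mask))
```

Both forms keep the original index labels, which is why the filtered output above skips from row 0 to row 3.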

In [18]:
sns.catplot(x="size", y="total_bill", kind="swarm",
            data=tips.query("size != 3"));
C:\ProgramData\Miniconda3\lib\site-packages\seaborn\ UserWarning: 9.6% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
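
The warning above suggests stripplot as one way out when a swarm plot cannot place every point. A minimal sketch of that alternative passes kind="strip" to catplot. It assumes a non-interactive matplotlib backend and uses only the ten rows printed by tips.query("size != 3") above; subset is a name made up for this example.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so no display is required

import pandas as pd
import seaborn as sns

# total_bill and size for the ten rows shown in the filtered output above.
subset = pd.DataFrame(
    {
        "total_bill": [16.99, 23.68, 24.59, 25.29, 8.77,
                       32.83, 27.18, 22.67, 17.82, 18.78],
        "size": [2, 2, 4, 4, 2, 2, 2, 2, 2, 2],
    }
)

# A strip plot jitters points at random instead of packing them,
# so overlapping points are drawn rather than dropped.
g = sns.catplot(x="size", y="total_bill", kind="strip", data=subset)
```

The trade-off is that a strip plot lets points overlap, whereas a swarm plot keeps them apart and so shows the shape of each group more clearly.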
In [19]:
sns.catplot(x="day", y="total_bill", hue="smoker", kind="violin", data=tips);
In [20]:
sns.catplot(x="day", y="total_bill", hue="smoker",
            kind="bar", data=tips);
In [21]:
g = sns.catplot(x = "total_bill", y = "day",  hue="time", kind = 'box', legend=False, data = tips)
g.add_legend(title = "Meal")
<seaborn.axisgrid.FacetGrid at 0x18fc1b7fe80>
In [22]:
g = sns.catplot(x = "total_bill", y = "day",  hue="time", kind = 'box', legend=False, data = tips)
g.add_legend(title = "Meal")
g.fig.set_size_inches(10.5, 5.5)
g.set_axis_labels("Total bill ($)", "")
<seaborn.axisgrid.FacetGrid at 0x18fc1d1d1c0>
In [23]:
g = sns.catplot(x="total_bill", y="day", hue="time",
                height=3.5, aspect=1.5,
                kind="boxen", legend=False, data=tips);
In [24]:
g = sns.catplot(x="total_bill", y="day", hue="time",
                height=3.5, aspect=1.5,
                kind="box", legend=False, data=tips);
g.set_axis_labels("Total bill ($)", "")
g.set(xlim=(0, 60), yticklabels=["Thursday", "Friday", "Saturday", "Sunday"])
g.fig.set_size_inches(6.5, 3.5)
g.ax.set_xticks([5, 15, 25, 35, 45, 55], minor=True);
plt.setp(g.ax.get_yticklabels(), rotation=30);


In [25]:
sns.distplot(tips['total_bill'])
C:\Users\nutan\AppData\Local\Temp\ipykernel_9328\ UserWarning: 

`distplot` is a deprecated function and will be removed in seaborn v0.14.0.

Please adapt your code to use either `displot` (a figure-level function with
similar flexibility) or `histplot` (an axes-level function for histograms).

For a guide to updating your code to use the new functions, please see

<AxesSubplot:xlabel='total_bill', ylabel='Density'>

To build a histogram, first "bin" (or "bucket") the range of values: divide the entire range into a series of intervals, then count how many values fall into each interval.
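
The binning step can be sketched directly with numpy.histogram. The example uses the ten total_bill values visible in the table printed earlier and an arbitrary choice of four bins, so the counts describe this small sample only, not the full dataset.

```python
import numpy as np

# The ten total_bill values visible in the table output earlier.
values = np.array([16.99, 10.34, 21.01, 23.68, 24.59,
                   29.03, 27.18, 22.67, 17.82, 18.78])

# bins=4 splits the range from min to max into four equal-width intervals.
# edges has one more entry than counts; the last interval also includes
# its right edge.
counts, edges = np.histogram(values, bins=4)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.2f} to {right:.2f}: {count}")
```

Every value lands in exactly one interval, so the counts always add up to the number of values.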

In [26]:
sns.distplot(tips['total_bill'], bins=20, kde=False) 
In [27]:
# kde (kernel density estimation) draws a smoothed estimate of the distribution's shape;
# kde=False turns it off, so only the histogram bars are drawn
sns.distplot(tips['total_bill'], kde=False)
In [28]:
['Dinner', 'Lunch']
Categories (2, object): ['Lunch', 'Dinner']

The plot below shows the relationship between five variables in the tips dataset: three numeric and two categorical. Two numeric variables (total_bill and tip) determine the position of each point on the axes, and the third (size) determines the size of each point. One categorical variable (time) splits the dataset across two axes (facets), and the other (smoker) determines the color and shape of each point.

In [29]:
sns.relplot(x="total_bill", y="tip", col="time",
            hue="smoker", style="smoker", size="size", data=tips)
<seaborn.axisgrid.FacetGrid at 0x18fc3c9e0d0>
In [30]:
sns.relplot(x="total_bill", y="tip", col="time",
            hue="smoker", style="smoker", size="size", kind="line", data=tips)
<seaborn.axisgrid.FacetGrid at 0x18fc2ba33d0>
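
The faceting behaviour of relplot can be checked on a tiny frame without downloading the dataset. In the sketch below, toy and every value in it are hypothetical and are not rows from the real tips data; it also assumes a non-interactive matplotlib backend. The col variable has two levels, so the returned FacetGrid should hold one row of two axes.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so no display is required

import pandas as pd
import seaborn as sns

# Hypothetical toy values for illustration only, NOT real tips rows.
toy = pd.DataFrame(
    {
        "total_bill": [12.0, 18.5, 25.0, 30.0, 15.5, 22.0],
        "tip": [2.0, 3.0, 4.0, 5.0, 2.5, 3.5],
        "time": ["Lunch", "Lunch", "Lunch", "Dinner", "Dinner", "Dinner"],
        "smoker": ["No", "Yes", "No", "Yes", "No", "Yes"],
        "size": [2, 3, 4, 2, 3, 4],
    }
)

# x and y set position, col picks the facet, hue and style set color and
# marker shape, and size scales each marker.
g = sns.relplot(x="total_bill", y="tip", col="time",
                hue="smoker", style="smoker", size="size", data=toy)

# One row of facets, one column per level of "time".
print(g.axes.shape)
```

Mapping smoker to both hue and style is redundant by design: the groups stay distinguishable in grayscale prints or for readers who have difficulty telling the colors apart.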

