Commit e7faa443 authored by Renato Alves

Add barplot and stacked exercise

parent a1b9139b
......@@ -657,7 +657,136 @@ plt.title(f"Normal distribution - mean={mean}, stdev={stdev}, samples={samples},
## Bar
- the next exercise assumes an example of subplots arranged in a single row or column
Earlier we saw histograms, which are a type of bar plot.
We can also produce bar plots in either vertical (`plt.bar()`) or horizontal (`plt.barh()`) orientation.
In both cases, bars can be positioned next to each other or stacked.
In order for matplotlib to grant us flexibility when drawing the bars,
we have to handle the positioning ourselves.
This is typically achieved by dividing the maximum bar width by a fixed factor,
usually the number of groups being plotted.
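As a minimal sketch of that arithmetic, the snippet below computes a bar width and the x-coordinates for each group without drawing anything. The helper name `grouped_positions` and the `total_width` parameter are hypothetical, introduced only for this illustration. It assumes categories are spaced one unit apart and that the coordinates are bar centres, which is matplotlib's default alignment.

```python
def grouped_positions(n_categories, n_groups, total_width=0.9):
    """Hypothetical helper: bar width, per-group bar centres and tick positions.

    total_width is the share of each one-unit category slot that the bars
    may occupy; keeping it below 1 leaves a gap between categories.
    """
    width = total_width / n_groups
    # positions[g][c] is the centre of the bar for group g in category c
    positions = [
        [c + g * width for c in range(n_categories)]
        for g in range(n_groups)
    ]
    # The middle of each cluster of bars, useful for placing tick labels
    ticks = [c + (n_groups - 1) * width / 2 for c in range(n_categories)]
    return width, positions, ticks


width, positions, ticks = grouped_positions(n_categories=4, n_groups=3)
print("bar width:", round(width, 2))
for g, row in enumerate(positions):
    print("group", g, [round(x, 2) for x in row])
print("ticks:", [round(t, 2) for t in ticks])
```

With three groups the tick offset `(n_groups - 1) * width / 2` reduces to `width`, which is the same shift applied to the tick positions in the example that follows.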
In the following example we will produce 3 plots as subplots:
a vertical grouped bar plot, a horizontal grouped bar plot
and a vertical stacked bar plot.
~~~
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter
sequences = (
"GAAGTACAAAATGTCATTAATGCTATGCAGAAAATCTTAGAGTGTCCCATCTGTCTGGAGTTGATCAAGG",
"TGTAACTGAAAATCTAATTATAGGAGCATTTGTTACTGAGCCACAGATAATACAAGAGCGTCCCCTCACA",
"CAGGAAAGTATCTCGTTACTGGAAGTTAGCACTCTAGGGAAGGCAAAAACAGAACCAAATAAATGTGTGA",
)
xlabels = ('A', 'T', 'C', 'G')
width = 0.3
# np.arange() is like Python's range() but allows floats and returns a numpy array
position = np.arange(len(xlabels))
counts = []
# Here we count the number of occurrences of each nucleotide
for seq in sequences:
    counter = Counter(seq)
    # for convenience we convert the dictionary-like Counter() object into a list
    # which is the format matplotlib expects (could have also been a numpy array)
    counts.append([counter[x] for x in xlabels])
fig, axes = plt.subplots(1, 3, figsize=(10, 5))
ax1, ax2, ax3 = axes
# For the stacked bar plot we need to keep track of the height of the previous bars,
# starting at zero
previous_height = np.zeros(len(xlabels))
for i, count in enumerate(counts):
    # Vertical bar plot
    ax1.bar(position + i * width, count, width)
    # Horizontal bar plot
    ax2.barh(position + i * width, count, width)
    # Stacked vertical bar plot - notice the bottom= argument
    ax3.bar(position + i * width, count, width, bottom=previous_height)
    previous_height += count
# we can customize the X/Y labels to describe our groups of bars
# we also add the width to the position so labels are aligned with the central bar
ax1.set_title("Vertical barplot")
ax1.set_xticks(position + width)
ax1.set_xticklabels(xlabels)
# And in the horizontal plot we set the labels on the Y axis instead
ax2.set_title("Horizontal barplot")
ax2.set_yticks(position + width)
ax2.set_yticklabels(xlabels)
ax3.set_title("Vertical stacked barplot")
ax3.set_xticks(position + width)
ax3.set_xticklabels(xlabels)
~~~
{: .language-python }
Take a moment to read the code and the comments. There's a lot happening here.
Notice how we use `position + i * width` to position the bar manually.
You may have also noticed that we used `Axes` functions instead of `plt.*`.
When working with subplots, it's more convenient to use `Axes` directly.
There is a function `plt.gca()`, which stands for *get current axes*,
that returns the currently active `Axes` so it can be inspected or modified,
but relying on it tends to make code harder to read.
Finally, if we execute the above code we get:
![subplot barplots vertical and horizontal](../fig/barplot-subplot.png)
The result looks great, but something unexpected happened in the stacked subplot.
> ## Fix the stairs
>
> Can you fix the issue with the *Vertical stacked barplot* subplot in the previous code?
> It should be a stacked barplot but looks more like a staircase.
>
> **Hint**: The bars are being stacked but something is pushing them out of position.
>
> Once done with the exercise, explore the effect of modifying `width`.
> What happens when `width` is `0.5`, or larger than `1`?
>
> > ## Solution
> >
> > The problem with the stacked barplot is that we are still shifting each bar
> > by `i * width`, as in the grouped barplot variants. We can fix it by changing the code to read:
> > ~~~
> > (...)
> > # We remove the "i *" part of the code in this line
> > ax3.bar(position + width, count, width, bottom=previous_height)
> > (...)
> > ~~~
> > {: .language-python}
> >
> > Alternatively, we could remove the `+ width` offset entirely,
> > but doing that would also require us to modify the `ax3.set_xticks()` line.
> >
> > If we don't want the plot to look *skinny* we can also increase the width
> > value.
> > ~~~
> > stack_width = 0.9
> > (...)
> > # We remove the "i *" part of the code in this line and replace width by stack_width
> > ax3.bar(position + stack_width, count, stack_width, bottom=previous_height)
> > (...)
> > ax3.set_xticks(position + stack_width)
> > ~~~
> > {: .language-python}
> > The result of this last version of the code is:
> >
> > ![subplot barplots vertical and horizontal stack plot fixed](../fig/barplot-subplot-fixed-stacked.png)
> >
> > As for when `width = 0.5` or larger, the plot gets distorted
> > because bars from different groups start overlapping: three bars of width `0.5`
> > need `1.5` units, but neighbouring groups are only `1` unit apart.
> {: .solution}
{: .challenge}
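The stacking logic from the exercise above can also be checked without drawing anything. The sketch below computes, for each sequence, the bar heights and the `bottom` values that would be passed to `ax.bar()`. The short sequences are made up for this illustration and are not the ones used in the lesson.

```python
from collections import Counter

# Hypothetical short sequences, made up for this sketch
sequences = ("GATTACA", "CCGGTTAA", "ATATGC")
labels = ("A", "T", "C", "G")

# One list of nucleotide counts per sequence, ordered like `labels`
counts = [[Counter(seq)[x] for x in labels] for seq in sequences]

bottoms = []                  # where each layer of bars starts
running = [0] * len(labels)   # total height stacked so far
for count in counts:
    bottoms.append(list(running))
    running = [r + c for r, c in zip(running, count)]

for count, bottom in zip(counts, bottoms):
    print("heights", count, "bottom", bottom)
print("totals ", running)
```

Each layer starts where the previous one ended, so all layers for a given nucleotide have to share the same x position. That is why the stacked subplot must not apply the `i * width` shift.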
> ## Rearranging Subplots
>
......