Commit 6b2df1e4 (verified), authored by Renato Alves

Reorganize exercises

parent 49ba8e36
......@@ -557,6 +557,65 @@ Given `plot(X, Y)` it then takes a pair of coordinates from both variables as wi
> {: .solution }
{: .challenge }
> ## Formatting Data Points
>
> 1. Fill in the blanks in the code below
> so that the colour of each circle represents the continent
> that country belongs to.
>
> Note that the approach below uses
> a list comprehension to define a colour to represent each continent,
> and `*` to unpack the tuple returned
> by `get_year_data` (the function we defined in the previous exercise).
> You may wish to check back to the earlier sections on
> [comprehensions](../01-syntax/index.html#comprehensions) and
> [expanding arguments outside functions](../01-syntax/index.html#argument-expansion-outside-functions)
> to remind yourself of what is happening here.
>
> ~~~
> from matplotlib import cm
>
> continents = list(gapminder['continent'].unique())
> continent_colors = [cm.Set2.colors[continents.index(c)] for ___ in gapminder[___]]
>
> fig, ax = plt.subplots()
> ax.scatter(*get_year_data(gapminder, 2002), ____, alpha=0.75)
> ax.set_title('2002')
> ax.set_xscale('log')
> ax.set_xlabel('GDP per capita (USD)')
> ax.set_ylabel('Life expectancy (years)')
> ~~~
> {: .language-python }
>
> 2. What is the `alpha` argument doing in the `ax.scatter` call above?
> Try adjusting the value to see what effect this has.
>
> > ## Solution
> >
> > ~~~
> > # 1
> > continents = list(gapminder['continent'].unique())
> > continent_colors = [cm.Set2.colors[continents.index(cont)] for cont in gapminder['continent']]
> >
> > fig, ax = plt.subplots()
> > ax.scatter(*get_year_data(gapminder, 2002), c=continent_colors, alpha=0.75)
> > ax.set_title('2002')
> > ax.set_xscale('log')
> > ax.set_xlabel('GDP per capita (USD)')
> > ax.set_ylabel('Life expectancy (years)')
> > ~~~
> > {: .language-python }
> >
> > 2: `alpha` controls the transparency of the plotted data points.
> > A value of 1 makes the points opaque, 0 makes them invisible
> > (fully transparent).
> > For a plot like this,
> > with many overlapping points of varying size,
> > some transparency is helpful to get a complete understanding of the
> > distribution of points.
> {: .solution }
{: .challenge }
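
The two techniques the exercise relies on, a list comprehension that maps each row to a colour and `*` to spread a returned tuple over positional arguments, can be tried without any plotting library. The sketch below is only an illustration: the data, the `palette` list, and the helpers `get_demo_data` and `summarize` are hypothetical stand-ins, not part of the lesson, chosen to mirror the shapes of `gapminder['continent']`, `cm.Set2.colors`, `get_year_data`, and `ax.scatter`.

```python
# Hypothetical stand-in for gapminder['continent']: one label per country.
country_continents = ['Europe', 'Asia', 'Asia', 'Africa', 'Europe']

# Unique labels in order of first appearance.
continents = list(dict.fromkeys(country_continents))

# A small made-up palette; the lesson uses cm.Set2.colors instead.
palette = ['green', 'orange', 'purple']

# One colour per country, looked up via the position of its continent.
continent_colors = [palette[continents.index(c)] for c in country_continents]


def get_demo_data():
    """Return an (x, y, sizes) tuple, shaped like the result of get_year_data."""
    return ([1000, 5000, 20000], [55.0, 68.5, 79.2], [3.1, 40.0, 8.7])


def summarize(x, y, s, label='demo'):
    """Take three positional arguments, in the same order as ax.scatter(x, y, s)."""
    return f'{label}: {len(x)} points, largest marker {max(s)}'


# `*` spreads the 3-tuple over the first three positional parameters,
# so this call is equivalent to summarize(x, y, s, label='2002').
summary = summarize(*get_demo_data(), label='2002')
print(continent_colors)
print(summary)
```

Because the tuple is unpacked positionally, the order of the returned values matters: the third element lands in the third parameter, which for `ax.scatter` is the marker size.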
### Histograms
Another very common type of plot is the `histogram` which can be produced using
......@@ -817,7 +876,7 @@ Which looks great, but something unexpected happened in the stacked subplot.
> {: .solution}
{: .challenge}
## Others (link to docs)
## Others plots
# Axis
## Log axis
......@@ -1033,6 +1092,43 @@ it is assumed that string values like these will refer to columns inside the
> `.plot(kind='box')` or `.plot(kind='hist')`.
{: .callout }
> ## Automate Away the Repetition
>
> Whenever we see recurring patterns in our code,
> it's a sign that something could be encapsulated into its own function.
> We can then call this function every time we want to perform the same
> operation.
>
> Rearrange the lines of code below to define a function that returns
> a filtered subset of the given dataframe,
> containing only the data for the chosen year.
> (As well as re-ordering the lines,
> you will need to adjust the level of indentation of some lines.)
>
> ~~~
> return (df[f'gdpPercap_{year}'], df[f'lifeExp_{year}'], population)
> except ZeroDivisionError:
> def get_year_data(df, year, pop_scale_factor=1e6):
> population = df[f'pop_{year}']/pop_scale_factor
> raise ZeroDivisionError("Can't divide by zero. For unscaled population data, please specify pop_scale_factor=1")
> try:
> ~~~
> {: .language-python }
>
> > ## Solution
> >
> > ~~~
> > def get_year_data(df, year, pop_scale_factor=1e6):
> > try:
> > population = df[f'pop_{year}']/pop_scale_factor
> > except ZeroDivisionError:
> > raise ZeroDivisionError("Can't divide by zero. For unscaled population data, please specify pop_scale_factor=1")
> > return (df[f'gdpPercap_{year}'], df[f'lifeExp_{year}'], population)
> > ~~~
> > {: .language-python }
> {: .solution }
{: .challenge }
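
The `try`/`except` pattern in the solution can be exercised on its own with plain Python numbers. This is a minimal sketch under that assumption: `scale_population` is a hypothetical helper, not a function from the lesson. Note that plain Python division raises `ZeroDivisionError`, whereas element-wise division of a pandas column by zero typically yields `inf` or `NaN` instead of raising, so the behaviour with a real dataframe may differ.

```python
def scale_population(population, pop_scale_factor=1e6):
    """Divide a plain number by a scale factor, re-raising with a clearer message."""
    try:
        return population / pop_scale_factor
    except ZeroDivisionError:
        # Same message as in the exercise, so the caller learns how to opt out of scaling.
        raise ZeroDivisionError(
            "Can't divide by zero. For unscaled population data, "
            "please specify pop_scale_factor=1"
        )


print(scale_population(2_500_000))  # population expressed in millions

try:
    scale_population(2_500_000, pop_scale_factor=0)
except ZeroDivisionError as err:
    print(f'Caught: {err}')
```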
# Where to go from here
## The Matplotlib Gallery
......@@ -1106,101 +1202,7 @@ perhaps while showing your latest results to your colleagues.
# Exercises
> ## Automate Away the Repetition
>
> Whenever we see recurring patterns in our code,
> it's a sign that something could be encapsulated into its own function.
> We can then call this function every time we want to perform the same
> operation.
>
> Rearrange the lines of code below to define a function that returns
> a filtered subset of the given dataframe,
> containing only the data for the chosen year.
> (As well as re-ordering the lines,
> you will need to adjust the level of indentation of some lines.)
>
> ~~~
> return (df[f'gdpPercap_{year}'], df[f'lifeExp_{year}'], population)
> except ZeroDivisionError:
> def get_year_data(df, year, pop_scale_factor=1e6):
> population = df[f'pop_{year}']/pop_scale_factor
> raise ZeroDivisionError("Can't divide by zero. For unscaled population data, please specify pop_scale_factor=1")
> try:
> ~~~
> {: .language-python }
>
> > ## Solution
> >
> > ~~~
> > def get_year_data(df, year, pop_scale_factor=1e6):
> > try:
> > population = df[f'pop_{year}']/pop_scale_factor
> > except ZeroDivisionError:
> > raise ZeroDivisionError("Can't divide by zero. For unscaled population data, please specify pop_scale_factor=1")
> > return (df[f'gdpPercap_{year}'], df[f'lifeExp_{year}'], population)
> > ~~~
> > {: .language-python }
> {: .solution }
{: .challenge }
> ## Formatting Data Points
>
> 1. Fill in the blanks in the code below
> so that the colour of each circle represents the continent
> that country belongs to.
>
> Note that the approach below uses
> a list comprehension to define a colour to represent each continent,
> and `*` to unpack the tuple returned
> by `get_year_data` (the function we defined in the previous exercise).
> You may wish to check back to the earlier sections on
> [comprehensions](../01-syntax/index.html#comprehensions) and
> [expanding arguments outside functions](../01-syntax/index.html#argument-expansion-outside-functions)
> to remind yourself of what is happening here.
>
> ~~~
> from matplotlib import cm
>
> continents = list(gapminder['continent'].unique())
> continent_colors = [cm.Set2.colors[continents.index(c)] for ___ in gapminder[___]]
>
> fig, ax = plt.subplots()
> ax.scatter(*get_year_data(gapminder, 2002), ____, alpha=0.75)
> ax.set_title('2002')
> ax.set_xscale('log')
> ax.set_xlabel('GDP per capita (USD)')
> ax.set_ylabel('Life expectancy (years)')
> ~~~
> {: .language-python }
>
> 2. What is the `alpha` argument doing in the `ax.scatter` call above?
> Try adjusting the value to see what effect this has.
>
> > ## Solution
> >
> > ~~~
> > # 1
> > continents = list(gapminder['continent'].unique())
> > continent_colors = [cm.Set2.colors[continents.index(cont)] for cont in gapminder['continent']]
> >
> > fig, ax = plt.subplots()
> > ax.scatter(*get_year_data(gapminder, 2002), c=continent_colors, alpha=0.75)
> > ax.set_title('2002')
> > ax.set_xscale('log')
> > ax.set_xlabel('GDP per capita (USD)')
> > ax.set_ylabel('Life expectancy (years)')
> > ~~~
> > {: .language-python }
> >
> > 2: `alpha` controls the transparency of the plotted data points.
> > A value of 1 makes the points opaque, 0 makes them invisible
> > (fully transparent).
> > For a plot like this,
> > with many overlapping points of varying size,
> > some transparency is helpful to get a complete understanding of the
> > distribution of points.
> {: .solution }
{: .challenge }
> ## Adding to Plots
>
......