Commit e4ee747e authored by Toby Hodges, committed by Renato Alves

Fixes #15 `join`->`how` in pd.merge exercise

parent b251e639
@@ -1523,12 +1523,12 @@ print(combined.head())
> ## 2.11. Joined in Hole-y Partnership
>
> 1. Compare the dimensions of the new `combined` dataframe with those of the original `covid_cases` dataframe. Do they match? If not, investigate `combined`, and the two dataframes from which it was created, to figure out why.
-> 2. The `merge` method has a parameter, `join`, which is set to `"inner"` by default. Try out the other possible values (`"outer"`, `"left"`, and `"right"`) and make sure you understand what's going on in each case.
+> 2. The `merge` method has a parameter, `how`, which is set to `"inner"` by default. Try out the other possible values (`"outer"`, `"left"`, and `"right"`) and make sure you understand what's going on in each case.
>
> > ## Solution
> >
> > 1. Using `combined.shape`, we can see that the new dataframe has far fewer rows than the original `covid_cases`. A closer inspection of `covid_lockdowns` shows us that there are a lot of countries and territories missing from that table, compared with the more comprehensive `covid_cases` list. This must be because these other countries haven't put lockdowns in place (or perhaps that we just don't have the data on those lockdowns). The data for these countries is not included in the `combined` dataframe.
-> > 2. Where `join="inner"` only keeps rows with index values that appear in both merged dataframes, merging with `join="outer"` keeps all values in either index, inserting blank values (e.g. `NaN`) where no data can be included from one of the "parent" dataframes. `join="left"` and `join="right"` keep rows for all index values in either the first or the second dataframe in the merge call, respectively. As with `"outer"`, blank values are inserted where data is missing in the other frame being merged.
+> > 2. Where `how="inner"` only keeps rows with index values that appear in both merged dataframes, merging with `how="outer"` keeps all values in either index, inserting blank values (e.g. `NaN`) where no data can be included from one of the "parent" dataframes. `how="left"` and `how="right"` keep rows for all index values in either the first or the second dataframe in the merge call, respectively. As with `"outer"`, blank values are inserted where data is missing in the other frame being merged.
> {: .solution }
{: .challenge }
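
The effect of the `how` parameter described in the solution can be sketched on a small example. This is only an illustration under stated assumptions: the `cases` and `lockdowns` dataframes, their columns, and their values are hypothetical stand-ins, not the lesson's `covid_cases` and `covid_lockdowns` data, and the merge is assumed to be on the index.

```python
import pandas as pd

# Hypothetical toy tables indexed by country (not the lesson's real data).
cases = pd.DataFrame(
    {"cases": [120, 45, 300]},
    index=pd.Index(["Austria", "Belgium", "Chile"], name="country"),
)
lockdowns = pd.DataFrame(
    {"lockdown_start": ["2020-03-16", "2020-03-18", "2020-03-17"]},
    index=pd.Index(["Austria", "Belgium", "Denmark"], name="country"),
)

# Merge on the index with each value of `how` and record the result.
merged = {}
shapes = {}
for how in ["inner", "outer", "left", "right"]:
    merged[how] = pd.merge(
        cases, lockdowns, left_index=True, right_index=True, how=how
    )
    shapes[how] = merged[how].shape
    print(how, shapes[how])
    print(merged[how], end="\n\n")
```

With these toy tables, `"inner"` keeps only the two countries present in both frames, `"outer"` keeps all four, and `"left"` and `"right"` each keep the three countries of the first and second frame respectively, with `NaN` filling the gaps.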