---
title: Working with Data
teaching: 20
exercises: 10
questions:
- "How should I work with numeric data in Python?"
- "What's the recommended way to handle and analyse tabular data?"
- "How can I import tabular data for analysis in Python and export the results?"
objectives:
- "handle and summarise numeric data with NumPy."
- "filter values in their data based on a range of conditions."
- "load tabular data into a Pandas dataframe object."
- "describe what is meant by the data type of an array/series, and the impact this has on how the data is handled."
- "add and remove columns from a dataframe."
- "select, aggregate, and visualise data in a dataframe."
keypoints:
- "Specialised third-party libraries such as NumPy and Pandas provide powerful objects and functions that can help us analyse our data."
- "Pandas dataframe objects allow us to efficiently load and handle large tabular data."
- "Use the `pandas.read_csv` function and the `DataFrame.to_csv` method to read and write tabular data."
---
## plan
- Toby currently scheduled to lead this session
- NumPy
  - arrays
  - masking
  - aside about data types and potential hazards
  - reading data from a file (with note that more will come later on this topic)
  - link to existing image analysis material
- Pandas
  - when an array just isn't enough
  - DataFrames - re-use material from [Software Carpentry][swc-python-gapminder]?
    - ideally with a more relevant example dataset...
  - include an aside about I/O - reading/writing files (pandas, numpy, `open()`, (?) bytes vs strings, (?) encoding)
  - finish with example of `df.plot()` to set the scene for plotting section
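
A rough sketch of the ground the plan above covers, not final lesson material. The sample values, the column names (`sample`, `length_mm`, `width_mm`, `area_mm2`), and the in-memory CSV text are all made up for illustration; a real episode would presumably load an actual dataset from disk.

```python
import io

import numpy as np
import pandas as pd

# --- NumPy: arrays and masking ---
# Hypothetical measurements, invented for this sketch.
temperatures = np.array([12.5, 17.0, 21.3, 9.8, 25.1])

# A boolean mask selects the values that satisfy both conditions.
mask = (temperatures > 10) & (temperatures < 22)
mild = temperatures[mask]

# --- NumPy: data types and a potential hazard ---
# Mixing ints and floats gives a float array...
mixed = np.array([1, 2, 3.5])
# ...and converting to an integer type silently drops the fractional part.
truncated = mixed.astype(int)

# --- Pandas: load tabular data, add and remove columns, filter, summarise ---
# An in-memory stand-in for a CSV file (assumed layout, not a real dataset).
csv_text = "sample,length_mm,width_mm\na,10.0,2.0\nb,12.5,3.0\nc,8.0,1.5\n"
df = pd.read_csv(io.StringIO(csv_text))

df["area_mm2"] = df["length_mm"] * df["width_mm"]  # add a column
df = df.drop(columns=["width_mm"])                 # remove a column
long_samples = df[df["length_mm"] > 9]             # filter rows by condition
mean_length = df["length_mm"].mean()               # aggregate a column

# Writing uses the DataFrame's own to_csv method; here into a text buffer.
out = io.StringIO()
df.to_csv(out, index=False)
```

In a lesson setting, `io.StringIO` would be replaced by file paths, and a closing call to `df.plot()` (which needs matplotlib installed) could lead into the plotting section.
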
{% include links.md %}