Commit d7aed61a authored by Toby Hodges

cleared output from cells

parent 7374e006
%% Cell type:markdown id: tags:
# Introduction to Python Programming
%% Cell type:markdown id: tags:
## 2. Beginning Programming
%% Cell type:markdown id: tags:
#### First Steps in Programming
%% Cell type:markdown id: tags:
So far, we’ve had fun playing with commands at the Python Shell prompt, but now we need to start editing programs properly and saving them, so that we can change them and re-use parts later. Start the Spyder program (or another text editor of your choice) and open a new file to write your code into. There is no prompt like in the Python Shell window, just a space for you to edit your first program. When you finish a line and press enter here, nothing is executed. Instead, you need to save and run your script each time you want to execute any changes that you've made. In Spyder, this is easy, as the interface includes a small Python Shell window dedicated to the output of the code that you write in the editor.
%% Cell type:markdown id: tags:
Using an editor instead of the shell allows you to quickly go back and change code that you've already written, which can make it easier to correct typos, add additional lines, and 'debug' your script to help figure out where an error or unwanted behaviour is occurring. Although you can use the command history at the shell prompt to access your previous lines of code, it is often easier to keep your scripting separate from the output. Later, we will see an example of where using an editor is really useful.
%% Cell type:markdown id: tags:
Start by entering the following code:
%% Cell type:code id: tags:
``` python
shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']
for item in shopping:
    print(item)
```
%%%% Output: stream
bread
potatoes
eggs
flour
rubber duck
pizza
milk
%% Cell type:markdown id: tags:
This is a very simple program, which creates a variable (`shopping`) that refers to a list and then prints out each of the items in turn. There are a couple of things to comment on here. Firstly, the `for` statement creates the variable `item` (the variable name can of course be anything that you want), then assigns each element of the list to it in turn. The line that is indented is then executed once for each value assigned to the `item` variable, printing out this value.
%% Cell type:markdown id: tags:
To execute the program you first need to save it. You can save the file anywhere you like on your computer (it helps if you remember where), but it is a good idea (particularly when working in Windows) to give the file an extension of ".py". This will mean that the computer will recognise it as a Python program. Once you have saved the file, you can press F5 (or choose Run->Run module from the editor window’s menu) and the output should appear in the Python shell window.
%% Cell type:markdown id: tags:
Whenever we want to execute a bit of Python code several times, a for loop is one of the ways that we can do it. Python recognises the lines we want to form part of the loop by the level of indentation and it is vital that you maintain consistent indentation throughout your programs. For example, you can choose to indent lines of code with spaces or with tabs but, whichever one you choose, you should only use one or the other for your whole program. Also, make sure that you keep the amount of indentation consistent across all the levels in your code. You will find that this approach makes your programs easier to read and understand, because you can see the structure of the program at a glance by the indentation.
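%% Cell type:markdown id: tags:
To make the role of indentation concrete, here is a minimal sketch. The list `colours` and the counter `count` are made up for illustration and are not part of the shopping list program. The two indented lines belong to the loop and run once per element; the last line is back at the left margin, so it runs only once, after the loop has finished.
%% Cell type:code id: tags:

```python
# Hypothetical example data, used only to illustrate indentation
colours = ['red', 'green', 'blue']
count = 0

for colour in colours:
    # Both indented lines are inside the loop: they run once per colour
    print(colour)
    count = count + 1

# Not indented: this line is outside the loop and runs once, at the end
print(count)
```

%% Cell type:markdown id: tags:
If you indent the final `print(count)` line as well, it becomes part of the loop and the running count is printed after every colour.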
%% Cell type:markdown id: tags:
#### _Exercise 2.1_
%% Cell type:markdown id: tags:
Change the program above by adding a second list (with a different variable name) to the program, which contains cheese, flour, eggs, spaghetti, sausages and bread. Change the loop so that instead of printing the element, it appends it to the old list. Then, at the end, print out the new list.
%% Cell type:markdown id: tags:
#### Making Decisions
%% Cell type:markdown id: tags:
Don’t look at this if you haven’t done the exercise above. My solution:
%% Cell type:code id: tags:
``` python
shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']
extrashopping = ['cheese', 'flour', 'eggs', 'spaghetti', 'sausages', 'bread']
for item in extrashopping:
    shopping.append(item)
print(shopping)
```
%%%% Output: stream
['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk', 'cheese', 'flour', 'eggs', 'spaghetti', 'sausages', 'bread']
%% Cell type:markdown id: tags:
This looks like it’s worked exactly as I described, but maybe not quite as I intended. We seem to have too many eggs and too much bread. This might not be a problem (and it does show that the same value can be present in a list more than once), but I really just want one copy of each item. To achieve this, we need to include a check before we add an element to the list, to make sure that the value isn’t in there already. Fortunately, Python lets us do this really easily. For an example of this, go back to the Python Shell for a minute and try:
%% Cell type:code id: tags:
``` python
shopping = ['eggs', 'cheese', 'milk']
```
%% Cell type:code id: tags:
``` python
'eggs' in shopping
```
%%%% Output: execute_result
True
%% Cell type:code id: tags:
``` python
'frogs' in shopping
```
%%%% Output: execute_result
False
%% Cell type:code id: tags:
``` python
'frogs' not in shopping
```
%%%% Output: execute_result
True
%% Cell type:markdown id: tags:
We can use this in a new Python statement, which allows us to only execute statements if a particular condition is true. Back in the editor window, the program could be changed to:
%% Cell type:code id: tags:
``` python
shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']
extrashopping = ['cheese', 'flour', 'eggs', 'spaghetti', 'sausages', 'bread']
for item in extrashopping:
    if item not in shopping:
        shopping.append(item)
print(shopping)
```
%%%% Output: stream
['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk', 'cheese', 'spaghetti', 'sausages']
%% Cell type:markdown id: tags:
Much better. A couple of points to notice with the indentation. The `if` statement is indented with respect to the `for` statement, so it will be executed every time the loop executes. The `.append` method call is indented with respect to the `if` statement, and so it will only be executed if the condition in the `if` statement (i.e., `item not in shopping`) is true.
%% Cell type:markdown id: tags:
#### _Exercise 2.2_
%% Cell type:markdown id: tags:
Change the program above to print out a message when a duplicate item is found. To do this, you could add another `if` statement to see if the item is in the list. Alternatively, you can add an `else:` clause to the existing `if` statement. This will be executed when the condition in the `if` statement is false.
%% Cell type:markdown id: tags:
#### Counting Loops
%% Cell type:markdown id: tags:
Looping over elements of a list is great, but there are other circumstances where you just want to do something a set number of times. Most programming languages have a `for` statement which does exactly this, but Python doesn’t. Fortunately, Python has a function which generates a list of numbers for us to use in a `for` loop. Go to the Python shell and type:
%% Cell type:code id: tags:
``` python
list(range(10))  # list() is only needed in Python 3, where range does not return a list directly
```
%%%% Output: execute_result
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
%% Cell type:markdown id: tags:
So, as you will see from the output, a loop like this:
%% Cell type:code id: tags:
``` python
for i in range(10):
    print(i)
```
%%%% Output: stream
0
1
2
3
4
5
6
7
8
9
%% Cell type:markdown id: tags:
prints out the numbers 0 to 9 one to a line. The `range` function can provide most lists of numbers that you might need.
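%% Cell type:markdown id: tags:
As a small sketch of doing something a set number of times, the loop below adds up the numbers produced by `range(5)`. The variable name `total` is just an illustrative choice.
%% Cell type:code id: tags:

```python
# Add up the numbers 0, 1, 2, 3 and 4
total = 0
for i in range(5):
    total = total + i  # runs five times, once for each value of i

print(total)
```

%% Cell type:markdown id: tags:
The loop body runs exactly five times, so `total` ends up as 0 + 1 + 2 + 3 + 4.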
%% Cell type:markdown id: tags:
#### _Exercise 2.3_
%% Cell type:markdown id: tags:
Explore what you can do with the `range` function. It can take just one number as we did above, or two as starting and ending values, or even three - the start, the end and a step value. Try all three versions of the `range` command, and then work out how to produce the list: `[4, 11, 18, 25]`.
%% Cell type:markdown id: tags:
#### Direct and Indirect Loops
%% Cell type:markdown id: tags:
So, `range` can get us a list that we can use to count to any number that we want, but why does it stop short of the upper limit we give it? Why does `range(N)` mean 0..N-1 instead of 0..N or 1..N? Well, try out the following two pieces of code:
%% Cell type:code id: tags:
``` python
for item in shopping:
    print(item)
```
%%%% Output: stream
bread
potatoes
eggs
flour
rubber duck
pizza
milk
cheese
spaghetti
sausages
%% Cell type:markdown id: tags:
and
%% Cell type:code id: tags:
``` python
for i in range(len(shopping)):
    print(shopping[i])
```
%%%% Output: stream
bread
potatoes
eggs
flour
rubber duck
pizza
milk
cheese
spaghetti
sausages
%% Cell type:markdown id: tags:
They should be exactly the same: `range` behaves as it does so that you can use it to generate lists of indexes for sequence data types. In the blocks of code above, the first is an example of a direct loop, where you pull out the items one by one directly from the list. The second is an indirect loop, where you step through the indices and use them to access the required elements from the list.
%% Cell type:markdown id: tags:
Which one is better? Generally, the direct method is slightly clearer and a bit more _Pythonesque_. However, there are circumstances where an indirect loop is the only option. If you have two lists of the same size, you might need to print out the corresponding elements of the two lists (although there might be better ways to do this, as well). In this case, you can use `range` with the size of one of the lists, and then use the index to get the corresponding elements from both.
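%% Cell type:markdown id: tags:
The two-list case described above could be sketched as follows. The lists `planets` and `moons` are illustrative and chosen only because they have the same length; `report` is an extra list used here to keep a copy of each printed line.
%% Cell type:code id: tags:

```python
# Two parallel lists: moons[i] belongs to planets[i]
planets = ['Mercury', 'Venus', 'Earth']
moons = [0, 0, 1]

report = []
for i in range(len(planets)):
    # The same index i picks the matching element from each list
    line = planets[i] + ': ' + str(moons[i])
    report.append(line)
    print(line)
```

%% Cell type:markdown id: tags:
This only works while the two lists stay the same length and in the same order, which is why it needs some care.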
%% Cell type:markdown id: tags:
#### _Exercise 2.4_
%% Cell type:markdown id: tags:
Start with your shopping list (or a new, shorter one to save some typing) and create a new list with the amounts you need to buy of each item. So for example:
%% Cell type:code id: tags:
``` python
shopping = ['bicycle pump', 'sofa', 'yellow paint']
amounts = ['1', '7', '9']
```
%% Cell type:markdown id: tags:
Then write a loop to step through and print the item and the amount on the same line.
%% Cell type:markdown id: tags:
#### String Formatting
%% Cell type:markdown id: tags:
When you print out pairs of values like in the exercise above, the output is a bit boring. It’s just a name and a number on a line. It could be a bit prettier, or at least more nicely formatted. You can put a few extra strings in there to make it clearer like this,
%% Cell type:markdown id: tags:
`print 'I need to buy', amounts[i], shopping[i] # Python v2`
%% Cell type:markdown id: tags:
or this,
%% Cell type:markdown id: tags:
`print('I need to buy', amounts[i], shopping[i]) # Python v3`
%% Cell type:code id: tags:
``` python
for i in range(len(shopping)):
    print 'I need to buy', amounts[i], shopping[i]
```
%%%% Output: stream
I need to buy 1 bicycle pump
I need to buy 7 sofa
I need to buy 9 yellow paint
%% Cell type:markdown id: tags:
which is maybe a bit better. Taking this approach is ok, but it is difficult to control the formatting, particularly when you are mixing numbers and strings. Most programming languages have some function or facility for creating formatted strings and Python is no exception __[note]__.
%% Cell type:markdown id: tags:
In Python's case, formatting of strings can be taken care of by using the `.format` method or by using the `%` operator that is common amongst a lot of languages. We will use the newer `.format()` approach, but you might prefer to use `%` - I recommend that you [read this](https://pyformat.info) for a good introduction and comparison of the two approaches.
%% Cell type:markdown id: tags:
When formatting, you start with a string containing placeholders: patterns of characters that indicate the position where you want to insert the values of your variables, and their format. Then, the variables to be inserted are supplied using the `format()` method of this string. The placeholders take the form of curly brackets `{}` containing a code that tells Python what to do with the variables being inserted. For example:
%% Cell type:code id: tags:
``` python
s = 'I need to buy {} {}'.format(7, 'snakes')
s
```
%%%% Output: execute_result
'I need to buy 7 snakes'
%% Cell type:markdown id: tags:
Don't we all? In the example above, we didn't place anything inside the curly brackets, so the values of the variables provided as arguments to the `format()` method were inserted in the order and format that they were given. However, you can specify the order of insertion by including a number between the curly brackets, like so:
%% Cell type:code id: tags:
``` python
s = 'I need to buy {0} {1} because I have {0} {2}'.format(7, 'mice', 'snakes to feed')
s
```
%%%% Output: execute_result
'I need to buy 7 mice because I have 7 snakes to feed'
%% Cell type:markdown id: tags:
The placeholders can also contain information for formatting the inserted value. For example, to control the precision with which a floating point number is displayed, you can use `{:.Nf}`, where `N` is the number of decimal places that you want to display.
%% Cell type:code id: tags:
``` python
mousePrice = 9.5
numberOfMice = 7
s = 'Each mouse costs EUR {:.2f} and I need {} mice, so the total cost will be EUR {:.2f}'\
    .format(mousePrice, numberOfMice, mousePrice*numberOfMice)
s
```
%%%% Output: execute_result
'Each mouse costs EUR 9.50 and I need 7 mice, so the total cost will be EUR 66.50'
%% Cell type:markdown id: tags:
There are a lot of other formatting options that can be controlled by these patterns in placeholders e.g. you can automatically print large numbers split with commas, or you can print text in clearly-defined columns buffered with whitespace. For a full list and explanation, you should check out the Python documentation at https://docs.python.org/2/library/string.html#format-string-syntax.
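%% Cell type:markdown id: tags:
As a brief sketch of the two options just mentioned: `{:,}` inserts commas as thousands separators, while `{:<10}` and `{:>5}` pad a value to a fixed width, aligned left and right respectively. The variable names and values below are made up for illustration.
%% Cell type:code id: tags:

```python
# Thousands separators for a large number
population = 1234567
withCommas = '{:,}'.format(population)
print(withCommas)

# Fixed-width columns: item left-aligned in 10 characters,
# amount right-aligned in 5 characters
row = '{:<10}{:>5}'.format('eggs', 12)
print(row)
```

%% Cell type:markdown id: tags:
Using the same widths on every line makes the values line up in columns when several rows are printed one after another.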
%% Cell type:markdown id: tags:
#### _Exercise 2.5_
%% Cell type:markdown id: tags:
Change your program to print out a formatted message for each of the items in your shopping list along with the amount you need to buy of that item.
%% Cell type:markdown id: tags:
#### Looking Up Data
%% Cell type:markdown id: tags:
Keeping data in parallel lists like this is fine if you are really careful and you don’t need to change the lists much. Otherwise, it is prone to errors. One way of getting around this (and our last new data type) is to use a _dictionary_. Dictionaries are sort of like lists, but instead of each entry holding just a single value, it holds a key-value pair. So, when you want to look up a value in the dictionary, you specify the key and the dictionary returns the value, rather than just using an index. An example might help:
%% Cell type:code id: tags:
``` python
studentNumbers = {
    'Bioscience Technology': 16,
    'Computational Biology': 12,
    'Post-Genomic Biology': 20,
    'Ecology and Environmental Management': 3,
    'Maths in the Living Environment': 0
}
studentNumbers['Bioscience Technology']
```
%%%% Output: execute_result
16
%% Cell type:markdown id: tags:
The data is enclosed in curly brackets and is a comma separated list of key-value pairs. The key and value are separated by `:`. The key can be any immutable type (so, mainly strings, numbers or tuples). Notice I have split the assignment statement to create the dictionary over several lines, to make it easier to read. Normally, Python expects a command to be on a single line, but sometimes it recognises that a command isn’t finished and lets you continue on the next line. This mainly happens when you haven’t closed a set of brackets, which in the above example was deliberate, but in my case is usually because I have forgotten. Python will continue to prompt for input until you close the bracket properly before trying to execute the command.
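%% Cell type:markdown id: tags:
To illustrate the point about key types, here is a small sketch of a dictionary that mixes string, number and tuple keys. The name `mixedKeys` and its contents are hypothetical.
%% Cell type:code id: tags:

```python
# Keys of three different immutable types in one dictionary
mixedKeys = {
    'eggs': 12,             # a string key
    42: 'the answer',       # a number key
    (0, 1): 'a tuple key'   # a tuple key
}

print(mixedKeys['eggs'])
print(mixedKeys[42])
print(mixedKeys[(0, 1)])
```

%% Cell type:markdown id: tags:
A list cannot be used as a key, because lists can be changed after they are created.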
%% Cell type:markdown id: tags:
Dictionaries themselves are a mutable datatype, so the values associated with a key can be changed:
%% Cell type:code id: tags:
``` python
studentNumbers['Bioscience Technology'] += 1 # x += 1 does the same as x = x + 1
studentNumbers['Bioscience Technology']
```
%%%% Output: execute_result
17
%% Cell type:markdown id: tags:
If you try to assign a value to a key that doesn’t exist, Python creates the entry for you automatically:
%% Cell type:code id: tags:
``` python
studentNumbers['Gardening'] = 10
studentNumbers['Gardening']
```
%%%% Output: execute_result
10
%% Cell type:markdown id: tags:
Getting rid of entries in the dictionary is easy as well, using the `del` statement:
%% Cell type:code id: tags:
``` python
del studentNumbers['Maths in the Living Environment']
studentNumbers
```
%%%% Output: execute_result
{'Bioscience Technology': 17,
'Computational Biology': 12,
'Ecology and Environmental Management': 3,
'Gardening': 10,
'Post-Genomic Biology': 20}
%% Cell type:markdown id: tags:
If we know the keys in the dictionary we can look up the values. If we want to loop over the values in the dictionary, we could create a list of the keys and loop over that, but that’s no better than keeping the keys and values in separate lists. Instead, Python can create a list of the keys for you when you need it:
%% Cell type:code id: tags:
``` python
studentNumbers.keys()
```
%%%% Output: execute_result
['Gardening',
'Computational Biology',
'Post-Genomic Biology',
'Bioscience Technology',
'Ecology and Environmental Management']
%% Cell type:markdown id: tags:
We can now put this into a `for` loop, with or without sorting it first. If we are not bothered about the order, then we can just loop directly over the dictionary:
%% Cell type:code id: tags:
``` python
for item in studentNumbers:
    print item, studentNumbers[item]
```
%%%% Output: stream
Gardening 10
Computational Biology 12
Post-Genomic Biology 20
Bioscience Technology 17
Ecology and Environmental Management 3
%% Cell type:markdown id: tags:
That should work as expected. Note that in Python 2 (and in Python 3 before version 3.7) Python doesn’t make any promises about the order the keys will be supplied in: they will be given the way Python thinks is best, which is unlikely to be either the order the keys were added to the dictionary or alphabetical order. From Python 3.7 onwards, dictionaries do preserve the order in which the keys were added.
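%% Cell type:markdown id: tags:
If the order does matter, one option is to sort the keys before looping, using the built-in `sorted()` function. The sketch below uses a separate, smaller dictionary called `exampleNumbers` (a made-up name) so that `studentNumbers` is left untouched.
%% Cell type:code id: tags:

```python
# A small dictionary used only for this illustration
exampleNumbers = {
    'Gardening': 10,
    'Computational Biology': 12,
    'Bioscience Technology': 17
}

# sorted() returns a new list of the keys in alphabetical order
orderedCourses = sorted(exampleNumbers)

for course in orderedCourses:
    print('{}: {}'.format(course, exampleNumbers[course]))
```

%% Cell type:markdown id: tags:
The dictionary itself is not changed by `sorted()`; only the order in which the loop visits the keys is affected.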
%% Cell type:markdown id: tags:
As well as getting the keys, you could also get the values as a list using `.values()`. Slightly more efficient is to get the key-value pairs in one step using `.items()`:
%% Cell type:code id: tags:
``` python
studentNumbers.values()
```
%%%% Output: execute_result
[10, 12, 20, 17, 3]
%% Cell type:code id: tags:
``` python
studentNumbers.items()
```
%%%% Output: execute_result
[('Gardening', 10),
('Computational Biology', 12),
('Post-Genomic Biology', 20),
('Bioscience Technology', 17),
('Ecology and Environmental Management', 3)]
%% Cell type:markdown id: tags:
Have a careful look at this output. The square brackets show that this is a list of things. But each item in that list is in fact two pieces of data in round brackets. This is a tuple, which we came across briefly above. There are two ways we can use this in a `for` loop. Firstly, we can use a variable which will contain the tuple and unpack it in the body of the loop:
%% Cell type:code id: tags:
``` python
for data in studentNumbers.items():
    print data[0], data[1]
```
%%%% Output: stream
Gardening 10
Computational Biology 12
Post-Genomic Biology 20
Bioscience Technology 17
Ecology and Environmental Management 3
%% Cell type:markdown id: tags:
or (this is usually my preference) you can unpack the data directly and more explicitly in the `for` statement:
%% Cell type:code id: tags:
``` python
for course, students in studentNumbers.items():
    print course, students
```
%%%% Output: stream
Gardening 10
Computational Biology 12
Post-Genomic Biology 20
Bioscience Technology 17
Ecology and Environmental Management 3
%% Cell type:markdown id: tags:
This is a little terse, so let's use the `.format()` method that was introduced earlier.
%% Cell type:code id: tags:
``` python
for course, students in studentNumbers.items():
    print('Course {} has {} students'.format(course, students))
```
%%%% Output: stream
Course Gardening has 10 students
Course Computational Biology has 12 students
Course Post-Genomic Biology has 20 students
Course Bioscience Technology has 17 students
Course Ecology and Environmental Management has 3 students
%% Cell type:markdown id: tags:
The output of `.items()` is our first example of a compound data structure (in this case a list of tuples). The ability to easily construct arbitrarily complex data structures like this is one of the most powerful features of Python and one we will explore more in the next worksheet.
%% Cell type:markdown id: tags:
#### _Exercise 2.6_
%% Cell type:markdown id: tags:
Go back to your shopping list code from exercise 2.4 and change the program so that the amounts and shopping items are stored in a dictionary, then print out the items and their respective amounts by looping over the dictionary. Do it twice: once looping over the dictionary to get the keys (and using the keys to get the values), and once by getting the key-value pairs directly from the dictionary.
%% Cell type:markdown id: tags:
#### Parcelling Up Code
%% Cell type:markdown id: tags:
Often we come across situations where we would want to do the same type of calculation several times in a single program. Many of the Python modules provide functions for doing just this (and some of you will probably have used the `math.sqrt()` function earlier). However, you can define your own functions if you want. This can be done anywhere in your program, but is conventionally done at the beginning. In any case, the important thing is that you define a function before you try to use it in your program.
%% Cell type:markdown id: tags:
As a trivial example, here is a function definition which squares a number:
%% Cell type:code id: tags:
``` python
def square(x):
    return x*x
```
%% Cell type:markdown id: tags:
When Python comes across this in your program, it does nothing visible. Only afterwards when you call the function does it produce any effect. The `x` between the brackets in the `def` line is called an argument, and acts as a placeholder for whatever (in this case) you want to square. Once the function is defined, you can call it using anything in place of the `x`. For example to square the number 3, you would use:
%% Cell type:code id: tags:
``` python
square(3)
```
%%%% Output: execute_result
9
%% Cell type:markdown id: tags:
If you wanted to store the result in a variable, you could use
%% Cell type:code id: tags:
``` python
y = square(3)
```
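%% Cell type:markdown id: tags:
As a further sketch, the argument passed to a function does not have to be a literal number: it can be a variable or an expression, and the function can be called repeatedly inside a loop. The names `side`, `area`, `bigger` and `squares` are illustrative, and `square` is redefined here only so that the cell runs on its own.
%% Cell type:code id: tags:

```python
def square(x):
    return x * x

side = 4
area = square(side)         # a variable as the argument
bigger = square(side + 1)   # an expression as the argument

# Calling the function once per loop iteration
squares = []
for n in range(4):
    squares.append(square(n))

print(area)
print(bigger)
print(squares)
```

%% Cell type:markdown id: tags:
Inside the function, `x` takes whatever value was passed in for that particular call, so the same definition serves all of these uses.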
%% Cell type:markdown id: tags: