{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to Python Programming"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Beginning Programming"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### First Steps in Programming"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So far, we’ve had fun playing with commands at the Python Shell prompt, but now we need to start editing programs properly and saving them, so that we can change them and re-use parts later.  Start the Spyder program (or another text editor of your choice) and open a new file to write your code into. There is no prompt like the one in the Python Shell window, just a space for you to edit your first program. When you finish a line and press Enter here, nothing is executed. Instead, you need to save and run your script each time you want to execute the changes that you've made. In Spyder this is easy, as the interface includes a small Python Shell window dedicated to the output of the code that you write in the editor."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using an editor instead of the shell allows you to quickly go back and change code that you've already written, which can make it easier to correct typos, add additional lines, and 'debug' your script to help figure out where an error or unwanted behaviour is occurring. Although you can use the command history at the shell prompt to access your previous lines of code, it is often easier to keep your scripting separate from the output. Later, we will see an example of where using an editor is really useful."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Start by entering the following code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']\n",
    "for item in shopping:\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is a very simple program, which creates a variable (`shopping`) that refers to a list and then prints out each of the items in turn.  There are a couple of things to comment on here.  Firstly, the `for` statement creates the variable `item` (the variable name can of course be anything that you want), then sets the value to each of the elements in the list.  The line that is indented is then executed for each value assigned to the `item` variable, printing out this value."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To execute the program you first need to save it.  You can save the file anywhere you like on your computer (it helps if you remember where), but it is a good idea (particularly when working in Windows) to give the file an extension of \".py\".  This means that the computer will recognise it as a Python program.  Once you have saved the file, you can press F5 (or choose the run command from the editor’s Run menu) and the output should appear in the Python shell window."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Whenever we want to execute a bit of Python code several times, a for loop is one of the ways that we can do it.  Python recognises the lines we want to form part of the loop by the level of indentation and it is vital that you maintain consistent indentation throughout your programs. For example, you can choose to indent lines of code with spaces or with tabs but, whichever one you choose, you should only use one or the other for your whole program. Also, make sure that you keep the amount of indentation consistent across all the levels in your code. You will find that this approach makes your programs easier to read and understand, because you can see the structure of the program at a glance by the indentation."
   ]
  },
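  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of how indentation marks out the body of a loop (the list contents and the `count` variable are made up for illustration), the two indented lines below run once for every item, while the final unindented line runs only once, after the loop has finished:\n",
    "\n",
    "```python\n",
    "shopping = ['bread', 'eggs', 'milk']\n",
    "count = 0\n",
    "for item in shopping:\n",
    "    # indented: part of the loop, executed once per item\n",
    "    count = count + 1\n",
    "    print(count, item)\n",
    "# not indented: outside the loop, executed once at the end\n",
    "print('Items in the list:', count)\n",
    "```\n",
    "\n",
    "If you indented the last line to match the others, it would be printed three times instead of once."
   ]
  },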
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### _Exercise 2.1_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Change the program above by adding a second list (with a different variable name), which contains cheese, flour, eggs, spaghetti, sausages and bread.  Change the loop so that it steps through this second list and, instead of printing each element, appends it to the original `shopping` list.  Then, at the end, print out the combined list."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Making Decisions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Don’t look at this if you haven’t done the exercise above. My solution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']\n",
    "extrashopping = ['cheese', 'flour', 'eggs', 'spaghetti', 'sausages', 'bread']\n",
    "for item in extrashopping:\n",
    "    shopping.append(item)\n",
    "print(shopping)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This looks like it’s worked exactly as I described, but maybe not quite as I intended.  We seem to have too many eggs and too much bread.  This might not be a problem (and it does show that the same value can be present in a list more than once), but I really just want one copy of each item.  To achieve this, we need to include a check before we add an element to the list, to make sure that the value isn’t in there already. Fortunately, Python lets us do this really easily.   For an example of this, go back to the Python Shell for a minute and try:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "shopping = ['eggs', 'cheese', 'milk']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "'eggs' in shopping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "'frogs' in shopping "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "'frogs' not in shopping"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use this in a new Python statement, which allows us to only execute statements if a particular condition is true.  Back in the editor window, the program could be changed to:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']\n",
    "extrashopping = ['cheese', 'flour', 'eggs', 'spaghetti', 'sausages', 'bread']\n",
    "for item in extrashopping:\n",
    "    if item not in shopping:\n",
    "        shopping.append(item)\n",
    "print(shopping) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Much better.  A couple of points to notice with the indentation.  The `if` statement is indented with respect to the `for` statement, so it will be executed every time the loop executes.  The `.append` method call is indented with respect to the `if` statement, and so it will only be executed if the condition in the `if` statement (i.e., `item not in shopping`) is true."
   ]
  },
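  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The effect of the two indentation levels can be seen by counting how often each one runs. This is only an illustrative sketch: the lists and the `checked` and `added` counters are invented for the example.\n",
    "\n",
    "```python\n",
    "basket = ['tea']\n",
    "deliveries = ['tea', 'jam', 'tea', 'rice']\n",
    "checked = 0\n",
    "added = 0\n",
    "for item in deliveries:\n",
    "    # indented under 'for': runs for every item in deliveries\n",
    "    checked = checked + 1\n",
    "    if item not in basket:\n",
    "        # indented under 'if': runs only when the condition is true\n",
    "        added = added + 1\n",
    "        basket.append(item)\n",
    "print(checked, added, basket)\n",
    "```\n",
    "\n",
    "Every one of the four deliveries is checked, but only the two that were not already in the basket are added."
   ]
  },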
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### _Exercise 2.2_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(i) Change the program above to print out a message when a duplicate item is found.  To do this, you could add another `if` statement to see if the item is in the list. Alternatively, you can add an `else:` clause to the existing `if` statement.  This will be executed when the condition in the `if` statement is false.\n",
    "\n",
    "(ii) The example illustrated above is not the only solution to adding items to a list, whilst checking for duplicates. From the three choices below, choose the version that would achieve the same goal:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "a)\n",
    "```\n",
    "shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']\n",
    "extrashopping = ['cheese', 'flour', 'eggs', 'spaghetti', 'sausages', 'bread']\n",
    "for item in extrashopping:\n",
    "    if item not in shopping:   \n",
    "        print(item, \"is already in the list.\")\n",
    "    else: \n",
    "        shopping.append(item)\n",
    "print(shopping) \n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "b)\n",
    "```\n",
    "shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']\n",
    "extrashopping = ['cheese', 'flour', 'eggs', 'spaghetti', 'sausages', 'bread']\n",
    "for item in extrashopping:\n",
    "    if item in shopping:\n",
    "        shopping.append(item)\n",
    "    else: \n",
    "        print(item, \"is already in the list.\")\n",
    "print(shopping)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "c)\n",
    "```\n",
    "shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']\n",
    "extrashopping = ['cheese', 'flour', 'eggs', 'spaghetti', 'sausages', 'bread']\n",
    "for item in extrashopping:\n",
    "    if item in shopping:\n",
    "        print(item, \"is already in the list.\")\n",
    "    else: \n",
    "        shopping.append(item)\n",
    "print(shopping)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Counting Loops"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looping over elements of a list is great, but there are other circumstances where you just want to do something a set number of times.  Most programming languages have a `for` statement which does exactly this, but Python doesn’t.  Fortunately, Python has a function which generates a list of numbers for us to use in a `for` loop.  Go to the Python shell and type:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "range(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `range()` function returns a `range` object, which produces a sequence of integers one at a time as they are needed. In Python 2, `range()` directly creates a list of integers instead, which is technically slightly different. In either case, a loop like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for i in range(10):\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "prints out the numbers 0 to 9 one to a line.  You can use the `range` function to produce most lists of numbers that you might need."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Note__ At first glance, it might seem inconvenient that you don't get a list as output from the `range()` function in Python 3. The reason is that a `range` object never stores all of its values at once, which makes it a much more memory-efficient way of producing values that will be looped through one at a time, and this is the aim of the vast majority of calls to the `range()` function. Of course, if you do actually want to create a full list of integer values in a range, in Python 3 you can build the list explicitly from the `range` object, as below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "range_list = list(range(10)) # or\n",
    "range_list = [*range(10)] # the * unpacks the values; [range(10)] would hold a single range object"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### _Exercise 2.3_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Explore what you can do with the `range` function.  It can take just one number as we did above, or two as starting and ending values, or even three - the start, the end and a step value.  Try all three versions of the `range` command, and then work out how to produce the list: `[4, 11, 18, 25]`. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Direct and Indirect Loops"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, `range` can get us a list that we can use to count to any number that we want, but why does it stop short of the upper limit we give it?  Why does `range(N)` mean 0..N-1 instead of 0..N or 1..N? Well, try out the following two pieces of code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for item in shopping:\n",
    "    print(item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for i in range(len(shopping)):\n",
    "    print(shopping[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "They should produce exactly the same output: `range` behaves as it does so that you can use it to generate indexes for sequence data types. In the blocks of code above, the first is an example of a direct loop, where you pull out the items one by one directly from the list. The second is an indirect loop, where you step through the indices and use them to access the required elements from the list. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Which one is better?  Generally, the direct method is slightly clearer and a bit more _Pythonesque_.  However, there are circumstances where an indirect loop is the only option.  If you have two lists of the same size, you might need to print out the corresponding elements of the two lists (although there might be better ways to do this, as well).  In this case, you can use `range` with the size of one of the lists, and then use the index to get the corresponding elements from both."
   ]
  },
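  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One of the possibly better ways hinted at above is the built-in `zip()` function, which is not otherwise covered in this section. As a hedged sketch with made-up lists, it pairs up corresponding elements so that the loop needs no index:\n",
    "\n",
    "```python\n",
    "names = ['Ada', 'Grace', 'Alan']\n",
    "born = [1815, 1906, 1912]\n",
    "pairs = []\n",
    "for name, year in zip(names, born):\n",
    "    # each pass through the loop receives one element from each list\n",
    "    print(name, year)\n",
    "    pairs.append((name, year))\n",
    "```\n",
    "\n",
    "If the lists have different lengths, `zip()` stops at the end of the shortest one, so the indirect loop is still worth knowing."
   ]
  },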
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### _Exercise 2.4_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Start with your shopping list (or a new, shorter one to save some typing) and create a new list with the amounts you need to buy of each item.  So for example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "shopping = ['bicycle pump', 'sofa', 'yellow paint']\n",
    "amounts = ['1', '7', '9'] "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then write a loop to step through and print the item and the amount on the same line. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### String Formatting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When you print out pairs of values like in the exercise above, the output is a bit boring.  It’s just a name and a number on a line.  It could be a bit prettier, or at least more nicely formatted.  You can put a few extra strings in there to make it clearer like this,"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print(\"I need to buy\", amounts[i], shopping[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In Python 2.x this would be: `print 'I need to buy', amounts[i], shopping[i]`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for i in range(len(shopping)):\n",
    "    print('I need to buy', amounts[i], shopping[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "which is maybe a bit better. Taking this approach is ok, but it is difficult to control the formatting, particularly when you are mixing numbers and strings. Most programming languages have some function or facility for creating formatted strings and Python is no exception."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In Python's case, formatting of strings can be taken care of in several different ways. \n",
    "\n",
    "1. by using the `%` operator that is common amongst a lot of languages\n",
    "2. by using the `.format` method, or\n",
    "3. (from Python v3.6 onwards) by using the `f''` syntax with variable names in placeholders.\n",
    "\n",
    "Let's compare these options. We have three variables, `name`, `date` and `job`, which we want to substitute into some text. It's possible to control exactly the formatting of values such as dates when constructing strings like this, but for simplicity here we perform simple string substitutions at each step."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```\n",
    "#variables for substitution\n",
    "name = 'Betty'\n",
    "date = '15th June 2016'\n",
    "job = 'engineer'\n",
    "\n",
    "# option 1 - using the % operator\n",
    "text = 'Hi, my name is %s and I am an %s. I have been an %s since %s.' % (name, job, job, date)\n",
    "print('1. using %')\n",
    "print(text)\n",
    "\n",
    "# option 2 - using the .format method of the string object\n",
    "text = 'Hi, my name is {0} and I am an {1}. I have been an {1} since {2}.'.format(name, job, date)\n",
    "print('2. using .format()')\n",
    "print(text)\n",
    "\n",
    "# option 3 - using f'' (v3.6 only)\n",
    "text = f'Hi, my name is {name} and I am an {job}. I have been an {job} since {date}.'\n",
    "print('3. using f\\'\\' (v3.6 only)')\n",
    "print(text)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```\n",
    "1. using %\n",
    "Hi, my name is Betty and I am an engineer. I have been an engineer since 15th June 2016.\n",
    "2. using .format()\n",
    "Hi, my name is Betty and I am an engineer. I have been an engineer since 15th June 2016.\n",
    "3. using f'' (v3.6 only)\n",
    "Hi, my name is Betty and I am an engineer. I have been an engineer since 15th June 2016.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From now on, we will use the newer `.format()` approach, but you might prefer to use `%` (or `f''` if you are using v3.6 or later). I recommend that you [read this](https://pyformat.info) for a good introduction and comparison of the `.format` and `%` approaches."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When formatting, you start with a string containing placeholders: patterns of characters that indicate the position where you want to insert the values of your variables, and their format. Then, the variables to be inserted are supplied using the `format()` method of this string. The placeholders take the form of curly brackets `{}` containing a code that tells Python what to do with the variables being inserted. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "s = 'I need to buy {} {}'.format(7, 'snakes')\n",
    "s"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Don't we all? In the example above, we didn't place anything inside the curly brackets, so the values of the variables provided as arguments to the `format()` method were inserted in the order and format that they were given. However, you can specify the order of insertion by including a number between the curly brackets, like so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "s = 'I need to buy {0} {1} because I have {0} {2}'.format(7, 'mice', 'snakes to feed')\n",
    "s"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "The placeholders can also contain information for formatting the inserted value. For example, to control level of precision on a floating point number, you can use `{:.Nf}` where `N` is the number of decimal places that you want to display."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "mousePrice = 9.5\n",
    "numberOfMice = 7\n",
    "s = 'Each mouse costs EUR {:.2f} and I need {} mice, so the total cost will be EUR {:.2f}'\\\n",
    ".format(mousePrice, numberOfMice, mousePrice*numberOfMice)\n",
    "s"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are a lot of other formatting options that can be controlled by these patterns in placeholders: for example, you can automatically print large numbers split with commas, or you can print text in clearly-defined columns padded with whitespace. For a full list and explanation, you should check out the Python documentation at https://docs.python.org/3/library/string.html#format-string-syntax."
   ]
  },
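  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small sketch of the two options just mentioned (the values are invented for illustration): a `,` inside a placeholder inserts thousands separators, while `<` or `>` followed by a width pads the value with spaces on the right or on the left.\n",
    "\n",
    "```python\n",
    "population = 1234567\n",
    "line = 'Population: {:,}'.format(population)\n",
    "print(line)\n",
    "\n",
    "# left-align the item in 10 characters, right-align the amount in 6\n",
    "row = '{:<10}|{:>6}|'.format('eggs', 12)\n",
    "print(row)\n",
    "```\n",
    "\n",
    "Using the same widths on every line is what makes the values line up in columns."
   ]
  },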
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### _Exercise 2.5_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the example below, we have changed the program so it prints out a formatted message for each of the items in the shopping list along with the amount that needs to be bought of that item. Parts of the program are missing. You need to fill them in:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']\n",
    "amounts = ['1', '10', '12', '1', '2', '5', '1']\n",
    "--- i in range(len(---)):\n",
    "    s = 'I need to buy --- ---'.format(amounts[---], ---[i])\n",
    "    print(---)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Looking Up Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keeping data in parallel lists like this is fine if you are really, really careful and you don’t need to change the lists much. Otherwise, it is prone to errors. One way of getting around this (and our last new data type) is to use a _dictionary_. Dictionaries are sort of like lists, but instead of each entry holding just a single value, it holds a key-value pair. So, when you want to look up a value in the dictionary, you specify the key, rather than an index, and the dictionary returns the value. An example might help: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "studentNumbers = { 'Bioscience Technology': 16, \n",
    "                   'Computational Biology': 12,\n",
    "                   'Post-Genomic Biology': 20,\n",
    "                   'Ecology and Environmental Management': 3,\n",
    "                   'Maths in the Living Environment': 0\n",
    "                 }\n",
    "studentNumbers['Bioscience Technology']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data is enclosed in curly brackets and is a comma-separated list of key-value pairs. The key and value are separated by `:`. The key can be any immutable type (so, mainly strings, numbers or tuples). Notice that I have split the assignment statement that creates the dictionary over several lines, to make it easier to read. Normally, Python expects a command to be on a single line, but sometimes it recognises that a command isn’t finished and lets you continue on the next line. This mainly happens when you haven’t closed a set of brackets, which in the above example was deliberate, but in my case is usually because I have forgotten. Python will continue to prompt for input until you close the bracket properly before trying to execute the command. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Dictionaries themselves are a mutable datatype, so the values associated with a key can be changed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "studentNumbers['Bioscience Technology'] += 1 # x += 1 does the same as x = x + 1\n",
    "studentNumbers['Bioscience Technology']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you try to assign a value to a key that doesn’t exist, Python creates the entry for you automatically:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "studentNumbers['Gardening'] = 10\n",
    "studentNumbers['Gardening']"
   ]
  },
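  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reading a key that doesn't exist is a different matter: it raises a `KeyError` rather than creating an entry. Below is a minimal sketch (with an invented `stock` dictionary) of using the `in` operator, which tests the keys of a dictionary, to check before looking a value up:\n",
    "\n",
    "```python\n",
    "stock = {'eggs': 12, 'milk': 1}\n",
    "stock['bread'] = 2  # assignment creates the missing entry\n",
    "\n",
    "for item in ['bread', 'frogs']:\n",
    "    if item in stock:\n",
    "        # safe to look up: the key is present\n",
    "        print(item, stock[item])\n",
    "    else:\n",
    "        print(item, 'is not in the dictionary')\n",
    "```\n",
    "\n",
    "Without the check, `stock['frogs']` would stop the program with an error."
   ]
  },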
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Getting rid of entries in the dictionary is easy as well, using the `del` statement: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "del studentNumbers['Maths in the Living Environment']\n",
    "studentNumbers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we know the keys in the dictionary we can look up the values.  If we want to loop over the values in the dictionary, we could create a list of the keys ourselves and loop over that, but that’s no better than keeping the keys and values in separate lists.  Instead, Python can supply the keys for you when you need them: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "studentNumbers.keys()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now put this into a `for` loop, with or without sorting it first.  If we are not bothered about the order, then we can use `for` and `in` to loop directly over the keys in the dictionary: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for key in studentNumbers:\n",
    "    print(key, studentNumbers[key])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That should work as expected. Older versions of Python (before 3.6) make no promises about the order in which the keys are supplied, whereas from Python 3.7 onwards the keys are guaranteed to come back in the order they were added to the dictionary. Either way, don’t expect alphabetical order unless you sort the keys yourself*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*__Note__ The way that dictionaries are implemented in Python fundamentally changed in v3.6, making them noticeably more compact in memory. A side effect of this is that dictionary objects in Python 3.6 remember the order that entries were created in, and since v3.7 this insertion order is an official, guaranteed part of the language. Regardless, in the examples and exercises in this course, we assume that this order cannot be relied upon, because we don't expect everyone to be using v3.7 or above. When writing your own code, if you want to access dictionary entries in a particular order, you should make sure to do so by providing keys in a specific order, as we will show below."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As well as getting the keys, you can also get the values using `.values()`. If you need both, it is slightly more efficient to get the key-value pairs in one step using `.items()`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "studentNumbers.values()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "studentNumbers.items() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Have a careful look at this output. The square brackets inside `dict_items(...)` show that this is a list-like collection of things, and each item in it is in fact two pieces of data in round brackets. We came across this briefly above: it is a tuple. There are two ways we can use this in a `for` loop. Firstly, we can use a variable which will contain the tuple and unpack it in the body of the loop: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for data in studentNumbers.items():\n",
    "    print(data[0], data[1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "or (this is usually my preference) you can unpack the data directly and more explicitly in the `for` statement:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for course, students in studentNumbers.items():\n",
    "    print(course, students)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is a little terse, so let's use the `.format()` method that was introduced earlier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "for course, students in studentNumbers.items():\n",
    "    print('Course {} has {} students'.format(course, students))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The output of `.items()` is our first example of a compound data structure (in this case a list-like collection of tuples). The ability to easily construct arbitrarily complex data structures like this is one of the most powerful features of Python and one we will explore more in the next worksheet."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### _Exercise 2.6_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Go back to your shopping list code from exercise 2.5 and change the program so that the amounts and shopping items are stored in a dictionary, then print out the items and their respective amounts by looping over the dictionary. Do it twice: once by looping over the dictionary to get the keys (and using the keys to get the values), and once by getting the key-value pairs directly from the dictionary."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Parcelling Up Code"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Often we come across situations where we would want to do the same type of calculation several times in a single program. Many of the Python modules provide functions for doing just this (and some of you will probably have used the `math.sqrt()` function earlier). However, you can define your own functions if you want. This can be done anywhere in your program, but is conventionally done at the beginning. In any case, the important thing is that you define a function before you try to use it in your program."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a trivial example, here is a function definition which squares a number: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def square(x):\n",
    "    return x*x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When Python comes across this in your program, it does nothing visible. Only afterwards, when you call the function, does it produce any effect. The `x` between the brackets in the `def` line is called a parameter (you will often see it called an argument too), and acts as a placeholder for whatever you want to square. Once the function is defined, you can call it using anything in place of the `x`. For example, to square the number 3, you would use:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "square(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you wanted to store the result in a variable, you could use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "y = square(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and you could even pass a variable into the function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "z = square(y)\n",
    "z"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Functions are incredibly versatile and a single function may take many arguments. They can contain more than one line of code, and can do anything that you can do in other parts of a Python program. You will see a much more complex example in Worksheet 3. Parcelling up code like this means that you don’t have to type it out every time the task is repeated in your program, and if you need to change it, it will only have to be changed once."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### _Exercise 2.7_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In worksheet one, exercise 1.1, you should have worked out an expression for calculating the hypotenuse of a right-angled triangle given the other two sides. Now, complete the function below, which should calculate the hypotenuse, and test it by calling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`hypot(3,4)`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and it should return the value 5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "--- hypot(---, sideb)---\n",
    "    h = (sidea**--- --- ---**2)**0.5\n",
    "    return ---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Summary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* `for` loops can be used to repeat a block of code for each item in a list.\n",
    "* `range()` can be used to create a sequence of numbers; looping over it executes the loop a given number of times.\n",
    "* `if:` `elif:` `else:` statements can be used to choose one of a number of optional blocks of code depending on the conditions in the `if` and `elif` clauses.\n",
    "* String interpolation allows you to insert values into a string, enabling sophisticated formatting.\n",
    "* Tuples are a new data type which are like immutable lists.\n",
    "* Dictionaries are another object data type which stores key-value pairs.\n",
    "* The `.keys()`, `.values()` and `.items()` methods are used to get the keys, the values, and the key-value pairs of a dictionary.\n",
    "* Functions can contain pieces of code to be used repeatedly, which only need to be debugged and changed once. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### _Debugging Exercise_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The code below contains some typos and errors. Read the description and then follow the code, making sure you understand what each line does/is supposed to do, and correct any mistakes you encounter. Finally run the code and check that the output is the same as that indicated below.\n",
    "\n",
    "__Description:__\n",
    "\n",
    "The function `find_within_range` takes a list of numbers as input and finds all the numbers therein that fall between `lower` (default: 0) and `upper` (default: 10), inclusive. The numbers within the range should be returned as a list. There should be no duplicate entries in the output, i.e. if the same number occurs twice in the input, and if its value is within the specified range, it will only appear once in the output.\n",
    "\n",
    "Fix the bugs in the function definition. Some test examples are given in the subsequent cell, with expected output given in the comments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def find_within_range(list_of_numbers, lower=0, upper=10):\n",
    "    output = {}\n",
    "    for number in list_of_numbers:\n",
    "        if 0 > number <= 10:\n",
    "            if number in output_list:\n",
    "                output.append(number)\n",
    "    return output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print(find_within_range([-2, 14, 9, 3.14]))              # should return [9, 3.14]\n",
    "print(find_within_range([0, 5, 10, 15]))                 # should return [0, 5, 10]\n",
    "print(find_within_range([2.104, 10000, -435, 2.104]))    # should return [2.104]\n",
    "print(find_within_range([1, 2, 3, 4], lower=2, upper=6)) # should return [2, 3, 4]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "___Optional:___ If you would like to really challenge yourself, try changing the (fixed) function to __return only the smallest three numbers seen__ that still fall within the specified range. Some more examples, with comments on expected output, are given below. The order of the numbers in the output list is unimportant."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print(find_within_range([-2, 14, 3, -9, 9, 6, 7]))       # should return [3, 6, 7]\n",
    "print(find_within_range([0, 6, 5, 15, 5, 6]))            # should return [0, 6, 5]\n",
    "print(find_within_range([1.2, 1.4, 7.8, 4.0, 8.3], lower=1, upper=8)) # should return [1.2, 1.4, 4.0] "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}