{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to Python Programming"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Walkthrough: Exercise 4.6 - Plotting with `matplotlib`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "This notebook runs briefly through a solution to the exercise at the end of Worksheet 4 of the _Introduction to Python Programming_ course workbook."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Start by importing the modules that you need: `pprint`, to print the data structure, and `matplotlib.pyplot`, to draw the plots."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [],
   "source": [
    "import pprint\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# the next line is for plotting in the IPython notebook only\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now define the function that will draw each bar chart of observations. (The function definition includes a list of colors for the bars, as tuples of RGB values, which plots each species in a different color when passed to `plt.bar` as the `color` argument; in the cell below that argument is commented out.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "def doBarChart(heights, labels, title, rows, columns, subplot):\n",
    "    plt.subplot(rows, columns, subplot)\n",
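    "    # align='edge' starts each bar at its x position, so the tick offset of 0.4 used below marks the center of a bar of default width 0.8\n",
    "    plt.bar(range(len(labels)), heights, align='edge')  #, color=[(0.9,0.05,0.05),(0.35,1.0,0.35),(0.05,0.05,0.9),(0.9,1.0,0.0),(0.7,0.1,0.7),(0.05,0.9,0.05),(1.0,0.5,0.5),(0.0,0.0,0.5),(0.75,0.75,0.0),(0.0,0.3,0.3),(0.0,0.5,0.0),(0.2,0.8,0.8),(0.6,0.05,0.25)])\n",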
    "    plt.title(title)\n",
    "    plt.xlabel('Taxon')\n",
    "    plt.ylabel('Abundance')\n",
    "    plt.axis([0, len(labels), 0, 25000])\n",
    "    tickPos = []\n",
    "    for pos in range(len(labels)):\n",
    "        tickPos.append(pos+0.4)\n",
    "    plt.xticks(tickPos,labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Initialise an empty dictionary, to store the data for each site, and an empty list, to store the species names as you find them. Then, open a file object and read the data line-by-line, populating the data structures as you go."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "sites = {}\n",
    "taxa = []\n",
    "# make sure that you change the file path below to specify the location of the data file on your system\n",
    "datafile = open('speciesDistribution.txt', 'r')\n",
    "\n",
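    "for line in datafile:\n",
    "    line = line.strip()\n",
    "    if not line:\n",
    "        continue  # skip blank lines\n",
    "    if line.startswith('Site:'):\n",
    "        # a 'Site:' line starts a new site; everything after the first space is its name\n",
    "        tag, siteName = line.split(\" \", 1)\n",
    "        sites[siteName] = {}\n",
    "    else:\n",
    "        # any other line holds a taxon ID and its count for the current site\n",
    "        taxonID, count = line.split()\n",
    "        count = int(count)\n",
    "        sites[siteName][taxonID] = count\n",
    "        if taxonID not in taxa:\n",
    "            taxa.append(taxonID)\n",
    "\n",
    "datafile.close()"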
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now that you have read all of the data from the file, you need to loop over each site again, to add zero values for each species not observed. (Exercise 4.4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "for site in sites:\n",
    "    for taxon in taxa:\n",
    "        if taxon not in sites[site]:\n",
    "            sites[site][taxon] = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now, you should have a dictionary, `sites`, keyed by site name, with values that are themselves dictionaries keyed by species ID. Each of these dictionaries within `sites` should have the same number of entries, because you added the zero counts. To inspect the overall data structure, you can use `pprint.PrettyPrinter`. (Exercise 4.5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "pp = pprint.PrettyPrinter(indent=4)\n",
    "pp.pprint(sites)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Looks good to me! Now, you want to be able to control the order in which the species and counts are extracted from this dictionary for plotting. To do this, you use the dictionary keys and the list of species names that were compiled when the data was read from the file. To make the plot more intuitive, you can make sure that both of these are sorted alphabetically."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "taxa.sort()\n",
    "siteNames = sites.keys()\n",
    "siteNames = sorted(siteNames)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "Now, you are ready to plot the data using the function defined above. (Exercise 4.6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "plt.figure(1, figsize=(10,10)) # you can choose your own figure size or leave\n",
    "                               # this argument out to use the default setting.\n",
    "\n",
    "subnumber = 0\n",
    "\n",
    "for site in siteNames:\n",
    "    subnumber += 1\n",
    "    barValues = []\n",
    "    for taxon in taxa:\n",
    "        barValues.append(sites[site][taxon])\n",
    "    doBarChart(barValues, taxa, site, 4, 2, subnumber)\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}