Commit 237b9ca2 authored by Bernd Klaus

finished the plotting section

parent a7508ab5
......@@ -194,7 +194,8 @@ bodyfat
head(map_dbl(bodyfat, mean))
## ----function_template, eval=FALSE---------------------------------------
## function_name <- function(arguments, options)
## function_name <- function(argument_1, argument_2,
## optional_argument = default_value)
## {
## return(...)
## }
......@@ -221,13 +222,37 @@ bodyfat <- mutate(bodyfat, height_m = height * inch_to_m,
keep(pat, is_double)
map_dbl(discard(pat, is_character), mean, na.rm = TRUE)
## ----curr-conv, echo = TRUE, eval = TRUE-------------------------------
euro.calc<-function(x, currency="US") {
## currency has a default argument "US"
if(currency=="US") return(x*1.33)
if(currency=="Pounds") return(x*0.85)
}
euro.calc(100) ## how many dollars are 100 Euros?
## ----qplot, eval=FALSE---------------------------------------------------
## qplot(x, y = NULL, ..., data, facets = NULL,
## NA), ylim = c(NA, NA), log = "", main = NULL,
## xlab = deparse(substitute(x)), ylab = deparse(substitute(y)))
## }
## ----qplot_example-------------------------------------------------------
bodyfat <- mutate(bodyfat, weight_binned = cut(weight_kg, 5))
qplot(abdomen.circum, percent.fat,
color = weight_binned, data = bodyfat)
## ----qplot_example_facets------------------------------------------------
qplot(abdomen.circum, percent.fat,
color = weight_binned, data = bodyfat, facets = ~weight_binned)
## ----embl_logo, results='hide'-------------------------------------------
load("hex_grid.Rdata")
embl_colors <- c("#E2001A", "#6FAA46")
qplot(x, y, data = hex_grid, color = lab, asp = 1) +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text = element_blank(),
line = element_blank(),
title = element_blank())
## ----if-example, echo = TRUE, eval = TRUE------------------------------
w= 3
......@@ -290,25 +315,34 @@ pat
pat <- mutate(pat, BMI = Weight / Height^2)
## ----sol-plot, echo = TRUE, eval = FALSE-------------------------------
## x <- seq(from=-2, to=2, by=0.2)
## length(x)
## x
## stand.normal <- dnorm(x, mean=0, sd=1)
## # or: stand.normal<-dnorm(x)
## length(stand.normal)
## stand.normal
##
## #visualize it
## #
## plot(x,stand.normal, type="l")
##
## plot(x,stand.normal, type="b")
## plot(x,stand.normal, type="h", col = "darkgreen")
## plot(x,stand.normal, type="h", col = "darkgreen", main = "Standard Normal Density")
##
## # use qplot
## qplot(x, stand.normal, color = I("darkgreen"))
## ----embl_logo_ex, results='hide'----------------------------------------
load("hex_grid.Rdata")
embl_colors <- c("#E2001A", "#6FAA46")
qplot(x, y, data = hex_grid, color = lab, asp = 1) +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text = element_blank(),
line = element_blank(),
title = element_blank())
## ----sol_embl_logo, results='hide'---------------------------------------
load("hex_grid.Rdata")
embl_colors <- c("#E2001A", "#6FAA46")
qplot(x, y, data = hex_grid, color = lab, asp = 1) +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text = element_blank(),
line = element_blank(),
title = element_blank()) +
scale_colour_manual(values = embl_colors )
## ----seesionInfo, results='markup'---------------------------------------
sessionInfo()
......
......@@ -679,13 +679,17 @@ You can create your own functions very easily by adhering to the following
template:
```{r function_template, eval=FALSE}
function_name <- function(arguments, options)
function_name <- function(argument_1, argument_2,
optional_argument = default_value)
{
return(...)
}
```
As you can see, the source code of the function has to be in curly brackets
As you can see, the source code of the function has to be in curly brackets, while
the arguments are defined in the parentheses. Arguments without a default value
are mandatory, and default values are specified with an equals sign.
By default R returns the result of the last computation performed within the
curly brackets (often, this will be the last line of the function). However,
you can always specify the return value directly with `return()`. If
......@@ -757,32 +761,29 @@ __Exercise: Handling a small data set__
# Simple plotting in R: `qplot` and `r CRANpkg("ggplot2")`
The package `r CRANpkg("ggplot2")` allows very flexible plotting in R,
but takes a while to get acquainted with the underlying grammar
of graphics. Thus, we will use its function `qplot()` for "quick plotting",
which requires no knowledge of the underlying advanced features and behaves
much like R's default `plot` function.
However, it also offers advanced options like faceting or coloring by
condition.
The package `r CRANpkg("ggplot2")`
There's a quick plotting function in \CRANpkg{ggplot2} called `qplot()`
`qplot` and its most important options
`qplot` can be used much like plot, but has some additional features
that are very useful, like facetting or coloring by condition. It represents
an easy way to get started with \CRANpkg{ggplot2}.
\begin{center}
\Rfunction{
```{r qplot, eval=FALSE}
qplot(x, y = NULL, ..., data, facets = NULL,
NA), ylim = c(NA, NA), log = "", main = NULL,
xlab = deparse(substitute(x)), ylab = deparse(substitute(y)))
}
\end{center}
```
The arguments are:
* `x:` x--axis data
* `y:` y--axis data (may be missing)
* `data:` `data.frame` containing the variables used in the plot
* `facets= ` split the plot into facets, use a formula like
. \textasciitilde split to do wrapped splitting and row \textasciitilde columns to split by rows and columns
. ~split to do wrapped splitting and row ~ columns to split by rows and columns
* `main:` plot heading
* `color, fill` set to factor/string in the data set in order to
color the plot depending on that factor. Use `I("colorname")` to use a
......@@ -792,82 +793,54 @@ include point, line, boxplot, histogram etc.
* `xlab, ylab, xlim, ylim` set the x--/y--axis parameters
As an example, we create a plot of `percent.fat` against abdomen circumference and
color it by weight. For this we bin the weight vector into 5 discrete categories
using the `cut` function.
```{r qplot_example}
bodyfat <- mutate(bodyfat, weight_binned = cut(weight_kg, 5))
### Exercise: Plotting the normal density
The density of the normal distribution with expected value $\mu$
and variance $\sigma^2$ is given by:
\[
f(x)
= \frac{1}{\sigma \sqrt{2\pi}} \exp \left(- \frac{1}{2} \left(\frac{x- \mu}{\sigma}\right)^2 \right)
\]
In R it is implemented in the function
`dnorm`.
* Call the R help to find out how to calculate density values.
* Determine the values of the standard normal distribution density ($\mu=0$ and $\sigma^2=1$)
at $ -2, -1.8, -1.6, \ldots, +2 $ and save it in a vector `stand.normal`.
* Plot the results obtained in the last exercise, play a bit with the plotting options!
* Use `qplot` to produce the same plot and change the
color of the density line using `color=I("darkgreen")`.
## Calling functions and programming
## Calling functions
Every R function follows the pattern below:
\begin{center}
`{function.name(arguments, optional arguments) }`
\end{center}
* `{arguments}`: Some function arguments are necessary to run the function
* `{optional arguments}`: Some function arguments can be changed,
otherwise the default values are used. They are indicated by an equals sign.
* `{?function.name}`: Getting help
* `{function.name}`: Source code
qplot(abdomen.circum, percent.fat,
color = weight_binned, data = bodyfat)
```
As an example, look at the mean command:
We can (unsurprisingly) see that abdomen circumference, weight and body fat are highly
correlated with one another. We can also produce a faceted plot split by weight.
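The binning and faceting pattern described here can be tried without the course data. The sketch below is only an illustration: it assumes `ggplot2` is installed and swaps in the built-in `mtcars` data set, so the column names (`hp`, `mpg`, `wt`) and the helper column `wt_binned` are not part of the tutorial. Recent `ggplot2` releases mark `qplot()` as deprecated, so a warning may appear.

```r
library(ggplot2)

# mtcars ships with R; bin the car weight (wt) into 3 equal-width intervals
cars <- transform(mtcars, wt_binned = cut(wt, 3))
levels(cars$wt_binned)   # the interval labels created by cut()

# Scatter plot colored by the binned weight, with one panel per weight bin
p <- qplot(hp, mpg, data = cars, color = wt_binned, facets = ~wt_binned)
p
```

`cut()` returns a factor, which is why it can be used both for coloring and for splitting the plot into facets.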
\begin{center}
`mean(x, trim = 0, na.rm = FALSE)`
\end{center}
```{r qplot_example_facets}
qplot(abdomen.circum, percent.fat,
color = weight_binned, data = bodyfat, facets = ~weight_binned)
* `{x}`: Data
* `{trim = 0}`: Trimmed mean (mean of the data without $x\%$
of extreme values)
* `{na.rm = FALSE}`: Remove missing values?
```
Here, `x` (usually a vector)
has to be given in order to run the function,
while the other arguments such as `trim` are optional,
i.e. if you do not change
them, their default values are used.
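The effect of the optional arguments of `mean` can be seen on a small vector. This is a minimal sketch with made-up numbers; the vector `x` and the result names are chosen for illustration only.

```r
# A toy vector with one extreme value and one missing value
x <- c(1, 2, 3, 4, 100, NA)

# Defaults (trim = 0, na.rm = FALSE): the NA propagates to the result
m_default <- mean(x)

# Drop the missing value before averaging: (1 + 2 + 3 + 4 + 100) / 5
m_narm <- mean(x, na.rm = TRUE)

# Additionally discard 20% of the observations at each end before averaging,
# which here removes 1 and 100 and leaves the mean of 2, 3, 4
m_trim <- mean(x, trim = 0.2, na.rm = TRUE)

m_default   # NA
m_narm      # 22
m_trim      # 3
```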
__Exercise: Plotting the EMBL logo__
The code below plots the EMBL logo. Note that the plus sign adds additional
layers to the ggplot object. This allows you to modify any given plot.
However, the colors are not quite right. Can you fix that?
Check out the [ggplot2 docs](http://docs.ggplot2.org/) or try googling!
```{r embl_logo, results='hide'}
load("hex_grid.Rdata")
embl_colors <- c("#E2001A", "#6FAA46")
qplot(x, y, data = hex_grid, color = lab, asp = 1) +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text = element_blank(),
line = element_blank(),
title = element_blank())
As an example, we look at the following currency converter function:
```
```{r curr-conv, echo = TRUE, eval = TRUE}
euro.calc<-function(x, currency="US") {
## currency has a default argument "US"
if(currency=="US") return(x*1.33)
if(currency=="Pounds") return(x*0.85)
}
euro.calc(100) ## how many dollars are 100 Euros?
```
Here `x` is a formal argument without a default value, so it must be supplied
to execute the function.
`currency` is an optional argument, set to "US" by default.
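A short usage sketch of the converter: the definition is repeated so that the block runs on its own, and the rates 1.33 and 0.85 are simply the illustrative values used in the text. The currency "Yen" is a hypothetical input, added only to show what happens for a value the function does not handle.

```r
euro.calc <- function(x, currency = "US") {
  if (currency == "US") return(x * 1.33)
  if (currency == "Pounds") return(x * 0.85)
}

# Only the mandatory argument is given, so currency falls back to "US"
dollars <- euro.calc(100)

# Override the default, either by name or by position
pounds <- euro.calc(100, currency = "Pounds")
pounds_by_position <- euro.calc(100, "Pounds")

# Neither condition matches, so the function silently returns NULL
unknown <- euro.calc(100, currency = "Yen")
is.null(unknown)
```

Because an unhandled currency yields `NULL` without any message, a real converter would probably end with a `stop()` call that reports the unsupported value.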
## Programming statements
## Flow control
R offers the typical options for flow--control known from many other
......@@ -1014,51 +987,47 @@ pat <- mutate(pat, BMI = Weight / Height^2)
```
__Exercise: Plotting the EMBL logo__
### Exercise: Plotting the normal density
The code below plots the EMBL logo. Note that the plus sign adds additional
layers to the ggplot object. This allows you to modify any given plot.
However, the colors are not quite right. Can you fix that?
Check out the [ggplot2 docs](http://docs.ggplot2.org/) or try googling!
The density of the normal distribution with expected value $\mu$
and variance $\sigma^2$ is given by:
\[
f(x)
= \frac{1}{\sigma \sqrt{2\pi}} \exp \left(- \frac{1}{2} \left(\frac{x- \mu}{\sigma}\right)^2 \right)
\]
In R it is implemented in the function
`dnorm`.
* Call the R help to find out how to calculate density values.
* Determine the values of the standard normal distribution density ($\mu=0$ and $\sigma^2=1$)
at $ -2, -1.8, -1.6, \ldots, +2 $ and save it in a vector `stand.normal`.
* Plot the results obtained in the last exercise, play a bit with the plotting options!
* Use `qplot` to produce the same plot and change the
color of the density line using `color=I("darkgreen")`.
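As a sanity check on the density formula, it can be written out by hand, with the normalising constant $1 / (\sigma \sqrt{2\pi})$, and compared with `dnorm`. The sketch below is illustrative only: the helper `normal_density` and the chosen values of `mu` and `sigma` are assumptions, not part of the exercise.

```r
# Hand-written normal density; sigma is the standard deviation (not the variance)
normal_density <- function(x, mu = 0, sigma = 1) {
  1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2)
}

x <- c(-1, 0, 0.5, 2)

by_hand  <- normal_density(x, mu = 1, sigma = 2)
built_in <- dnorm(x, mean = 1, sd = 2)

# TRUE if both agree up to numerical tolerance
all.equal(by_hand, built_in)
```

Note that `dnorm` is parameterised by the standard deviation `sd`, so a variance of $\sigma^2 = 4$ corresponds to `sd = 2`.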
### Solution: Plotting the normal density
```{r sol-plot, echo = TRUE, eval = FALSE}
x <- seq(from=-2, to=2, by=0.2)
length(x)
x
stand.normal <- dnorm(x, mean=0, sd=1)
# or: stand.normal<-dnorm(x)
length(stand.normal)
stand.normal
```{r embl_logo_ex, results='hide'}
#visualize it
#
plot(x,stand.normal, type="l")
load("hex_grid.Rdata")
embl_colors <- c("#E2001A", "#6FAA46")
qplot(x, y, data = hex_grid, color = lab, asp = 1) +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text = element_blank(),
line = element_blank(),
title = element_blank())
plot(x,stand.normal, type="b")
plot(x,stand.normal, type="h", col = "darkgreen")
plot(x,stand.normal, type="h", col = "darkgreen", main = "Standard Normal Density")
# use qplot
qplot(x, stand.normal, color = I("darkgreen"))
```
__Solution: Plotting the EMBL logo__
```{r sol_embl_logo, results='hide'}
load("hex_grid.Rdata")
embl_colors <- c("#E2001A", "#6FAA46")
qplot(x, y, data = hex_grid, color = lab, asp = 1) +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text = element_blank(),
line = element_blank(),
title = element_blank()) +
scale_colour_manual(values = embl_colors )
```
```{r seesionInfo, results='markup'}
......