Remember, before you can use the tidyverse, you need to load the package.
library(tidyverse)
First Steps
More plots with the mpg dataset
(Taken from R4DS)
- Run ggplot(data = mpg). What do you see?
- How many rows are in mpg? How many columns?
- What does the drv variable describe? You may want to use ?mpg to find out
- Make a scatter plot of hwy vs cyl
- What happens if you make a scatter plot of class vs drv? Why is the plot not useful?
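The inspection questions above can be approached with a few base R helpers. The following is a minimal sketch, assuming ggplot2 (which ships the mpg data) is installed; the names n_rows and n_cols are illustrative and not part of the original exercises.

```r
library(ggplot2)  # provides the mpg dataset

# Number of rows and columns, stored under illustrative names
n_rows <- nrow(mpg)
n_cols <- ncol(mpg)

# Both dimensions at once
dim(mpg)

# Column names and types at a glance
str(mpg)

# The help page describes every variable (interactive sessions only)
# ?mpg
```

The same helpers work on any data frame, so they can be reused for the iris, mtcars, and diamonds questions later on.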
Plots with other datasets
- Take a look at the iris dataset. What are its dimensions? What do its columns represent?
- What is the range of each numeric column? The summary() function might help you here
- Make a plot of sepal width vs sepal length, and set all of the points to be green
- Repeat the previous plot but colour each point by the species of the flower
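The last two questions contrast setting an aesthetic to a fixed value with mapping it to a variable. As a hedged illustration on a different dataset (mpg rather than iris, so the exercise itself is left open), the sketch below shows both forms; the object names p_set and p_map are hypothetical.

```r
library(ggplot2)

# Setting: every point gets the same fixed colour.
# The argument sits outside aes().
p_set <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy), colour = "purple")

# Mapping: the colour varies with a column of the data.
# The argument sits inside aes().
p_map <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy, colour = drv))

print(p_set)
print(p_map)
```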
Remaking plots
- Take a look at the first and last few rows of the mtcars dataset
- Access the cyl column of the dataset. Is this variable categorical, discrete, or continuous?
- What steps would you take to remake the following plot?

More Aesthetics
Size, Transparency, and Shape
- Using the mpg dataset, make a plot of city mileage vs highway mileage where the size of each point is determined by engine size (displ)
- Plot sepal length vs sepal width using the iris dataset and control the transparency (alpha) of each point using the species variable.
- Have a play with the Orange data set (note the capital ‘O’). Make a scatter plot of circumference against age where the shape of each point is determined by which tree the observation belongs to
- Remake the standard hwy vs displ plot using the mpg data set but make all of the points hollow diamonds. How about solid triangles?
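Point shapes in ggplot2 can be given as integer codes. Rather than listing which code is which, the sketch below draws a reference chart of codes 0 to 25 so that you can look up the ones you need. The data frame shapes and the plot name p_shapes are made up for this illustration.

```r
library(ggplot2)

# One row per shape code (0 to 25), laid out on a grid six columns wide
shapes <- data.frame(
  code = 0:25,
  col  = 0:25 %% 6,
  row  = -(0:25 %/% 6)
)

p_shapes <- ggplot(shapes, aes(x = col, y = row)) +
  geom_point(aes(shape = code), size = 5, fill = "grey70") +
  geom_text(aes(label = code), nudge_y = -0.35) +
  scale_shape_identity() +  # use the numbers in `code` directly as shapes
  theme_void()

print(p_shapes)
```

A code read off the chart can then be set outside aes(), for example geom_point(shape = 1).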
Choosing Appropriate Aesthetics
(Q1/2 from R4DS)
- Which variables in mpg are categorical? Which are continuous/discrete? (The data set help file may be of use)
- Map a continuous variable to colour, size, and shape. How does this differ from when you map a categorical variable?
- Plot the standard hwy vs displ graph using mpg and map the variable class to the size aesthetic. Was this a good idea?
- Have a discussion with a partner or think for yourself: Which of the aesthetics you know are the clearest for displaying categorical data and which are best for continuous?
- In your own opinion, order the following aesthetics by how clear they are in representing a continuous variable: size, colour, transparency
Common Problems
- Using the mpg data set, make a plot of city mileage against engine size. Map the variable class to the aesthetic shape. Is everything as you would expect?
- Type the following code into the console. Why do you receive an error message?
ggplot(iris) +
geom_point(x = Sepal.Length, y = Petal.Length)
- Take a look at the airquality dataset. Type the following code into the console. Is the plot as you expected?
ggplot(airquality) +
geom_point(aes(x = Wind, y = Temp, col = Month))
- Why are the points in this plot not blue?
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy, colour = 'blue'))

- What happens when you map a variable to multiple aesthetics (say colour and size)? (It’s okay to answer, “nothing”, to this question but make sure you verify that first!)
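When a plot does not look the way you expected, checking how R has stored each mapped column is often a useful first diagnostic. The sketch below is one possible way to do that for airquality, using base R only; the name n_distinct_values is hypothetical.

```r
# Storage class of every column in airquality (a base R dataset)
sapply(airquality, class)

# Number of distinct values per column. A numeric column with only a
# handful of distinct values may really be playing a categorical role.
n_distinct_values <- sapply(
  airquality,
  function(column) length(unique(column))
)
n_distinct_values
```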
Faceting
Basic Faceting
(Taken from R4DS)
- What happens when you facet on a continuous variable?
- What do the empty cells in a plot with facet_grid(drv ~ cyl) mean? How do they relate to this plot?
ggplot(mpg) +
geom_point(aes(x = drv, y = cyl))
- What plots does the following code make? What does the . do?
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy)) +
facet_grid(drv ~ .)
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy)) +
facet_grid(. ~ cyl)
- Take the first faceted plot from the presentation:
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy)) +
facet_wrap(~ class, nrow = 2)
What are the advantages of faceting instead of the colour aesthetic? What are the disadvantages? How might the balance change if you had a larger data set?
- Read ?facet_wrap. What does nrow do? What does ncol do? What other options control the layout of the individual panels? Why doesn’t facet_grid() have nrow and ncol parameters?
- When using facet_grid() you should usually put the variable with more unique levels in the columns. Why?
Combining Facets with Aesthetics
- Create a scatter plot of petal length vs petal width using the iris dataset and facet by species
- Repeat the above plot whilst also colouring the points by species. Don’t forget to hide the colour legend
- Using the mpg dataset, plot hwy vs cty, map displ to the size aesthetic, map class to point colour, and facet columns by cyl and rows by drv. This plot is ridiculous but it does demonstrate the flexibility of ggplot2
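Hiding a legend comes up whenever a facet already conveys the same information as the colour. There are several ways to do it; the sketch below shows one of them on the mpg data (not the iris plot asked for above), and the name p_no_legend is hypothetical.

```r
library(ggplot2)

# show.legend = FALSE stops this layer from contributing to any legend
p_no_legend <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy, colour = drv), show.legend = FALSE) +
  facet_wrap(~ drv)

print(p_no_legend)

# An alternative that removes all legends from the plot:
# p + theme(legend.position = "none")
```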
Going Beyond
Labelling
- Run the following code. What does the extra labs(...) layer do?
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy, colour = class)) +
labs(x = "Engine Displacement (litres)", y = "Highway Mileage (miles/gallon)",
colour = "Car Type",
title = "A scatter plot of engine displacement vs highway mileage",
subtitle = "Coloured by car type",
caption = "Source: EPA (http://fueleconomy.gov)")
- Use this to take the plot from the ‘Remaking plots’ section and beautify it
- Pick any plot of your choosing and give it appropriate axis labels, a title, and - if possible - a data source
Diamonds and Overplotting
- Have a look at the diamonds dataset
- Make a scatter plot of price against carat (this may take a long time to run). Is this plot easy to read?
- How could you fix this problem? (perhaps you could manually set a certain aesthetic)
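Overplotting is easiest to experiment with on data where you know what to expect. The sketch below uses synthetic data (not the diamonds dataset) and shows one common remedy, a fixed transparency; the names dense and p_dense and the value 0.1 are arbitrary choices for illustration.

```r
library(ggplot2)

set.seed(1)  # reproducible synthetic data
dense <- data.frame(x = rnorm(20000), y = rnorm(20000))

# alpha is set outside aes(), so every point is mostly transparent and
# darker regions reveal where many points overlap
p_dense <- ggplot(dense) +
  geom_point(aes(x = x, y = y), alpha = 0.1)

print(p_dense)
```

Other remedies exist, such as smaller points or a different geom, so treat this as one option rather than the intended answer.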
Explanatory and Response variables
- How do you decide which variable to map to the x-axis and which to map to the y-axis?
- If you are unsure, web-search for the phrase “explanatory and response variables”
Positional Arguments
Begin with the following code
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy, colour = factor(class)))

- Try removing x = and y = from your geom_point call. Does everything still work?
- Try removing colour = from your geom_point call. Does everything still work?
- Take the original plot and specify the aesthetics in a different order, say y then colour then x. Does everything still work?
