Plotting Scatter Graphs and Understanding Correlation

To introduce plotting scatter graphs and understanding correlation, I ask students to think about pairs of variables and to describe how each pair might be related.

Here’s my starter activity which students discuss in pairs then present to me on mini-whiteboards.

When the students have had time to discuss the matching pairs we talk about how each graph is likely to look if we were to plot eight typical samples.

As we begin the first example we discuss the type of relationship we expect to see when time spent reading is plotted against time spent watching TV for a sample of ten people. The consensus is that the more time people spend reading, the less time they are likely to spend watching TV. I ask the class to sketch on their mini-whiteboards what the scatter graph might look like if our hypothesis is correct.

For the first couple of examples I provide the scaled axes for the class.  In later examples on plotting scatter graphs and understanding correlation I expect students to choose and draw their own axes on A4 graph paper with appropriate scaling.

Types of Correlation

When we have plotted the points, I introduce the term correlation as a means to describe the relationship between two variables.  There are two types of correlation.

  • A positive correlation means that as one variable increases the other tends to increase, and as one decreases the other tends to decrease.
  • A negative correlation means that as one variable increases the other tends to decrease.

If two variables are not related the points will be scattered so no correlation is apparent.
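
For readers who want to check a correlation numerically rather than by eye, here is a minimal Python sketch. It uses Pearson's correlation coefficient, which is not part of the lesson itself, and the function names, the 0.3 cut-off and the data values are all my own assumptions chosen for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_correlation(r, threshold=0.3):
    """Label r as positive, negative or none (the cut-off is an arbitrary choice)."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "none"


# Made-up values for illustration only: hours per week for ten people.
reading_hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
tv_hours = [14, 13, 13, 11, 10, 8, 9, 6, 5, 4]

r = pearson_r(reading_hours, tv_hours)
print(describe_correlation(r))
```

A value of r close to 1 or to -1 means the points lie close to a straight line, while a value near 0 means no linear correlation is apparent.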

Line of Best Fit

A line of best fit can be used to clearly illustrate the directional trend of the data.  The closer the points are to the line of best fit the stronger the correlation.  We discuss the strength of the correlation as an indication of how closely two variables are related.

The line of best fit also helps to predict the value of one variable when the other is known. Several examiners' reports by AQA and Edexcel note that students are more likely to estimate the value of a missing data point correctly, and to identify anomalous data points, if they use a line of best fit.
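
In class the line of best fit is drawn by eye, but a calculated version can be useful for checking an estimate. The sketch below uses the least-squares method, which is a swapped-in technique rather than what students do on paper. The function names and the data values are assumptions made up for illustration.

```python
def line_of_best_fit(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = gradient * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    # The least-squares line always passes through the mean point (mean_x, mean_y).
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def predict(x, gradient, intercept):
    """Estimate y for a given x using the fitted line."""
    return gradient * x + intercept


# Made-up values for illustration only: hours per week for ten people.
reading_hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
tv_hours = [14, 13, 13, 11, 10, 8, 9, 6, 5, 4]

gradient, intercept = line_of_best_fit(reading_hours, tv_hours)
estimate = predict(5.5, gradient, intercept)
print(f"y = {gradient:.2f}x + {intercept:.2f}; estimate at x = 5.5 is {estimate:.1f}")
```

With this made-up data the gradient is negative and the intercept is not zero, so the fitted line does not go through the origin. It does go through the mean point, which is a handy guide when drawing a line by eye.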

In my experience, there are three main misconceptions when drawing lines of best fit.

  • The line of best fit connects to the origin.
  • The line of best fit is drawn as a line segment connecting the extreme value to the origin.
  • The line of best fit passes through each of the points.

Creating Scatter Graphs from Primary Data

As an extended plenary I challenge the students to create a scatter graph based on their own hand and foot size.  Before we collect any data, I ask the students to write down their own hypothesis at the top of the A4 graph paper.

This is a fun activity which requires the group to work together so everyone has everybody else’s data.  Whenever I do this it amazes me which student steps up to take charge of organising everyone.  I do my best to keep out of the way and let the students manage the data collection so it is fair and accurate.

Correlation Versus Causation

In the next lesson we go on to discuss the limitations of using scatter graphs and correlation to identify causation. We consider examples such as the number of ice creams sold and the number of drownings. The two would correlate over the summer months because higher temperatures cause more people both to go swimming and to eat ice cream. However, eating ice cream does not cause people to drown.
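
The ice cream example can be sketched in Python. All the numbers below are made up and deliberately constructed so that temperature is the only link between the two variables; none of it is real data. The partial correlation used at the end is one common way to remove the effect of a third variable, and is my own addition rather than part of the lesson.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values, constructed for illustration only.
temperature = [12, 15, 18, 21, 24, 27, 30, 33]          # degrees Celsius
ice_creams = [125, 145, 175, 215, 245, 265, 295, 335]   # sales
drownings = [5, 4, 7, 6, 7, 10, 9, 12]                  # incidents

r_ice_drown = pearson_r(ice_creams, drownings)
r_ice_temp = pearson_r(ice_creams, temperature)
r_drown_temp = pearson_r(drownings, temperature)

# Once temperature is accounted for, the apparent link disappears in this data.
r_after_temp = partial_r(r_ice_drown, r_ice_temp, r_drown_temp)

print(f"ice creams vs drownings: {r_ice_drown:.2f}")
print(f"after allowing for temperature: {r_after_temp:.2f}")
```

The two variables show a strong positive correlation on their own, yet in this constructed data the correlation vanishes once temperature is taken into account. Real data would rarely be this tidy.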
