
Bivariate Data

Introduction to Bivariate Data


Learning Objectives

Review: statistics concepts

When dealing with data we will come across both quantitative data and qualitative data.


  • Quantitative data is also known as numerical data. It refers to a quantity measurable in certain units.

  • Qualitative data is also known as categorical data. It relates to a specific quality or characteristic.

What is bivariate data?

Bivariate data is data that records two variables for each individual or observation. Analysing bivariate data allows us to determine whether a relationship, or association, exists between the two variables.


When dealing with bivariate data, we need to consider two different variables in each problem:


  1. The response variable

  2. The explanatory variable

What is the difference between the explanatory variable and the response variable?

The response variable measures the outcome of a study.


On the other hand, the explanatory variable explains or influences changes in the response variable, although an association between the two does not by itself show that one causes the other.


For example, suppose you are a student about to sit your WACE exams. 


  • Your response variable could be the grades you receive after your exams.

  • Your explanatory variable could be the amount of study you do for your exams.

Bivariate data and categorical data

When displaying bivariate data in a graphical format, the explanatory variable is plotted on the horizontal (x) axis and the response variable is plotted on the vertical (y) axis.

Two-way frequency tables

Two-way tables help us display and analyse the relationship between two different sets of categorical data.


  • The categories of the explanatory variable run across the top as column headings.

  • The categories of the response variable run down the first column as row labels.


Consider the above example.


  • The table above shows the exam results of 800 students, all of whom completed the same exam.

  • 456 students studied while 344 did not. 

  • 516 students passed while 284 did not.
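The totals above, together with the figure of 363 students who both studied and passed (quoted in the row percentages discussion of this tutorial), are enough to fill in every cell of the two-way table. The sketch below does that arithmetic in Python. The variable names and the dictionary layout are illustrative assumptions, and the three derived cells are calculated from the stated totals rather than read from the original table.

```python
# Totals stated in the text.
total_students = 800
studied_total = 456
not_studied_total = 344
passed_total = 516
not_passed_total = 284

# The one cell stated in the text: students who studied and passed.
studied_passed = 363

# The remaining cells follow from the row and column totals (derived, not quoted).
studied_not_passed = studied_total - studied_passed
not_studied_passed = passed_total - studied_passed
not_studied_not_passed = not_studied_total - not_studied_passed

# Rows hold the response variable (exam result);
# columns hold the explanatory variable (whether the student studied).
table = {
    "Passed": {"Studied": studied_passed, "Did not study": not_studied_passed},
    "Did not pass": {"Studied": studied_not_passed, "Did not study": not_studied_not_passed},
}

# Print each row with its total, then the grand total.
for result, row in table.items():
    print(result, row, "row total:", sum(row.values()))
print("grand total:", sum(sum(row.values()) for row in table.values()))
```

The row totals come out as 516 and 284 and the grand total as 800, matching the figures quoted above.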

Row percentages tables

Although it is useful to know that 363 of the students who studied passed, it would be far more useful to know what percentage of each group (passed and did not pass) studied.


We need to calculate row percentages to do this.


To calculate them, we divide the value in each cell by the total at the end of its row and then multiply by 100.


This is shown in the above table.
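As a rough illustration of the row percentage calculation, the Python sketch below applies it to the exam example. Only the count of 363 and the row and column totals appear in the text; the other three cell counts are derived from those totals, and the function name `row_percentages` is a hypothetical helper chosen for this example.

```python
# Counts: rows are the response variable (exam result),
# columns are the explanatory variable (whether the student studied).
# 363 is quoted in the text; 153, 93 and 191 are derived from the stated totals.
table = {
    "Passed": {"Studied": 363, "Did not study": 153},
    "Did not pass": {"Studied": 93, "Did not study": 191},
}

def row_percentages(counts):
    """Return each cell as a percentage of its row total."""
    result = {}
    for row_label, row in counts.items():
        row_total = sum(row.values())
        result[row_label] = {
            col_label: value / row_total * 100
            for col_label, value in row.items()
        }
    return result

row_pct = row_percentages(table)
for row_label, row in row_pct.items():
    cells = ", ".join(f"{col}: {pct:.1f}%" for col, pct in row.items())
    print(f"{row_label}: {cells}")
```

Each row adds up to 100%. On these figures, about 70% of the students who passed had studied, compared with about 33% of those who did not pass.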

Column percentages tables

We can also calculate the percentage of those who studied (or did not study) who passed (or did not pass).


To find these percentages, we divide the value in each cell by its column total and then multiply by 100.


This is shown in the above table.
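The column percentage calculation can be sketched in Python in the same way. As before, only the count of 363 and the totals are quoted in the text; the other cell counts are derived from those totals, and `column_percentages` is a hypothetical helper name used for this example.

```python
# Counts: rows are the response variable (exam result),
# columns are the explanatory variable (whether the student studied).
# 363 is quoted in the text; 153, 93 and 191 are derived from the stated totals.
table = {
    "Passed": {"Studied": 363, "Did not study": 153},
    "Did not pass": {"Studied": 93, "Did not study": 191},
}

def column_percentages(counts):
    """Return each cell as a percentage of its column total."""
    # First add up each column across all rows.
    col_totals = {}
    for row in counts.values():
        for col_label, value in row.items():
            col_totals[col_label] = col_totals.get(col_label, 0) + value
    # Then divide every cell by the total of the column it sits in.
    return {
        row_label: {
            col_label: value / col_totals[col_label] * 100
            for col_label, value in row.items()
        }
        for row_label, row in counts.items()
    }

col_pct = column_percentages(table)
for row_label, row in col_pct.items():
    cells = ", ".join(f"{col}: {pct:.1f}%" for col, pct in row.items())
    print(f"{row_label}: {cells}")
```

Here each column adds up to 100%. On these figures, about 80% of the students who studied passed, compared with about 44% of those who did not study.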
