Bivariate Data
Introduction to Bivariate Data


Review: statistics concepts

When dealing with data we will come across both quantitative data and qualitative data.
Quantitative data is also known as numerical data. It refers to a quantity measurable in certain units.
Qualitative data is also known as categorical data. It relates to a specific quality or characteristic.

What is bivariate data?
Bivariate data is data that involves two variables. Analysis of bivariate data allows us to determine whether a relationship or association exists between the two variables.
When dealing with bivariate data, we need to consider two different variables in each problem:
The response variable
The explanatory variable.
What is the difference between the explanatory variable and the response variable?
The response variable measures the outcome of a study.
On the other hand, the explanatory variable explains or influences (but does not necessarily cause) changes in the response variable.
For example, suppose you are a student about to sit your WACE exams.
Your response variable could be the grades you receive after your exams.
Your explanatory variable could be the amount of study you do for your exams.

Bivariate data and categorical data

When displaying bivariate data in a graphical format, the explanatory variable is plotted on the horizontal (x) axis and the response variable is plotted on the vertical (y) axis.
Two-way frequency tables

Two-way tables help us display and analyse the relationship between two categorical variables.
The categories of the explanatory variable run across the top row.
The categories of the response variable run down the first column.
Consider the above example.
The table above shows the exam results of 800 students, all of whom completed the same exam.
456 students studied while 344 did not.
516 students passed while 284 did not.
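The individual cell counts of the two-way table can be recovered from the totals quoted above. The following is a minimal Python sketch, assuming that the figure of 363 given in the row percentages section counts the students who both studied and passed. The variable names are illustrative, and the three remaining cell counts are derived by subtraction rather than copied from the original table.

```python
# Totals stated in the text
total_students = 800
studied_total = 456
not_studied_total = 344
passed_total = 516
not_passed_total = 284

# The one cell count stated in the text (assumed: studied AND passed)
studied_passed = 363

# The remaining cells follow by subtraction from the totals
studied_not_passed = studied_total - studied_passed
not_studied_passed = passed_total - studied_passed
not_studied_not_passed = not_studied_total - not_studied_passed

# Two-way table: rows are the response variable (exam result),
# columns are the explanatory variable (whether the student studied)
table = {
    "Passed": {"Studied": studied_passed, "Did not study": not_studied_passed},
    "Did not pass": {"Studied": studied_not_passed, "Did not study": not_studied_not_passed},
}

for result, row in table.items():
    print(result, row, "row total:", sum(row.values()))
```

Under these assumptions the derived cells are 93 (studied, did not pass), 153 (did not study, passed) and 191 (did not study, did not pass), and the rows sum to 516 and 284 as stated.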
Row percentages tables

Knowing that 363 of the students who studied passed is useful, but a raw count is hard to compare across groups of different sizes. It is more useful to know what percentage of each result group (passed and did not pass) studied.
To find this, we calculate row percentages.
We divide the value in each cell by the total at the end of its row and then multiply by 100.
This is shown in the above table.
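The row percentage calculation can be sketched in Python as follows. This is an illustration only: the function name `row_percentages` is hypothetical, and the cell counts are a reconstruction derived from the totals quoted in the text (456, 344, 516, 284) and the 363 students who studied and passed, not values copied from the original table.

```python
def row_percentages(table):
    """Return each cell as a percentage of its row total, rounded to 1 d.p."""
    result = {}
    for row_label, row in table.items():
        row_total = sum(row.values())
        result[row_label] = {
            col: round(100 * count / row_total, 1) for col, count in row.items()
        }
    return result


# Rows: response variable (exam result). Columns: explanatory variable (study).
# Cell counts are assumed, derived by subtraction from the stated totals.
table = {
    "Passed": {"Studied": 363, "Did not study": 153},
    "Did not pass": {"Studied": 93, "Did not study": 191},
}

row_pct = row_percentages(table)
for result, row in row_pct.items():
    print(result, row)
```

With these counts, about 70.3% of the students who passed had studied (363 out of 516), compared with about 32.7% of the students who did not pass (93 out of 284).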
Column percentages tables

We can also calculate the percentage of those who studied (or did not study) who passed (or did not pass).
To find these percentages we need to divide the value in the cell by the column total and then multiply by 100.
This is shown in the above table.
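The column percentage calculation can be sketched in the same way. Again this is only an illustration: the function name `column_percentages` is hypothetical, and the cell counts are a reconstruction derived from the totals quoted in the text (456, 344, 516, 284) and the 363 students who studied and passed.

```python
def column_percentages(table):
    """Return each cell as a percentage of its column total, rounded to 1 d.p."""
    # Sum each column across all rows
    column_totals = {}
    for row in table.values():
        for col, count in row.items():
            column_totals[col] = column_totals.get(col, 0) + count

    return {
        row_label: {
            col: round(100 * count / column_totals[col], 1)
            for col, count in row.items()
        }
        for row_label, row in table.items()
    }


# Rows: response variable (exam result). Columns: explanatory variable (study).
# Cell counts are assumed, derived by subtraction from the stated totals.
table = {
    "Passed": {"Studied": 363, "Did not study": 153},
    "Did not pass": {"Studied": 93, "Did not study": 191},
}

col_pct = column_percentages(table)
for result, row in col_pct.items():
    print(result, row)
```

With these counts, about 79.6% of the students who studied passed (363 out of 456), compared with about 44.5% of the students who did not study (153 out of 344). Because the explanatory variable defines the columns, column percentages are the natural way to compare how the response differs between the two study groups.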