Scatter Plot

To create a scatter plot, specify the kind argument as 'scatter'.

A scatter plot requires both an x-axis and a y-axis.

In the example below, we’ll use “Duration” for the x-axis and “Calories” for the y-axis, by including the x and y arguments as follows:
x = 'Duration', y = 'Calories'.

Example

import pandas as pd
import matplotlib.pyplot as plt

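# Load the data set into a DataFrame
df = pd.read_csv('data.csv')

# Draw a scatter plot of Calories (y-axis) against Duration (x-axis)
df.plot(kind = 'scatter', x = 'Duration', y = 'Calories')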

plt.show()

Result

[Figure: scatter plot of “Duration” (x-axis) against “Calories” (y-axis)]

Note: Earlier, we found a correlation of 0.922721 between “Duration” and “Calories,” and concluded that a longer duration goes together with more calories burned.

The scatter plot supports this conclusion.
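
A correlation figure like the one quoted above can be computed directly from two columns. The following is a minimal sketch, not the lesson's own code: the numbers in the DataFrame are made up for illustration and stand in for data.csv, so the printed value will not match 0.922721.

```python
import pandas as pd

# Made-up sample values standing in for data.csv (not the lesson's real data).
df = pd.DataFrame({
    'Duration': [30, 45, 60, 60, 90, 120],
    'Calories': [200, 280, 410, 390, 600, 800],
})

# Pearson correlation between the two columns. Values near 1 indicate a
# strong positive linear relationship; values near 0 indicate almost none.
r = df['Duration'].corr(df['Calories'])
print(round(r, 3))
```

With the real file, the same call should work after loading it with pd.read_csv('data.csv'), assuming the columns are named “Duration” and “Calories” as in the examples.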

Let’s create another scatter plot, this time for columns with a weak relationship, such as “Duration” and “Maxpulse,” which have a correlation of 0.009403.

Example

A scatter plot of two columns with almost no relationship:

import pandas as pd
import matplotlib.pyplot as plt

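# Load the data set into a DataFrame
df = pd.read_csv('data.csv')

# Draw a scatter plot of Maxpulse (y-axis) against Duration (x-axis)
df.plot(kind = 'scatter', x = 'Duration', y = 'Maxpulse')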

plt.show()

Result

[Figure: scatter plot of “Duration” (x-axis) against “Maxpulse” (y-axis)]
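
Both examples pass kind = 'scatter' to df.plot(). pandas also offers the same plot through the df.plot.scatter accessor, which some readers find easier to scan. The sketch below is an illustration under assumptions: the DataFrame holds made-up values in place of data.csv, and it inspects the returned matplotlib Axes rather than opening a window.

```python
import pandas as pd

# Made-up sample values standing in for data.csv (not the lesson's real data).
df = pd.DataFrame({
    'Duration': [30, 45, 60, 60, 90, 120],
    'Calories': [200, 280, 410, 390, 600, 800],
})

# Equivalent to df.plot(kind = 'scatter', x = 'Duration', y = 'Calories').
ax = df.plot.scatter(x = 'Duration', y = 'Calories')

# The returned Axes carries the column names as its axis labels.
print(ax.get_xlabel(), ax.get_ylabel())
```

To display the figure, call plt.show() after importing matplotlib.pyplot as plt, as in the examples above.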