Drawing a Line Plot with Seaborn

Introduction

A line plot is a graph that displays data points connected by straight line segments, typically to show how a value changes over time. Many tools can be used to plot and visualize data. In this tutorial, you will do it with Seaborn, a powerful Python library for data visualization.

Requirements

For this tutorial, you need Python and Seaborn. You can install them separately on your computer, but Anaconda bundles both in one package. Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.) that aims to simplify package management and deployment.

You need to download and install Anaconda on your machine if you have not yet done so (it has all the required libraries for this exercise, which are seaborn, pandas and matplotlib for data visualization). The procedure is simple, but if you have any challenges doing it, feel free to contact us.

How to create a single line plot with seaborn

For your coding, you can use any text editor (Notepad++, Sublime Text, etc.). We are going to use Jupyter Notebook, the browser-based notebook environment installed with Anaconda.

The syntax

The syntax to draw a single line plot with seaborn is:

seaborn.lineplot(x=x, y=y, data=data)

where:

x = Data variable for the x-axis

y = Data variable for the y-axis

data = Object pointing to the entire data set or data values

Note: Though this syntax has only 3 parameters, the seaborn lineplot function has more than 25 parameters. (Refer to the seaborn documentation for more information.)
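If you want to see the full parameter list for yourself, one possible way is to inspect the function signature from Python. This is only a small sketch: the exact number and names of the parameters depend on the seaborn version you have installed, and the variable names (signature, param_names) are chosen for illustration.

```python
import inspect

import seaborn as sbn

# Collect the parameter names that lineplot accepts in the installed seaborn version.
signature = inspect.signature(sbn.lineplot)
param_names = list(signature.parameters)

print(len(param_names), "parameters")
print(param_names)
```

The three parameters used in this tutorial (x, y and data) should appear in the printed list, together with others such as hue, which is used later for multiple lines.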

Data values can be created within the code or loaded from a dataset.

Example 1: Using random data created within the code

Suppose that the profit made by a firm over the past 10 years (2010 to 2019) is as follows:

year = [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]

profit = [75.2, 76, 80.2, 86, 86.6, 90, 91.4, 85.4, 88, 90]

To plot a graph showing the evolution of the profit during the 10 years (line plot) using seaborn, proceed as follows:

Step 1: Import the relevant libraries (seaborn, pandas and matplotlib)

import pandas as pd

import seaborn as sbn

import matplotlib.pyplot as plt

Step 2: Create data values

year = [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]

profit = [75.2, 76, 80.2, 86, 86.6, 90, 91.4, 85.4, 88, 90]

Step 3: Create data frame (df)

df = pd.DataFrame({"Year": year, "Profit": profit})

Step 4: Draw line plot

sbn.lineplot(x = "Year", y = "Profit", data=df)

Step 5: Show line plot

plt.show()

Your Jupyter notebook should look like this (the comments are optional, but it is good practice to use them if you're working with others):

[Screenshot: Jupyter notebook for line plot in seaborn]

Click the Run button in Jupyter Notebook, or press Shift + Enter, to run the code. If everything is OK, you should get an output like the one in the figure below: a line plot showing the relationship between the two data variables, "Year" and "Profit".

[Figure: line plot showing year vs profit]

Example 2: Using a dataset to draw a line plot

The data for the dataset will come from the table below showing the temperature in three major cities Dallas, Berlin and Ottawa for 7 days (fictitious data just for the exercise).

Day     Dallas  Berlin  Ottawa
day1    33      74      40
day2    31      76      42
day3    29      74      44
day4    23      81      37
day5    31      82      29
day6    21      80      44
day7    26      76      46

Create a CSV file named "temperature.csv" or download it from the following link: https://github.com/JoeMaabo/seaborn.
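If you prefer to create the file from code rather than by hand, a minimal sketch with pandas could look like this. The values are copied from the table above; the variable name temperature is only illustrative, and the file is written to the current working directory.

```python
import pandas as pd

# Values copied from the table above (fictitious data).
temperature = pd.DataFrame({
    "Day": ["day1", "day2", "day3", "day4", "day5", "day6", "day7"],
    "Dallas": [33, 31, 29, 23, 31, 21, 26],
    "Berlin": [74, 76, 74, 81, 82, 80, 76],
    "Ottawa": [40, 42, 44, 37, 29, 44, 46],
})

# index=False keeps the row index out of the file, so the CSV has exactly four columns.
temperature.to_csv("temperature.csv", index=False)

# Read the file back to confirm that it loads as expected.
print(pd.read_csv("temperature.csv"))
```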

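If you want to put your dataset in the same location as the file containing the code, click the Upload button in the Jupyter Notebook file browser, choose the dataset file, and confirm the upload:

[Screenshot: upload button in Jupyter]
[Screenshot: choosing data to upload]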

Step 1: Import the relevant libraries (seaborn, pandas and matplotlib)

import pandas as pd

import seaborn as sbn

import matplotlib.pyplot as plt

Step 2: Get dataset and create the dataframe

df = pd.read_csv('temperature.csv')

Step 3: Draw line plot

sbn.lineplot(x = "Day", y = "Dallas", data = df)

Step 4: Show line plot

plt.show()

Your code should look like the one on the screenshot below:

[Screenshot: line plot code in Jupyter notebook]

Note that we are interested only in the temperature of one city (Dallas, y = "Dallas"). After running the code (click the Run button in Jupyter or press Shift + Enter), you should see a line plot of the Dallas temperature over the seven days.

Note: Until now, we have been using three parameters: x, y and data. Many other parameters can be used to improve the presentation of a line plot (refer to the seaborn documentation).

Drawing Multiple Line Plots

To plot multiple lines on the same graph, you might need to reshape your dataset from wide (like the one that we used in the above example) to long. The reshaping can be done manually or you can use the melt function of pandas. Let’s look at the two possibilities.

Reshaping the dataset manually

You can download the reshaped dataset from the following link: https://github.com/JoeMaabo/seaborn. (Get the file "reshaped_temp.csv".)
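To make the difference between wide and long format concrete, the sketch below builds a long version of the temperature table row by row and saves it. The column names (Day, Temperature, City) are assumed from the lineplot call used later in this section; the exact layout of the downloadable file may differ.

```python
import pandas as pd

days = ["day1", "day2", "day3", "day4", "day5", "day6", "day7"]

# Wide format: one list of temperatures per city.
wide = {
    "Dallas": [33, 31, 29, 23, 31, 21, 26],
    "Berlin": [74, 76, 74, 81, 82, 80, 76],
    "Ottawa": [40, 42, 44, 37, 29, 44, 46],
}

# Long format: one row per (day, city) pair instead of one column per city.
rows = []
for city, temperatures in wide.items():
    for day, temperature in zip(days, temperatures):
        rows.append({"Day": day, "Temperature": temperature, "City": city})

reshaped = pd.DataFrame(rows)
reshaped.to_csv("reshaped_temp.csv", index=False)

print(reshaped.head(8))
```

The wide table has 7 rows and one column per city, while the long table has 21 rows (7 days for each of the 3 cities) and a single Temperature column.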

Below are the steps to draw multiple line plots on the same graph:

Step 1: Import the relevant libraries (seaborn, pandas and matplotlib)

import pandas as pd

import seaborn as sbn

import matplotlib.pyplot as plt

Step 2: Load the reshaped dataset

df = pd.read_csv('reshaped_temp.csv')

Note:

  • To view the dataset, use the code: print(df)
    [Screenshot: data set printed out]
  • To view the columns, use the code: df.columns
    [Screenshot: showing the columns]

Step 3: Draw line plots

sbn.lineplot(x = "Day", y = "Temperature", hue = "City", data = df)

Notice the parameter "hue" that has been added to the syntax to group data by city.

Step 4: Show line plots

plt.show()

Your code should look like the one on the screenshot below.

[Screenshot: final code for method 1]

Output

[Figure: output for method 1]

Reshaping dataset using the melt function of pandas

The melt() function is used to unpivot/reshape a given DataFrame from wide format to long format. With this function, you don’t need to do any modification on the original dataset.

Step 1: Import the relevant libraries (seaborn, pandas and matplotlib)

import pandas as pd

import seaborn as sbn

import matplotlib.pyplot as plt

Step 2: Load the dataset (original data that has not been reshaped)

df = pd.read_csv('temperature.csv')

Step 3: Reshape the dataset using the pandas melt function

df_data = pd.melt(df, id_vars=["Day"], var_name = "City", value_name = "Temperature" )

Note: id_vars, var_name and value_name are parameters of the melt function. (Refer to the pandas documentation for more information).
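To see what these parameters do, here is a small sketch that applies melt to the temperature table built inline (an assumption made so that the snippet does not need the CSV file). id_vars lists the columns to keep as identifiers, var_name names the new column that receives the former column headers, and value_name names the column that receives the values.

```python
import pandas as pd

# Wide-format table, same content as temperature.csv.
df = pd.DataFrame({
    "Day": ["day1", "day2", "day3", "day4", "day5", "day6", "day7"],
    "Dallas": [33, 31, 29, 23, 31, 21, 26],
    "Berlin": [74, 76, 74, 81, 82, 80, 76],
    "Ottawa": [40, 42, 44, 37, 29, 44, 46],
})

# "Day" stays as it is; the city columns are stacked into City/Temperature pairs.
df_data = pd.melt(df, id_vars=["Day"], var_name="City", value_name="Temperature")

print(df.shape)       # wide: 7 rows, 4 columns
print(df_data.shape)  # long: 21 rows, 3 columns
print(df_data.head())
```

The original data frame df is left unchanged; melt returns a new data frame.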

Step 4: Draw the line plots

sbn.lineplot(x='Day', y='Temperature', hue='City', data=df_data)

Step 5: Show line plots

plt.show()

Your code should look like the one on the screenshot below.

[Screenshot: code for second method]

Output

[Figure: output for second method]

It has been a long journey. We hope that you have understood the basics of line plot drawing with seaborn. If you want to dig further, we encourage you to consult the seaborn documentation.

Related: If you like line plots, you might also like box plots. Check out this tutorial for drawing box plots, also using Pandas.
