A line plot is a graph that displays data points connected by line segments, typically to show how a value changes over an ordered variable such as time. Many tools can be used to plot and visualize data. In this tutorial, you will use Seaborn, a powerful Python library for data visualization.
For this tutorial, you need Python and Seaborn. You can install them separately on your computer, but Anaconda bundles all of them in one package. Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.) that aims to simplify package management and deployment.
You need to download and install Anaconda on your machine if you have not yet done so (it has all the required libraries for this exercise, which are seaborn, pandas and matplotlib for data visualization). The procedure is simple, but if you have any challenges doing it, feel free to contact us.
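Once the installation is done, a quick way to check that the three libraries are available is to import them and print their versions. This is only a minimal sketch; the version numbers you see will depend on your installation.

```python
# Confirm that the three libraries used in this tutorial can be imported,
# and print the installed version of each.
import matplotlib
import pandas as pd
import seaborn as sbn

print("seaborn:", sbn.__version__)
print("pandas:", pd.__version__)
print("matplotlib:", matplotlib.__version__)
```

If any of the imports fails with a ModuleNotFoundError, that library is not installed in the Python environment you are running.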
How to create a single line plot with seaborn
For your coding, you can use any text editor (Notepad++, Sublime Text, etc.). We are going to use Jupyter Notebook (the browser-based notebook environment installed with Anaconda).
The syntax to draw a single line plot with seaborn is:
seaborn.lineplot(x=x, y=y, data=data)
x = Data variable for the x-axis
y = Data variable for the y-axis
data = Object pointing to the entire data set or data values
Note: Although this syntax shows only three parameters, the seaborn lineplot function accepts more than 25, as you can see from this screenshot. (Refer to the seaborn documentation for more information.)
Data values can be created within the code or loaded from a dataset.
Example 1: Using random data created within the code
Suppose that the profit made by a firm for the past 10 years (2009 to 2019) is as follows:
Your Jupyter notebook should look like this (the comments are optional, but it is good practice to use them if you’re working with others):
Click on the Run button of the Jupyter notebook or press Shift + Enter to run the code. If everything is OK, you should have an output like the one in the figure below, with a line plot showing the relationship between the two data variables, "year" and "profit".
Example 2: Using a dataset to draw a line plot
The data for the dataset will come from the table below, which shows the temperature in three major cities (Dallas, Berlin and Ottawa) over 7 days (fictitious data just for the exercise).
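If you do not have the file at hand, one way to follow along is to generate a stand-in temperature.csv with pandas. The sketch below is only an assumption about the file's layout: the column names Day, Dallas, Berlin and Ottawa follow the names used in this tutorial, while the day numbering and all temperature values are invented placeholders.

```python
import pandas as pd

# Placeholder temperatures for illustration only (values and units are assumed)
temperatures = pd.DataFrame({
    "Day": [1, 2, 3, 4, 5, 6, 7],
    "Dallas": [30, 32, 31, 29, 33, 34, 32],
    "Berlin": [18, 17, 19, 21, 20, 18, 16],
    "Ottawa": [12, 14, 13, 11, 15, 16, 14],
})

# Save in wide format (one column per city) in the current working directory
temperatures.to_csv("temperature.csv", index=False)
```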
If you want to put your dataset in the same location as the file containing the code, proceed as follows in Jupyter Notebook:
Step 1: Import the relevant libraries (seaborn, pandas and matplotlib)
import pandas as pd
import seaborn as sbn
import matplotlib.pyplot as plt
Step 2: Get dataset and create the dataframe
df = pd.read_csv('temperature.csv')
Step 3: Draw line plot
sbn.lineplot(x="Day", y="Dallas", data=df)
Step 4: Show line plot
plt.show()
Your code should look like the one on the screenshot below:
Note that we are interested only in the temperature of one city (Dallas, y = "Dallas"). After running the code (click on the Run button in Jupyter or press Shift + Enter), you should have the following output.
Note: Until now, we have been using three parameters: x, y and data. It is good to know that there are many other parameters that can be used to improve the presentation of a line plot (refer to the seaborn documentation).
Drawing Multiple Line Plots
To plot multiple lines on the same graph, you might need to reshape your dataset from wide format (one column per city, like the dataset used in the above example) to long format (one row per observation). The reshaping can be done manually or you can use the melt function of pandas. Let’s look at the two possibilities.