# Visualization with Python Pandas

Suraj Jain

a year ago

Table of Contents
• Basic Plotting: plot
• Bar Plot
• Histograms
• Box Plots
• Area Plot
• Scatter Plot
• Pie Chart
In this article, I will take you through some of the basic and most commonly used plots in pandas.

## Basic Plotting: plot

The plot() method in pandas draws plots of a DataFrame or Series using matplotlib.
``````
import pandas as pd
import numpy as np

# 10 rows of random values in 4 columns, indexed by consecutive dates
df = pd.DataFrame(np.random.randn(10, 4),
                  index=pd.date_range('1/1/2000', periods=10),
                  columns=list('ABCD'))
df.plot()  # draws one line per column
``````
Its output is as follows –
If the index consists of dates, it calls gcf().autofmt_xdate() to format the x-axis as shown in the above illustration.
We can plot one column versus another using the x and y keywords.
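As a minimal sketch of the x and y keywords, the example below plots column B against column A rather than against the index. The seeded generator, the cumulative sum, and the Agg backend line are assumptions added here so the snippet runs reproducibly without a display; they are not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame(rng.standard_normal((10, 3)).cumsum(axis=0),
                  columns=["A", "B", "C"])

# x and y name the columns used for the horizontal and vertical axes
ax = df.plot(x="A", y="B")
```

Only column B is drawn; column C is ignored because it is not named in y.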
Plotting methods allow a handful of plot styles other than the default line plot. These methods can be provided as the kind keyword argument to plot().
These include −
• bar or barh for bar plots
• hist for histograms
• box for box plots
• area for area plots
• scatter for scatter plots
• pie for pie charts
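To illustrate the kind keyword, here is a small sketch that draws the same bar chart twice: once with plot(kind="bar") and once with the equivalent plot.bar() accessor method. The Series values and the side-by-side subplot layout are hypothetical choices made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

s = pd.Series([3, 7, 5], index=["x", "y", "z"])  # hypothetical values

fig, (left, right) = plt.subplots(1, 2)
ax_kind = s.plot(kind="bar", ax=left)   # style chosen through the kind keyword
ax_method = s.plot.bar(ax=right)        # same chart through the accessor method
```

Both calls should produce identical bars, so which form to use is mostly a matter of taste.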

## Bar Plot

What is a Bar Plot?
A bar plot (or bar chart) is one of the most common types of chart. It shows the relationship between a numeric variable and a categorical variable. Each category is represented as a bar, and the size of the bar represents its numeric value.
Let’s visualize it
A bar plot can be created in the following way −
``````
import pandas as pd
import numpy as np

# 10 rows, 4 columns: each row becomes a group of 4 bars
df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.plot.bar()
``````
Its output is as follows –
To produce a stacked bar plot, pass the parameter stacked=True −
``````
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.plot.bar(stacked=True)  # the columns of each row are stacked into one bar
``````
Its output is as follows –
To get horizontal bar plots, use the barh method −
``````
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.plot.barh(stacked=True)  # horizontal, stacked bars
``````
Its output is as follows –
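The definition above pairs a categorical variable with a numeric one. As a hedged sketch of that case, the example below uses a small made-up table (the fruit and units columns are hypothetical) and passes the categorical column as x and the numeric column as y.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import pandas as pd

# hypothetical data: one categorical column and one numeric column
sales = pd.DataFrame({"fruit": ["apple", "banana", "cherry"],
                      "units": [12, 30, 7]})

# one bar per category; rot=0 keeps the category labels horizontal
ax = sales.plot.bar(x="fruit", y="units", rot=0)
```

Each bar's height is the units value for that fruit, and the fruit names become the tick labels on the x-axis.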

## Histograms

A histogram is a representation of the distribution of numerical data, where the data are binned and the count for each bin is represented.
Histograms can be plotted using the plot.hist() method. We can specify the number of bins.
``````
import pandas as pd
import numpy as np

# three normal samples centred at +1, 0 and -1
df = pd.DataFrame({'a': np.random.randn(1000) + 1,
                   'b': np.random.randn(1000),
                   'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])

df.plot.hist(bins=20)  # all columns share one set of axes
``````
Its output is as follows –
To plot different histograms for each column, use the following code −
``````
import pandas as pd
import numpy as np

df = pd.DataFrame({'a': np.random.randn(1000) + 1,
                   'b': np.random.randn(1000),
                   'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])

df.hist(bins=20)  # DataFrame.hist() draws one subplot per column
``````
Its output is as follows –
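To make the binning step concrete, the sketch below computes the bin counts directly with np.histogram and then draws the matching histogram. The seeded generator and the single-column Series are assumptions made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
values = pd.Series(rng.standard_normal(1000))

# counts[i] is the number of values that fall between edges[i] and edges[i + 1]
counts, edges = np.histogram(values, bins=20)

# the histogram draws one bar per bin, with the bar height equal to the count
ax = values.plot.hist(bins=20)
```

With 20 bins there are 21 bin edges, and the counts add up to the number of observations.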

## Box Plots

A box plot is used to visualize the distribution of values within each column.
It can be drawn by calling Series.plot.box() and DataFrame.plot.box(), or DataFrame.boxplot().
Here is a box plot representing five trials of 10 observations of a uniform random variable on [0,1).
``````
import pandas as pd
import numpy as np
# 5 columns (trials) of 10 uniform random observations each
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
df.plot.box()
``````
Its output is as follows –
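As a rough guide to reading the plot: with matplotlib's default settings, which pandas uses here, each box spans the first to the third quartile of a column, with a line at the median. The sketch below computes those quartiles with DataFrame.quantile() next to the plot; the seeded generator is an assumption added for reproducibility.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame(rng.random((10, 5)), columns=list("ABCDE"))

# rows: 25th, 50th (median) and 75th percentile of each column
quartiles = df.quantile([0.25, 0.5, 0.75])
print(quartiles)

ax = df.plot.box()  # one box per column
```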

## Area Plot

An area plot can be created using the Series.plot.area() or DataFrame.plot.area() method.
``````
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.plot.area()  # columns are stacked on top of each other by default
``````
Its output is as follows –
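Area plots are stacked by default. As a minimal sketch of the alternative, passing stacked=False overlays the columns as semi-transparent areas instead; the seeded generator is an assumption added for reproducibility.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame(rng.random((10, 4)), columns=["a", "b", "c", "d"])

# each column is filled from zero rather than from the top of the previous column
ax = df.plot.area(stacked=False)
```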

## Scatter Plot

A scatter plot can be drawn using the DataFrame.plot.scatter() method.
``````
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(50, 4), columns=['a', 'b', 'c', 'd'])
df.plot.scatter(x='a', y='b')  # x and y are required column names
``````
Its output is as follows –
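A scatter plot can also encode more than two columns. The sketch below is one possible variation, not part of the original example: it colours each point by column c and sizes it by column d. The scaling factor of 200 and the viridis colormap are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame(rng.random((50, 4)), columns=["a", "b", "c", "d"])

# c picks the column used for colour, s gives the marker sizes
ax = df.plot.scatter(x="a", y="b", c="c",
                     s=df["d"].to_numpy() * 200, colormap="viridis")
```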

## Pie Chart

A pie chart can be drawn using the DataFrame.plot.pie() method.
``````
import pandas as pd
import numpy as np

df = pd.DataFrame(3 * np.random.rand(4), index=['a', 'b', 'c', 'd'], columns=['x'])
df.plot.pie(subplots=True)  # one pie per column
``````
Its output is as follows −
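For a DataFrame, plot.pie() needs either subplots=True or a y column. As a small sketch of the second form, the example below draws a single pie from column x; the fixed values and the autopct percentage format are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import pandas as pd

# hypothetical non-negative values; the index supplies the wedge labels
df = pd.DataFrame({"x": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# y selects the column to plot; autopct prints each wedge's share as a percentage
ax = df.plot.pie(y="x", autopct="%.0f%%")
```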