How to Plot Mean of Dataframe in Pandas?

Pandas is a Python module that is very popular among data scientists and machine learning developers. It provides many useful methods that help to analyze a dataset in a few simple steps. The mean and median of a dataframe can each be found with a single method call. In this article, we are going to learn how to plot the mean of a dataframe in Pandas using various methods. We will use the Pandas, Matplotlib, and Seaborn modules to plot the mean of the data frame.

How to plot the mean of a DataFrame?

Mean is the average value of the data points. It is calculated by adding all the values and dividing by the number of values. We can plot the mean of the DataFrame using various modules, so first it is important to install these modules on your system.

  • Pandas
  • Matplotlib
  • Seaborn

Use the pip command, which is a popular and simple way of installing Python modules, for example: pip install pandas matplotlib seaborn
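Once Pandas is installed, the definition of the mean given above can be checked in a few lines. The following is a minimal sketch; the values in the Series are hypothetical sample numbers chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample values, used only to illustrate the definition
values = pd.Series([58.0, 68.0, 50.0, 84.0, 83.0])

# Mean by definition: the sum of the values divided by their count
manual_mean = values.sum() / len(values)

# Mean using the built-in pandas method
pandas_mean = values.mean()

print(manual_mean, pandas_mean)
```

Both variables should hold the same number, which shows that the mean() method is simply a shortcut for the sum-and-divide calculation.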

Importing and analyzing the dataset

Before plotting the mean, we first need to import a dataset. In this section, we will use a sample dataset, stored in the file Dushanbe_house.csv, to learn how to plot the mean.

Let us import the dataset and then analyze it.

# importing the module
import pandas as pd

# dataset
data = pd.read_csv('Dushanbe_house.csv')

data.head()

Output:


Unnamed: 0	number_of_rooms	floor	area	latitude	longitude	price
0	0	1	1	58.0	38.585834	68.793715	330000
1	1	1	14	68.0	38.522254	68.749918	340000
2	2	3	8	50.0	NaN	NaN	700000
3	3	3	14	84.0	38.520835	68.747908	700000
4	4	3	3	83.0	38.564374	68.739419	415000

We will now drop the first column and the NaN values from the dataset so that we will have clean data.

# dropping the leftover index column
data.drop('Unnamed: 0', axis=1, inplace=True)

# dropping every row that contains a NaN value
data.dropna(inplace=True)

Now the dataset is clean, and we can move on to plotting the mean.
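Note that dropna() removes an entire row when any of its values is missing, so the values in the other columns of that row no longer contribute to the means. The sketch below illustrates this on a hypothetical three-row miniature of the dataset (the frame named sample is an assumption made for illustration). It also returns a new DataFrame instead of using inplace=True, which is a matter of preference.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the dataset; one row is missing its latitude
sample = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "area": [58.0, 68.0, 50.0],
    "latitude": [38.585834, 38.522254, np.nan],
    "price": [330000, 340000, 700000],
})

# Same cleaning steps as above, but returning a new DataFrame
# instead of modifying the original in place
clean = sample.drop(columns="Unnamed: 0").dropna()

print(len(sample), len(clean))   # number of rows before and after cleaning
print(sample["area"].mean())     # mean of area over all three rows
print(clean["area"].mean())      # mean of area over the two complete rows
```

The two area means differ because the row with the missing latitude was dropped as a whole, even though its area value was present.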

Plot Mean of the Column of Dataframe

Let us assume that we want to plot the mean of the area column. First, we need to find the mean of the area column and then plot the graph with the original dataset:

# importing the module
import matplotlib.pyplot as plt

# selecting the column and computing its mean once
area = data['area']
area_mean = area.mean()

# plotting the values against their position in the column
x = range(len(area))
plt.plot(x, area, label='area')

# plotting the mean as a constant line of the same length
plt.plot(x, [area_mean] * len(area), label='mean')
plt.legend()

# plot show
plt.show()

Output:

[Figure: the area column plotted together with its mean]

As you can see, the mean of the area column is plotted as a horizontal line along with the actual values using the Matplotlib module.
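Matplotlib also offers axhline(), which draws a horizontal line across the whole plot without building a list of repeated values. The following is a sketch of that alternative, not the approach used above; the Series holds hypothetical values standing in for data['area'], and the Agg backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; drop this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample values standing in for data['area']
area = pd.Series([58.0, 68.0, 50.0, 84.0, 83.0])
area_mean = area.mean()

fig, ax = plt.subplots()
ax.plot(range(len(area)), area, label="area")

# axhline draws a horizontal line across the axes at y = area_mean
mean_line = ax.axhline(area_mean, color="red", linestyle="--", label="mean")
ax.legend()

# In an interactive session, call plt.show() here to display the figure
```

Because axhline() spans the full width of the axes, the mean line stays in place even if the x-axis limits are changed later.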

Finding Mean and Plotting Using Pandas

We will now find the means of all the columns and then plot them using a bar plot in pandas. Before plotting the mean values, we will drop the price column because its values are far larger than those of the other columns, which would make the other bars too small to read.

# plotting the mean values
data.drop('price', axis=1).mean().plot(kind='bar')

Output:

[Figure: bar chart of the mean of each column]

As you can see, with just one line of code, we were able to show the mean of each of the columns in a bar chart using the Pandas module.
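If the price column should stay in the chart, one possible alternative to dropping it is a logarithmic y-axis. The sketch below is an assumption on our part rather than the method used above. It builds a hypothetical sample frame that also contains a made-up text column named district, to show that numeric_only=True leaves non-numeric columns out of the calculation.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; drop this line to open a window
import pandas as pd

# Hypothetical sample frame; 'district' is a made-up text column
sample = pd.DataFrame({
    "district": ["A", "B", "C"],
    "number_of_rooms": [1, 1, 3],
    "area": [58.0, 68.0, 50.0],
    "price": [330000, 340000, 700000],
})

# numeric_only=True skips the text column when computing the means
means = sample.mean(numeric_only=True)

# logy=True uses a logarithmic y-axis, so small and large means
# are both visible in the same bar chart
ax = means.plot(kind="bar", logy=True)
ax.set_ylabel("mean (log scale)")
```

A log scale keeps every column in view, but the bar heights are no longer proportional to the values, so the axis label should make the scale explicit.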

Using the describe() Method to Get Statistical Values

One of the important methods in Pandas is the describe() method, which shows summary statistics of the dataframe. Let us use this method to get more insight into our dataframe.

# describe method
data.describe()

Output:

number_of_rooms	floor	area	latitude	longitude	price
count	3730.000000	3730.000000	3730.000000	3730.000000	3730.000000	3.730000e+03
mean	2.319035	6.553351	72.179893	38.553452	68.768399	5.348747e+05
std	1.036780	4.355972	33.330143	0.030199	0.056909	4.163996e+05
min	1.000000	0.000000	16.000000	37.511664	68.667721	4.500000e+02
25%	2.000000	3.000000	50.000000	38.530576	68.739065	3.200000e+05
50%	2.000000	5.000000	65.000000	38.560678	68.761022	4.500000e+05
75%	3.000000	10.000000	84.000000	38.572482	68.789177	6.300000e+05
max	6.000000	20.000000	370.000000	38.615876	71.509309	8.814000e+06

As you can see, the describe() method returns many important statistical values of the data frame.
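Since describe() returns an ordinary DataFrame indexed by the name of each statistic, the row of means can be selected from it directly. Here is a minimal sketch on a hypothetical sample frame:

```python
import pandas as pd

# Hypothetical sample frame used only for illustration
sample = pd.DataFrame({
    "number_of_rooms": [1, 1, 3, 3],
    "area": [58.0, 68.0, 84.0, 83.0],
    "price": [330000, 340000, 700000, 415000],
})

summary = sample.describe()

# The index of the summary holds the statistic names,
# so a single statistic can be selected with .loc
mean_row = summary.loc["mean"]

print(mean_row)
```

The selected row is a Series with one mean per column, so it can be plotted in the same way as the result of mean().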

Summary

Mean is the average value of the data points. It is calculated by taking the sum of all values and then dividing it by the total count of the values. Pandas has built-in methods to calculate the mean. In this short article, we learned how to plot the mean of the data frame using various methods in Python by working through examples.
