How to Count Occurrences of Specific Value in Pandas Column?


In this article, we will discuss how to count the occurrences of a specific value in a pandas column.

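Dataset in use: a dataframe of 8 rows with the columns name, subjects, marks and age, created at the start of each example.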

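We can count by using the value_counts() method. It can be called on an entire dataframe, where it counts unique rows, or on a particular column, where it counts how many times each distinct value occurs. Indexing the result with a value gives the count for that value.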

Syntax: data['column_name'].value_counts()[value]

where

data is the input dataframe

value is the string/integer value present in the column to be counted

column_name is the column in the dataframe

Example: To count occurrences of a specific value

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 8 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith', 
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java',
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})
 
# count occurrences of 'sravan' in the name column
print(data['name'].value_counts()['sravan'])
 
# count occurrences of 'php' in the subjects column
print(data['subjects'].value_counts()['php'])
 
# count occurrences of 89 in the marks column
print(data['marks'].value_counts()[89])

Output:

3
2
1
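
Indexing the result of value_counts() with a value that does not occur in the column raises a KeyError. The following is a minimal sketch of two alternatives that return 0 instead. The variable names and the lookup value 'nobody' are illustrative and not part of the original example.

```python
import pandas as pd

names = pd.Series(['sravan', 'ojsawi', 'bobby', 'rohith',
                   'gnanesh', 'sravan', 'sravan', 'ojaswi'])

# Option 1: compare element-wise and sum the resulting booleans
count_by_mask = int((names == 'sravan').sum())

# Option 2: use .get() with a default so an absent value yields 0
count_missing = int(names.value_counts().get('nobody', 0))

print(count_by_mask)   # 3
print(count_missing)   # 0
```

The boolean-mask form avoids building the full table of counts when only one value is of interest.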

If we want to count all values in a particular column, then we do not need to mention the value.

Syntax:

data['column_name'].value_counts()

Example: To count the occurrences of every value in a particular column

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 8 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith', 
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java',
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})
 
# count all values in name column
print(data['name'].value_counts())
 
# count all values in subjects column
print(data['subjects'].value_counts())
 
# count all values in marks column
print(data['marks'].value_counts())
 
# count all values in age column
print(data['age'].value_counts())


If we want to get the results ordered by count (ascending or descending), we have to specify the ascending parameter. By default, the counts are returned in descending order.

Syntax:

Ascending order:

data['column_name'].value_counts(ascending=True)

Descending Order:

data['column_name'].value_counts(ascending=False)

Example: To get results in an ordered fashion

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 8 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith',
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java',
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})
 
# count all values in name column in ascending order
print(data['name'].value_counts(ascending=True))
 
# count all values in subjects column in ascending order
print(data['subjects'].value_counts(ascending=True))
 
# count all values in marks column in descending order
print(data['marks'].value_counts(ascending=False))
 
# count all values in age column in descending order
print(data['age'].value_counts(ascending=False))
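
The ascending parameter orders the result by count. To order by the column values themselves, one possible approach is to chain sort_index() onto the result. This is a small sketch using the age values from the example; the variable names are illustrative.

```python
import pandas as pd

ages = pd.Series([11, 23, 23, 21, 21, 21, 23, 21])

# value_counts() orders by count; sort_index() reorders by the values themselves
by_value = ages.value_counts().sort_index()

print(by_value.index.tolist())   # [11, 21, 23]
print(by_value.tolist())         # [1, 4, 3]
```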


Dealing with missing values

Here we can count the occurrences with or without NA values by using the dropna parameter. If it is set to True (the default), NA values are not counted; if it is set to False, NA values are counted as well.

Syntax:

Include NA values:

data['column_name'].value_counts(dropna=False)

Exclude NA values:

data['column_name'].value_counts(dropna=True)

Example: Dealing with missing values

Python3

# import pandas module
import pandas as pd
 
# import numpy
import numpy
 
# create a dataframe
# with 9 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith', 'gnanesh', 
             'sravan', 'sravan', 'ojaswi', numpy.nan],
    'subjects': ['java', 'php', 'java', 'php', 'java', 'html/css', 
                 'python', 'R', numpy.nan],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90, numpy.nan],
    'age': [11, 23, 23, 21, 21, 21, 23, 21, numpy.nan]
})
 
# count all values in name column including NA
print(data['name'].value_counts(dropna=False))
 
# count all values in subjects column including NA
print(data['subjects'].value_counts(dropna=False))
 
# count all values in marks column including NA
print(data['marks'].value_counts(dropna=False))
 
# count all values in age column excluding NA
print(data['age'].value_counts(dropna=True))
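
To see the effect of dropna in numbers, the totals of both variants can be compared with the number of missing entries reported by isna(). This is a minimal sketch on the marks values from the example; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

marks = pd.Series([98, 90, 78, 91, 87, 78, 89, 90, np.nan])

# dropna=False keeps NaN as its own entry in the result
with_na = marks.value_counts(dropna=False)

# dropna=True (the default) leaves NaN out
without_na = marks.value_counts(dropna=True)

# number of missing entries, counted directly
n_missing = int(marks.isna().sum())

print(int(with_na.sum()))     # 9
print(int(without_na.sum()))  # 8
print(n_missing)              # 1
```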


Count values with relative frequencies

We add the normalize parameter to get relative frequencies instead of raw counts. With normalize=True, each count is divided by the total number of counted values, so the results sum to 1.

Syntax:

data['column_name'].value_counts(normalize=True)

Example: Count values with relative frequencies

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 8 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith', 
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java',
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})
 
# count all values in name column with relative frequencies
print(data['name'].value_counts(normalize=True))

Output:

sravan     0.375
ojaswi     0.125
ojsawi     0.125
bobby      0.125
rohith     0.125
gnanesh    0.125
Name: name, dtype: float64
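
Relative frequencies are often easier to read as percentages. One way to get them, sketched here on the name values from the example, is to multiply the normalized counts by 100 and round. The variable names are illustrative.

```python
import pandas as pd

names = pd.Series(['sravan', 'ojsawi', 'bobby', 'rohith',
                   'gnanesh', 'sravan', 'sravan', 'ojaswi'])

# fractions that sum to 1, scaled to percentages
percentages = (names.value_counts(normalize=True) * 100).round(1)

print(percentages['sravan'])   # 37.5
print(percentages['bobby'])    # 12.5
```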

Get details

If we want to get summary details such as count, mean, std, min, 25%, 50%, 75% and max for a numeric column, then we have to use the describe() method.

Syntax:

data['column_name'].describe()

Example: Get details

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 8 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith', 
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java', 
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})
 
# get summary statistics for the age column
print(data['age'].describe())

Output:

count     8.000000
mean     20.500000
std       3.964125
min      11.000000
25%      21.000000
50%      21.000000
75%      23.000000
max      23.000000
Name: age, dtype: float64
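
For a column of strings, describe() reports different fields: count, unique, top (the most frequent value) and freq (how often it occurs). The sketch below applies it to the name values from the example, which is a quick way to find the most common value and its count. The variable names are illustrative.

```python
import pandas as pd

names = pd.Series(['sravan', 'ojsawi', 'bobby', 'rohith',
                   'gnanesh', 'sravan', 'sravan', 'ojaswi'])

# for text data, describe() returns count, unique, top and freq
summary = names.describe()

print(summary['unique'])  # 6
print(summary['top'])     # sravan
print(summary['freq'])    # 3
```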

Using size() with groupby()

Grouping by a column and calling size() returns the number of rows in each group, that is, the number of occurrences of each distinct value in that column.

Syntax:

data.groupby('column_name').size()

Example: Count of all occurrences in a particular column

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 8 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith',
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java',
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})
 
# get the size of name
print(data.groupby('name').size())

Output:

name
bobby      1
gnanesh    1
ojaswi     1
ojsawi     1
rohith     1
sravan     3
dtype: int64
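
The result of groupby().size() is indexed by the group keys, so the count for a single value can be read with .loc. The sketch below also checks that the numbers agree with value_counts(); only the ordering of the two results differs. It uses a two-column subset of the example data, and the variable names are illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby', 'rohith',
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
})

# number of rows per distinct name, sorted by name
group_sizes = data.groupby('name').size()

# count for one specific value
sravan_rows = int(group_sizes.loc['sravan'])

# same numbers as value_counts(), which sorts by count instead
same_counts = group_sizes.to_dict() == data['name'].value_counts().to_dict()

print(sravan_rows)   # 3
print(same_counts)   # True
```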

Using count() with groupby()

Grouping by a column and calling count() returns, for each distinct value in that column, the number of non-null entries in every other column.

Syntax:

data.groupby('column_name').count()

Example: Count of non-null values per group in every other column

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 8 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith',
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java',
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})
 
# get the count of name across all columns
print(data.groupby('name').count())
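
The difference between size() and count() only becomes visible when a column contains missing values: size() counts rows, while count() skips NaN. The following sketch uses a small hypothetical dataframe (not the dataset above) with one missing mark to show this.

```python
import numpy as np
import pandas as pd

# hypothetical data: one of the two 'sravan' rows has no mark
df = pd.DataFrame({
    'name': ['sravan', 'sravan', 'bobby'],
    'marks': [98, np.nan, 78],
})

# size() counts every row in the group
rows_per_name = df.groupby('name').size()

# count() counts only non-null entries, column by column
marks_per_name = df.groupby('name').count()

print(int(rows_per_name.loc['sravan']))             # 2
print(int(marks_per_name.loc['sravan', 'marks']))   # 1
```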


Using bins

If we want to count how many values fall into particular ranges, then the bins parameter is applied. We can specify the number of equal-width ranges (bins). This works only with numeric data.

Syntax:

data['column_name'].value_counts(bins=n)

where,

data is the input dataframe
column_name is the column to get bins
n is the total number of bins to be specified

Example: Get count in particular range of values

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 8 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby',  'rohith', 
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java', 
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})
 
# get count of age column with 6 bins
print(data['age'].value_counts(bins=6))
 
# get count of age column with 4 bins
print(data['age'].value_counts(bins=4))

Output:

(19.0, 21.0]      4
(21.0, 23.0]      3
(10.987, 13.0]    1
(17.0, 19.0]      0
(15.0, 17.0]      0
(13.0, 15.0]      0
Name: age, dtype: int64
(20.0, 23.0]      7
(10.987, 14.0]    1
(17.0, 20.0]      0
(14.0, 17.0]      0
Name: age, dtype: int64
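
When the ranges should have specific boundaries rather than equal widths, one possible approach is to bin the values with pd.cut() first and then count the bins. This is a sketch on the age values from the example; the bin edges 10, 15, 20 and 25 are chosen only for illustration.

```python
import pandas as pd

ages = pd.Series([11, 23, 23, 21, 21, 21, 23, 21])

# illustrative edges giving the intervals (10, 15], (15, 20], (20, 25]
edges = [10, 15, 20, 25]

# assign each age to an interval, then count per interval in interval order
binned_counts = pd.cut(ages, bins=edges).value_counts().sort_index()

print(binned_counts.tolist())   # [1, 0, 7]
```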

Using apply()

If we want to get the counts of every value in every column at once, then we have to use the apply() function and pass it the value_counts() method. A value that does not occur in a column shows up as NaN in that column.

Syntax:

data.apply(pd.Series.value_counts)

Example: Get counts of all values across all columns

Python3

# import pandas module
import pandas as pd
 
# create a dataframe
# with 5 rows and 4 columns
data = pd.DataFrame({
    'name': ['sravan', 'bobby', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'html/css', 'python'],
    'marks': [98, 90, 78, 91, 87],
    'age': [11, 23, 23, 21, 21]
})
 
# get the counts of every value in every column
print(data.apply(pd.Series.value_counts))
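
Applying value_counts() to every column leaves NaN wherever a value does not occur in a column. A minimal sketch of turning those gaps into zeros, using a small hypothetical dataframe with columns a and b (not the dataset above):

```python
import pandas as pd

# hypothetical data with overlapping values in two columns
df = pd.DataFrame({
    'a': ['x', 'y', 'x'],
    'b': ['y', 'y', 'z'],
})

# count per column, then replace the NaN gaps with 0 and restore integers
count_table = df.apply(pd.Series.value_counts).fillna(0).astype(int)

print(count_table.loc['x', 'a'])   # 2
print(count_table.loc['z', 'a'])   # 0
print(count_table.loc['y', 'b'])   # 2
```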

