Get unique values from a column in Pandas DataFrame


The unique() method returns the distinct values of a column in order of first appearance, so a value that occurs several times is listed only once. In this article, we will discuss how to get the unique values from a column in a Pandas DataFrame.

Creating a Pandas Dataframe with Duplicate Elements

Create a sample Pandas DataFrame from a dictionary of lists. The columns are named A, B, C, D, and E, and all of them except A contain duplicate elements.

Python3

# Import pandas package
import pandas as pd
 
# create a dictionary with five fields each
data = {
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1']}
 
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)

Get unique values from a column in Pandas DataFrame

Below are some examples by which we can get the unique values of a column in this dataframe.

Get the Unique Values of ‘B’ Column
Get the Unique Values of ‘E’ Column
Get Number of Unique Values in a Column
Using set() to Eliminate Duplicate Values from a Column
Using pandas.concat() and unique() Methods
Using Series.drop_duplicates()

Get the Unique Values of ‘B’ Column

In this example, we retrieve the unique values of the ‘B’ column using the unique() method. The duplicate 'B4' is listed only once, so the resulting unique values are ['B1', 'B2', 'B3', 'B4'].

Python3

# Import pandas package
import pandas as pd
 
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
 
# Get the unique values of 'B' column
df.B.unique()

Output

array(['B1', 'B2', 'B3', 'B4'], dtype=object)
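Two details of unique() are easy to miss: values come back in order of first appearance rather than sorted, and a missing value is reported as one distinct entry. The following is a minimal sketch of both, using a small hypothetical Series that is not part of the sample data above.

```python
import numpy as np
import pandas as pd

# Hypothetical column with repeats and missing values (not the article's data)
s = pd.Series(['B2', 'B1', 'B2', np.nan, 'B1', np.nan])

# unique() keeps first-seen order and reports the missing value once
values = s.unique()
print(values)

# Bracket access is equivalent to attribute access (df.B) and also works
# for column names that are not valid Python identifiers
df = pd.DataFrame({'B': ['B1', 'B2', 'B3', 'B4', 'B4']})
b_values = df['B'].unique().tolist()
print(b_values)  # ['B1', 'B2', 'B3', 'B4']
```

If a sorted result is needed, the returned values can be passed through sorted() once missing values have been handled.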

Get the Unique Values of ‘E’ Column

In this example, we create a pandas DataFrame from a dictionary and then retrieve the unique values from the ‘E’ column using the unique() method. Every row of ‘E’ holds the same value, so the result contains a single element, ['E1'].

Python3

# Import pandas package
import pandas as pd
 
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
 
# Get the unique values of 'E' column
df.E.unique()

Output

array(['E1'], dtype=object)

Get Number of Unique Values in a Column

In this example, we create a pandas DataFrame from a dictionary and then count the unique values in the ‘C’ column with nunique(). Passing dropna=True (the default) excludes NaN values from the count. The result is 3, indicating there are three unique values in column ‘C’.

Python3

# Import pandas package
import pandas as pd
 
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
 
# Get number of unique values in column 'C'
df.C.nunique(dropna=True)

Output

3

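The sample column ‘C’ has no missing values, so dropna makes no difference there. The sketch below uses a small hypothetical Series with one NaN to show how the two settings differ.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value (the article's data has none)
s = pd.Series(['C1', 'C2', np.nan, 'C2'])

n_without_nan = s.nunique(dropna=True)   # NaN is ignored
n_with_nan = s.nunique(dropna=False)     # NaN counts as one distinct value

print(n_without_nan, n_with_nan)  # 2 3
```

With dropna=False the count matches len(s.unique()), because unique() also reports the missing value.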
Eliminate Duplicate Values from a Column using set()

In this example, we create a pandas DataFrame from a dictionary and then use Python's built-in set() function to extract the unique values from column ‘C’, eliminating duplicates. The resulting set, {'C1', 'C2', 'C3'}, represents the unique values in column ‘C’. A set is unordered, so the elements may be printed in a different order from the one shown.

Python3

# Import pandas package
import pandas as pd
 
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
 
# Use set() to eliminate duplicate values in column 'C'
unique_values_set = set(df['C'])
 
# Print the unique values
print(unique_values_set)

Output

{'C1', 'C2', 'C3'}
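Because a set does not guarantee any ordering, it can help to impose one explicitly. The following sketch shows two options on a hypothetical column: sorted() for alphabetical order, and dict.fromkeys() for order of first appearance.

```python
import pandas as pd

# Hypothetical column whose first-seen order differs from sorted order
df = pd.DataFrame({'C': ['C3', 'C1', 'C3', 'C2']})

# sorted() turns the unordered set into a list with a stable order
unique_sorted = sorted(set(df['C']))
print(unique_sorted)  # ['C1', 'C2', 'C3']

# dict.fromkeys() drops duplicates while keeping first-seen order
unique_in_order = list(dict.fromkeys(df['C']))
print(unique_in_order)  # ['C3', 'C1', 'C2']
```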

Using pandas.concat() and unique() Methods

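In this example, we create a pandas DataFrame from a dictionary, concatenate all of its columns into a single Series with pd.concat(), and then call unique() on the result. pd.concat() accepts only Series or DataFrame objects, so the columns must be concatenated before unique() is applied. The printed result lists every unique value from columns ‘A’ to ‘E’.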

Python3

# Import pandas package
import pandas as pd
 
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
 
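# Use pd.concat() to concatenate all columns into one Series, then apply unique()
# (pd.concat() cannot combine the NumPy arrays returned by unique(), so unique() comes last)
unique_values_all_columns = pd.concat([df[col] for col in df.columns]).unique()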
 
# Print the unique values
print(unique_values_all_columns)

Output

['A1' 'A2' 'A3' 'A4' 'A5' 'B1' 'B2' 'B3' 'B4' 'C1' 'C2' 'C3' 'D1' 'D2' 'E1']
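As an alternative sketch (not part of the original example), the same values can be collected by flattening the DataFrame into a one-dimensional NumPy array and passing it to pd.unique(). The order='F' argument is used here so the values are read column by column, which matches the order produced by concatenating the columns.

```python
import pandas as pd

data = {
    'A': ['A1', 'A2', 'A3', 'A4', 'A5'],
    'B': ['B1', 'B2', 'B3', 'B4', 'B4'],
    'C': ['C1', 'C2', 'C3', 'C3', 'C3'],
    'D': ['D1', 'D2', 'D2', 'D2', 'D2'],
    'E': ['E1', 'E1', 'E1', 'E1', 'E1']}
df = pd.DataFrame(data)

# Flatten column by column (order='F'), then drop duplicates in first-seen order
flat = df.to_numpy().ravel(order='F')
unique_values = list(pd.unique(flat))
print(unique_values)

# To keep the values separated by column instead, build a dictionary
unique_per_column = {col: df[col].unique().tolist() for col in df.columns}
print(unique_per_column['D'])  # ['D1', 'D2']
```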

Using Series.drop_duplicates()

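In this example, we create a pandas DataFrame from a dictionary and remove duplicates from columns ‘A’ and ‘D’ using the drop_duplicates() method. drop_duplicates() returns a shorter Series that keeps only the first occurrence of each value. When that Series is assigned back to the DataFrame, pandas aligns it on the index, so the rows where duplicates were removed from ‘D’ are filled with NaN. Column ‘A’ has no duplicates and is unchanged.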

Python3

# Import pandas package
import pandas as pd
 
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
 
# Use drop_duplicates() to remove duplicates from columns 'A' and 'D'
df['A'] = df['A'].drop_duplicates()
df['D'] = df['D'].drop_duplicates()
 
# Print the DataFrame after removing duplicates from columns 'A' and 'D'
print(df)

Output

    A   B   C    D   E
0  A1  B1  C1   D1  E1
1  A2  B2  C2   D2  E1
2  A3  B3  C3  NaN  E1
3  A4  B4  C3  NaN  E1
4  A5  B4  C3  NaN  E1
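If only the unique values are needed, and not a full-length column padded with NaN, the result of drop_duplicates() can be used directly. The following is a minimal sketch on column ‘D’ of the sample data. It also shows the keep parameter, which selects which occurrence of a repeated value is retained.

```python
import pandas as pd

df = pd.DataFrame({'D': ['D1', 'D2', 'D2', 'D2', 'D2']})

# drop_duplicates() returns a shorter Series that keeps the original index labels
unique_d = df['D'].drop_duplicates()
first_index = unique_d.index.tolist()
print(first_index)  # [0, 1]

# reset_index(drop=True) renumbers the result from 0
unique_d = unique_d.reset_index(drop=True)
print(unique_d.tolist())  # ['D1', 'D2']

# keep='last' retains the last occurrence of each value instead of the first
last_index = df['D'].drop_duplicates(keep='last').index.tolist()
print(last_index)  # [0, 4]
```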
