How to use the __getitem__() magic method in Python

We will create a Spark DataFrame with at least one row using createDataFrame(). We then take a Row object from the list of Row objects returned by DataFrame.collect(). Finally, we call that Row's __getitem__() magic method with a column name to get the value stored under that column. The syntax is given below.

Syntax : Row.__getitem__('Column_Name')

Returns : value corresponding to the column name in the Row object

# library import
import pyspark
from pyspark.sql import SparkSession
from pyspark.sql import Row
  
# Session Creation
random_value_session = SparkSession.builder.appName(
    'Random_Value_Session'
).getOrCreate()
  
# Data filled in our DataFrame
# 5 rows below
rows = [['All England Open', 'March', 'Super 1000'],
        ['Malaysia Open', 'January', 'Super 750'],
        ['Korea Open', 'April', 'Super 500'],
        ['Hylo Open', 'November', 'Super 100'],
        ['Spain Masters', 'March', 'Super 300']]
  
# Columns of our DataFrame
columns = ['Tournament', 'Month', 'Level']
  
# Create the DataFrame
dataframe = random_value_session.createDataFrame(rows,
                                                 columns)
  
# Showing the DataFrame
dataframe.show()
  
# getting list of rows using collect()
row_list = dataframe.collect()
  
# Printing the first Row object
# from which data is extracted
print(row_list[0])
  
# Using __getitem__() magic method
# To get value corresponding to a particular
# column
print(row_list[0].__getitem__('Level'))
print(row_list[0].__getitem__('Tournament'))
print(row_list[0].__getitem__('Level'))
print(row_list[0].__getitem__('Month'))


Output: 

+----------------+--------+----------+
|      Tournament|   Month|     Level|
+----------------+--------+----------+
|All England Open|   March|Super 1000|
|   Malaysia Open| January| Super 750|
|      Korea Open|   April| Super 500|
|       Hylo Open|November| Super 100|
|   Spain Masters|   March| Super 300|
+----------------+--------+----------+

Row(Tournament='All England Open', Month='March', Level='Super 1000')
Super 1000
All England Open
Super 1000
March
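
Calling __getitem__() directly is rarely needed, because Python invokes it whenever square brackets are used on an object, so row_list[0]['Level'] is the more common spelling of the call shown above. The sketch below illustrates that dispatch without Spark. SimpleRow is a hypothetical stand-in written for this example, not PySpark's actual Row implementation; it only assumes that a row behaves like a tuple whose values can also be looked up by field name.

```python
class SimpleRow(tuple):
    """Hypothetical stand-in for a row: a tuple that also accepts field names."""

    def __new__(cls, **fields):
        # Store the values in the tuple itself and remember the field names.
        row = super().__new__(cls, fields.values())
        row._names = list(fields.keys())
        return row

    def __getitem__(self, key):
        # Strings are treated as field names; anything else (int, slice)
        # falls through to ordinary tuple indexing.
        if isinstance(key, str):
            if key not in self._names:
                raise KeyError(key)
            return super().__getitem__(self._names.index(key))
        return super().__getitem__(key)


row = SimpleRow(Tournament='All England Open', Month='March', Level='Super 1000')

by_dunder = row.__getitem__('Level')   # explicit magic-method call
by_brackets = row['Level']             # brackets dispatch to __getitem__
by_position = row[2]                   # integer index, plain tuple behaviour

print(by_dunder, by_brackets, by_position, sep=' | ')
```

All three lookups return the same value, which is why bracket syntax is usually preferred over spelling out the magic method.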
