How to use the __getitem__() magic method in Python
We first create a Spark DataFrame with at least one row using createDataFrame(). We then take a Row object from the list of Row objects returned by DataFrame.collect() and call the __getitem__() magic method on it to get the value stored under a particular column name. The syntax is given below.
Syntax: Row.__getitem__('Column_Name')
Returns: the value corresponding to the column name in the Row object
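To see why a Row can be indexed by column name, it helps to look at how __getitem__() works in plain Python: writing obj[key] makes the interpreter call __getitem__() on the object with that key. The sketch below uses a hypothetical SimpleRow class, which is not part of PySpark and is not how PySpark implements Row. It only illustrates the mechanism.

```python
class SimpleRow:
    """Hypothetical stand-in for a row; NOT PySpark's Row implementation."""

    def __init__(self, **fields):
        # Keep field names and values in insertion order
        self._names = list(fields)
        self._values = list(fields.values())

    def __getitem__(self, key):
        # Integer keys are treated as positions
        if isinstance(key, int):
            return self._values[key]
        # Any other key is treated as a field name
        try:
            return self._values[self._names.index(key)]
        except ValueError:
            raise KeyError(key) from None


row = SimpleRow(Tournament='All England Open', Month='March', Level='Super 1000')

# Bracket syntax and the explicit magic-method call are equivalent
by_brackets = row['Level']
by_method = row.__getitem__('Level')
by_position = row[0]

print(by_brackets)   # Super 1000
print(by_method)     # Super 1000
print(by_position)   # All England Open
```

Because both forms go through the same method, any object that defines __getitem__() can be read with square brackets.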
```python
# library import
import pyspark
from pyspark.sql import SparkSession
from pyspark.sql import Row

# Session Creation
random_value_session = SparkSession.builder.appName(
    'Random_Value_Session'
).getOrCreate()

# Data filled in our DataFrame
# 5 rows below
rows = [['All England Open', 'March', 'Super 1000'],
        ['Malaysia Open', 'January', 'Super 750'],
        ['Korea Open', 'April', 'Super 500'],
        ['Hylo Open', 'November', 'Super 100'],
        ['Spain Masters', 'March', 'Super 300']]

# Columns of our DataFrame
columns = ['Tournament', 'Month', 'Level']

# DataFrame is created
dataframe = random_value_session.createDataFrame(rows, columns)

# Showing the DataFrame
dataframe.show()

# Getting the list of rows using collect()
row_list = dataframe.collect()

# Printing the first Row object
# from which data is extracted
print(row_list[0])

# Using the __getitem__() magic method
# to get the value corresponding to a
# particular column
print(row_list[0].__getitem__('Level'))
print(row_list[0].__getitem__('Tournament'))
print(row_list[0].__getitem__('Level'))
print(row_list[0].__getitem__('Month'))
```
Output:
```
+----------------+--------+----------+
|      Tournament|   Month|     Level|
+----------------+--------+----------+
|All England Open|   March|Super 1000|
|   Malaysia Open| January| Super 750|
|      Korea Open|   April| Super 500|
|       Hylo Open|November| Super 100|
|   Spain Masters|   March| Super 300|
+----------------+--------+----------+

Row(Tournament='All England Open', Month='March', Level='Super 1000')
Super 1000
All England Open
Super 1000
March
```
How to get a value from the Row object in PySpark DataFrame?
In this article, we are going to learn how to get a value from the Row object in PySpark DataFrame.