How to use lit() in PySpark
This method uses the lit() function from pyspark.sql.functions. lit() wraps a constant in a Column expression, so the same value is inserted into every row. It can be used with the select method, for example to add a constant column 'literal_values_1' with the value 1. In the examples here, we pass it to withColumn(), which returns a new DataFrame with the extra column:
Syntax: df.withColumn("NEW_COL", lit(VALUE))
Example 1: Adding a constant value as a new column.
Python3
# Add a 'Status' column that holds the constant 0 in every row
df.withColumn('Status', lit(0)).show()
Output:
Example 2: Adding a constant value based on another column.
Python3
from pyspark.sql.functions import when, lit, col

# Set "Great_Discount" to "Yes" where Discount is at least 1000, otherwise "NO"
df.withColumn(
    "Great_Discount",
    when(col("Discount") >= 1000, lit("Yes")).otherwise(lit("NO"))
).show()
Output:
How to add a constant column in a PySpark DataFrame?
In this article, we are going to see how to add a constant column to a PySpark DataFrame.
It can be done in these ways:
- Using lit()
- Using a SQL query.
Creating a DataFrame for demonstration:
Python3
# Create a spark session
from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

spark = SparkSession.builder.appName('SparkExamples').getOrCreate()

# Create a spark dataframe
columns = ["Name", "Course_Name", "Months", "Course_Fees",
           "Discount", "Start_Date", "Payment_Done"]
data = [
    ("Amit Pathak", "Python", 3, 10000, 1000, "02-07-2021", True),
    ("Shikhar Mishra", "Soft skills", 2, 8000, 800, "07-10-2021", False),
    ("Shivani Suvarna", "Accounting", 6, 15000, 1500, "20-08-2021", True),
    ("Pooja Jain", "Data Science", 12, 60000, 900, "02-12-2021", False),
]
df = spark.createDataFrame(data).toDF(*columns)

# View the dataframe
df.show()
Output: