Print Dataframe
We can easily display the dataframe using the show() command. Its overloads are as follows:
show()
show(numRows : scala.Int)
show(truncate : scala.Boolean)
show(numRows : scala.Int, truncate : scala.Boolean)
show(numRows : scala.Int, truncate : scala.Int)
show(numRows : scala.Int, truncate : scala.Int, vertical : scala.Boolean)
We can run the show command on our dataframe as follows.
class_df.show()
Output:
To understand what the arguments of the show command do, let us build a dataframe with more rows and longer values. Use the following code to build the required dataframe.
import org.apache.spark.sql.SparkSession
import scala.util.Random
val spark: SparkSession = SparkSession.builder().master("local[1]").getOrCreate()
val columns = Seq("Id", "Code_Name")
// Build 30 rows: an Id and a random 30-character Code_Name.
val data: Seq[(String, String)] = (1 to 30).map { i =>
  (i.toString, Random.alphanumeric.take(30).mkString)
}
val class_df = spark.createDataFrame(data).toDF(columns:_*)
class_df.show()
Output:
By default, show() prints only the first 20 rows and truncates values longer than 20 characters, so the output here is long and the Code_Name values are cut off. Now let us look at the various ways of using the show command to get a better-formatted display of the dataframe.
Example 1: Using numRows
This will print only the specified number of rows in the output.
class_df.show(3)
Output:
Example 2: Using truncate (as Boolean)
This will print the data without truncating any values.
class_df.show(numRows = 3, truncate = false)
Output:
Here the Code_Name column's values are displayed in full.
Example 3: Using truncate (as Integer)
We can pass an integer to truncate to set the maximum number of characters displayed for each value. Below, truncate = 9 limits each Code_Name to 9 characters, which Spark shows as the first 6 characters of the value followed by "...".
class_df.show(numRows = 3, truncate = 9)
Output:
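The truncation rule used in this example can be sketched in plain Scala, without Spark. The helper truncateCell below is hypothetical and not part of the Spark API; it only mirrors the behaviour described above, on the assumption that Spark reserves three characters of the limit for the trailing "..." when the limit is 4 or more.

```scala
// Hypothetical helper that mimics how show() shortens a cell value.
// It is an illustration only, not a Spark function.
def truncateCell(value: String, truncate: Int): String =
  if (truncate > 0 && value.length > truncate) {
    // Limits below 4 leave no room for an ellipsis, so the value is simply cut.
    if (truncate < 4) value.substring(0, truncate)
    else value.substring(0, truncate - 3) + "..."
  } else value // a limit of 0 (or a short value) leaves the text unchanged

val sample = "a1B2c3D4e5F6g7H8i9J0k1L2m3N4o5" // 30 characters, like Code_Name

println(truncateCell(sample, 9)) // a1B2c3...
println(truncateCell(sample, 0)) // prints the full 30-character value
```

With a limit of 9, six characters of the value survive and the ellipsis fills the remaining three.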
Example 4: Using vertical
The vertical argument prints each row vertically, with every column value on its own line. The vertical argument can only be used when both of the other arguments (numRows and truncate as an integer) are also specified. Let us print the first three rows vertically in the following example.
class_df.show(numRows = 3, truncate = 9, vertical = true)
Output:
As can be seen, each column of each row is printed on a new line. This is just another format for printing the dataframe.
How to print dataframe in Scala?
Scala stands for scalable language. It was developed in 2003 by Martin Odersky. It is an object-oriented language that also supports the functional programming approach. Every value in Scala is an object; for example, values like 1 and 2 can call methods such as toString(). Scala is a statically typed language, but unlike C, C++, or Java, it can usually infer types, so explicit type information is often not required while writing the code. Type checking is done at compile time. Static typing makes it possible to build safe systems by default. Smart built-in checks and actionable error messages, combined with thread-safe data structures and collections, prevent many tricky bugs before the program first runs.
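The two points above, that values are objects and that types are checked at compile time even when they are not written out, can be seen in a short sketch. The variable names here are illustrative only and do not come from the dataframe example.

```scala
// Numeric literals are objects, so methods can be called on them directly.
val asText: String = 1.toString

// No type annotations are written here; the compiler infers them.
val rowCount = 30                     // inferred as Int
val columnName = "Code_Name"          // inferred as String
val pairs = Seq((1, "a"), (2, "b"))   // inferred as Seq[(Int, String)]

// The inferred types are still enforced at compile time.
// Uncommenting the next line would be a compile error, not a runtime failure:
// val wrong: Int = columnName

println(asText + " / " + (rowCount + 1) + " / " + columnName.length)
```

Because the checks happen during compilation, a mismatch such as assigning a String to an Int is reported before the program runs.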