Creating SQLContext

1. Using SparkContext

We can create an SQLContext from an existing SparkContext. The constructor is as follows:

public SQLContext(SparkContext sparkContext)

We can create a simple SparkContext with the master URL set to "local[*]" (run locally, using all available cores) and the application name set to "createSQLContext", and then pass this SparkContext to the SQLContext constructor.

Scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object createSQLContext {
  def main(args: Array[String]): Unit = {
    // Create a SparkContext that runs locally on all available cores
    val sc = new SparkContext("local[*]", "createSQLContext")
    // Wrap the SparkContext in an SQLContext
    val sqlc = new SQLContext(sc)
    println(sqlc)
  }
}

Output:

The SQLContext Object

Explanation:

As you can see above, we have created a new SQLContext object. Although this works, the approach is deprecated: since Spark 2.0, SparkSession has replaced SQLContext, and SQLContext is kept in newer versions only for backward compatibility.
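To show the context in use, here is a brief sketch of querying data through the SQLContext. The object name, sample rows, column names, and the view name are all illustrative, not part of any fixed API:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object useSQLContext {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[*]", "useSQLContext")
    val sqlc = new SQLContext(sc)
    // Implicits enable the toDF() conversion on local collections
    import sqlc.implicits._

    // Hypothetical sample data; any Seq of tuples works the same way
    val people = Seq(("Alice", 29), ("Bob", 31)).toDF("name", "age")
    people.createOrReplaceTempView("people")

    // Run plain SQL against the temporary view
    sqlc.sql("SELECT name FROM people WHERE age > 30").show()
  }
}
```

The same calls (implicits, createOrReplaceTempView, sql) are also available on SparkSession, which is the preferred entry point in current Spark versions.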

2. Using Existing SQLContext Object

We can also derive a new SQLContext from an existing one. Every SQLContext provides a newSession API that creates a new session backed by the same SparkContext. The API is as follows:

def newSession(): SQLContext

// Returns a SQLContext as a new session, with separate SQL configurations, temporary tables, and registered functions, but sharing the same SparkContext and cached data

Below is the Scala program to implement the approach:

Scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object createSQLContext {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[*]", "createSQLContext")
    val sqlc = new SQLContext(sc)
    // Derive an isolated session that shares the underlying SparkContext
    val nsqlc = sqlc.newSession()
    println(nsqlc)
  }
}

Output:

The SQLContext Object

Explanation:

As you can see above, we have created a new SQLContext session from the existing one. This approach still relies on the deprecated SQLContext API: since Spark 2.0, SparkSession has replaced SQLContext, which is kept in newer versions only for backward compatibility.
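The "separated temporary tables" mentioned in the newSession documentation can be observed directly. The following sketch (object, view, and column names are illustrative) registers a temporary view in one session and checks that the other session cannot see it:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object sessionIsolation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[*]", "sessionIsolation")
    val s1 = new SQLContext(sc)
    val s2 = s1.newSession()
    import s1.implicits._

    // Register a temporary view in the first session only
    Seq(1, 2, 3).toDF("n").createOrReplaceTempView("nums")

    // Temporary views are session-scoped, so only s1 sees "nums"
    println(s1.tableNames().contains("nums"))
    println(s2.tableNames().contains("nums"))
  }
}
```

Both sessions still share the same SparkContext, so cached data and cluster resources are not duplicated; only session state (configurations, temporary views, registered functions) is isolated.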

3. Using SparkSession

The recommended way (as of version 3.5.0) is to use the SparkSession object. SparkSession unifies the various earlier contexts (SparkContext, SQLContext, HiveContext) behind a single interface. We can create a SparkSession object using the builder API and then access the SQLContext object from it as follows:

Scala
import org.apache.spark.sql.SparkSession

object createSQLContext {
  def main(args: Array[String]): Unit = {
    // Build (or reuse) a SparkSession running locally on all cores
    val spark = SparkSession
      .builder()
      .appName("createSQLContext")
      .master("local[*]")
      .getOrCreate()
    // The SQLContext is exposed as a field on the SparkSession
    println(spark.sqlContext)
  }
}

Output:

The SQLContext Object

Explanation:

As you can see, we accessed the SQLContext object from inside the SparkSession object, without calling any deprecated constructor.
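In practice, new code rarely needs to touch the SQLContext at all, since SparkSession exposes the same functionality directly. A minimal sketch (object, view, and column names are illustrative):

```scala
import org.apache.spark.sql.SparkSession

object sparkSessionSQL {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("sparkSessionSQL")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // The same DataFrame and SQL operations, without SQLContext
    Seq(("Alice", 29), ("Bob", 31)).toDF("name", "age")
      .createOrReplaceTempView("people")
    spark.sql("SELECT count(*) AS n FROM people").show()
  }
}
```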



How to Create SQLContext in Spark Using Scala?

Scala stands for scalable language. It was developed in 2003 by Martin Odersky. It is an object-oriented language that also supports the functional programming paradigm; everything in Scala is an object. Scala is statically typed, but unlike other statically typed languages such as C, C++, or Java, it does not require explicit type annotations while writing code: types are inferred and verified at compile time. Static typing makes it easier to build safe systems by default, and smart built-in checks, actionable error messages, and thread-safe data structures and collections prevent many tricky bugs before the program first runs.
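As a small standalone illustration of Scala's type inference (unrelated to Spark; all names are illustrative):

```scala
object TypeInferenceDemo {
  def main(args: Array[String]): Unit = {
    // No type annotations needed: the compiler infers Int, String, List[Int]
    val answer = 42
    val greeting = "hello"
    val doubled = List(1, 2, 3).map(_ * 2)

    println(s"$greeting, $answer, $doubled")
    // An ill-typed expression such as `answer + true`
    // would be rejected at compile time, not at runtime.
  }
}
```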

This article focuses on discussing steps to create SQLContext in Spark using Scala.
