Spark - Connection (Context)

1 - About

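A Spark Connection is the context object through which your main program works with a cluster.

This object is called the SparkContext.

Each instance of a context object corresponds to one Spark application.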

Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object in your main program (called the driver program).

A Spark Connection object:

  • connects to a cluster manager (which allocates resources across applications)
  • acquires executors on nodes in the cluster
  • sends your application code (defined by JAR or Python files passed to SparkContext) to the executors
  • sends tasks to the executors to run

Every driver (Spark program) has a single Spark Connection object.

3 - Management

3.1 - Initialization

When:

  • using the interactive Spark shell, this object is automatically created for you and given the name sc.
  • writing an application, you need to create one yourself.
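
For the application case, a minimal PySpark sketch of creating the context yourself might look like the following. The helper names `connection_settings` and `create_context` are hypothetical and not part of Spark; `SparkConf`, `SparkConf.setAll` and `SparkContext.getOrCreate` are PySpark calls. Using `getOrCreate` is an assumption made here so that a second call reuses the driver's existing context instead of creating another one.

```python
def connection_settings(app_name, master=None):
    """Return the Spark properties that describe the connection.

    Hypothetical helper, not part of PySpark.
    """
    settings = {"spark.app.name": app_name}
    if master is not None:
        # Leave "spark.master" unset to fall back on the configured default.
        settings["spark.master"] = master
    return settings


def create_context(app_name, master=None):
    """Create the SparkContext for this driver, or reuse the existing one."""
    # Imported here so the helper above stays usable without PySpark installed.
    from pyspark import SparkConf, SparkContext

    pairs = list(connection_settings(app_name, master).items())
    conf = SparkConf().setAll(pairs)
    return SparkContext.getOrCreate(conf)
```

In an application you would call something like `sc = create_context("demo-app", "local[2]")` at startup and `sc.stop()` when the work is done. Inside the interactive shell none of this is needed, since `sc` already exists.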

3.2 - Master connection URL

By default, the master connection URL is taken from the Spark configuration file.
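
The fallback can be sketched in plain Python. This is an illustration only, not how Spark itself reads its configuration. It assumes the configuration file is `conf/spark-defaults.conf`, that the default is stored under the `spark.master` property, and that keys and values are separated by whitespace. The functions `read_default_master` and `resolve_master`, the sample content and the host name are all hypothetical.

```python
def read_default_master(defaults_text):
    """Return the spark.master value from spark-defaults.conf style text, or None."""
    for line in defaults_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # key, then the rest of the line as value
        if len(parts) == 2 and parts[0] == "spark.master":
            return parts[1].strip()
    return None


def resolve_master(explicit_master, defaults_text):
    """Prefer a master URL set in code; otherwise use the configured default."""
    if explicit_master is not None:
        return explicit_master
    return read_default_master(defaults_text)


# Hypothetical file content, for illustration only.
SAMPLE_DEFAULTS = """
# conf/spark-defaults.conf
spark.master    spark://master-host:7077
spark.app.name  demo-app
"""
```

With this sample, `resolve_master(None, SAMPLE_DEFAULTS)` picks up the value from the file, while `resolve_master("local[*]", SAMPLE_DEFAULTS)` keeps the URL given in code.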