PySpark - Install and configuration

Client

PySpark - Installation and configuration on Idea (PyCharm)

Configuration

Spark - Configuration

PYSPARK_PYTHON

env:

  • PYSPARK_PYTHON: Python binary executable to use for PySpark in both the driver and the workers (default is python2.7 if available, otherwise python). The property spark.pyspark.python takes precedence if it is set.
export PYSPARK_PYTHON=${PYSPARK_PYTHON:-/usr/bin/anaconda/bin/python}
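
The export line above keeps an existing value of PYSPARK_PYTHON and otherwise falls back to the Anaconda path. The same fallback can be sketched in Python, for example in a launcher script that prepares the environment before PySpark starts. The helper name ensure_pyspark_python is hypothetical, and the default path is only the example value used above.

```python
import os

# Example interpreter path taken from the export line above.
DEFAULT_PYSPARK_PYTHON = "/usr/bin/anaconda/bin/python"


def ensure_pyspark_python(default=DEFAULT_PYSPARK_PYTHON, environ=None):
    """Hypothetical helper that mimics ${PYSPARK_PYTHON:-default}.

    Keeps a non-empty existing value, otherwise sets the default.
    """
    env = os.environ if environ is None else environ
    # Like ":-" in the shell, an unset and an empty variable both count as missing.
    if not env.get("PYSPARK_PYTHON"):
        env["PYSPARK_PYTHON"] = default
    return env["PYSPARK_PYTHON"]
```

Calling ensure_pyspark_python() without arguments updates os.environ of the current process. This sketch assumes it runs before the Spark context is created, so that the value is already in place when PySpark reads it.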

conf:

  • spark.yarn.appMasterEnv.PYSPARK_PYTHON: sets PYSPARK_PYTHON in the environment of the YARN application master. Example value: /usr/bin/anaconda/bin/python
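
The property above is usually passed with the --conf option of spark-submit. A minimal sketch that builds such a command line as an argument list follows. The helper name build_submit_command and the application file app.py are hypothetical, and the YARN cluster deploy mode is an assumption made for the illustration.

```python
def build_submit_command(app_path, python_path):
    """Hypothetical helper: build a spark-submit argument list for YARN."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",  # assumption: cluster deploy mode
        # Sets PYSPARK_PYTHON in the environment of the YARN application master.
        "--conf", f"spark.yarn.appMasterEnv.PYSPARK_PYTHON={python_path}",
        app_path,
    ]


# app.py is a placeholder application; the path is the example value above.
cmd = build_submit_command("app.py", "/usr/bin/anaconda/bin/python")
```

The resulting list can be handed to subprocess.run(cmd) on a machine where spark-submit is on the PATH.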

PYSPARK3_PYTHON

  • spark.yarn.appMasterEnv.PYSPARK3_PYTHON: sets PYSPARK3_PYTHON in the environment of the YARN application master. Example value: /usr/bin/anaconda/envs/py35/bin/python3

PYSPARK_DRIVER_PYTHON

  • PYSPARK_DRIVER_PYTHON: Python binary executable to use for PySpark in the driver only (default is PYSPARK_PYTHON). The property spark.pyspark.driver.python takes precedence if it is set.
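
The precedence rules described on this page can be summarized in a small lookup. The sketch below is an illustration, not Spark's own code: the function name resolve_python is hypothetical, the relative order of spark.pyspark.python and PYSPARK_DRIVER_PYTHON for the driver is an assumption that the text above does not state, and the final fallback "python" is a placeholder for the built-in default.

```python
def resolve_python(conf, env, for_driver=False, default="python"):
    """Hypothetical helper: pick the Python executable for driver or workers.

    conf: dict of Spark properties, env: dict of environment variables.
    """
    candidates = []
    if for_driver:
        # Driver-only property wins over everything else.
        candidates.append(conf.get("spark.pyspark.driver.python"))
    # Property for both driver and workers.
    candidates.append(conf.get("spark.pyspark.python"))
    if for_driver:
        # Driver-only environment variable (assumed to rank below the properties).
        candidates.append(env.get("PYSPARK_DRIVER_PYTHON"))
    # Environment variable for both driver and workers.
    candidates.append(env.get("PYSPARK_PYTHON"))
    for candidate in candidates:
        if candidate:  # skip unset and empty values
            return candidate
    return default
```

For example, with only PYSPARK_PYTHON set in env, both the driver and the workers resolve to that value; adding PYSPARK_DRIVER_PYTHON changes the driver only.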






