Spark - (Executor) Cache

1 - About

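The cache in Spark keeps the partitions of a dataset (RDD) in memory so that later actions reuse them instead of recomputing them.

Each executor has its own cache, which holds the partitions computed on that executor.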

3 - Example

  • Without a cache, lines is recomputed
lines = sc.textFile("...", 4)
comments = lines.filter(isComment)  # isComment is a user-defined predicate
print(lines.count())
print(comments.count())  # lines is recomputed: the file is read a second time
  • With a cache, lines is NOT recomputed but read from the cache
lines = sc.textFile("...", 4)
lines.cache()  # marks lines for caching; the first action (lines.count()) fills the cache
comments = lines.filter(isComment)
print(lines.count())
print(comments.count())  # lines is read from the cache, not recomputed
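
The difference between the two cases can be imitated in plain Python, without a cluster. The sketch below is a toy model and not Spark's implementation: ToyRDD, run, read_source and is_comment are hypothetical names introduced for illustration, and the four input lines are made-up sample data. It only shows the idea that an uncached dataset is rebuilt from its source on every action, while a cached one is built once.

```python
class ToyRDD:
    """Toy stand-in for an RDD: lazy, and recomputed on every action unless cached."""

    def __init__(self, compute):
        self._compute = compute    # function that produces the elements
        self._use_cache = False    # set by cache()
        self._cached = None        # filled by the first action after cache()

    def cache(self):
        # Like RDD.cache(), this only marks the dataset; nothing is computed yet.
        self._use_cache = True
        return self

    def _elements(self):
        if self._cached is not None:
            return self._cached
        data = list(self._compute())
        if self._use_cache:
            self._cached = data
        return data

    def filter(self, predicate):
        # Lazy: the child asks its parent for elements only when an action runs.
        return ToyRDD(lambda: (x for x in self._elements() if predicate(x)))

    def count(self):
        # An action: forces the computation.
        return len(self._elements())


def run(use_cache):
    reads = []  # one entry per scan of the source

    def read_source():
        reads.append(1)
        return ["# a comment", "x = 1", "# another comment", "y = 2"]

    def is_comment(line):
        return line.startswith("#")

    lines = ToyRDD(read_source)
    if use_cache:
        lines.cache()
    comments = lines.filter(is_comment)
    return lines.count(), comments.count(), len(reads)


print(run(use_cache=False))  # (4, 2, 2)
print(run(use_cache=True))   # (4, 2, 1)
```

The third value of each tuple is the number of times the source was scanned: twice without the cache, once with it.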

db/spark/cluster/cache.txt · Last modified: 2018/06/23 22:01 by gerardnico