Spark - Join

Function

Join

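Rdd1 is an RDD of (Id, Name) pairs and Rdd2 is an RDD of (Id, (Day, Month)) pairs. join matches the elements of the two RDDs by key (here the Id) and returns, for each matching key, a pair holding the value from each side.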

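# sc is an existing SparkContext (created for you in the pyspark shell)
Rdd1 = sc.parallelize([(1, 'Nicolas')])
# The month is written 7, not 07: Python 3 rejects integer literals with a leading zero
Rdd2 = sc.parallelize([(1, (24, 7))])
Rdd1.join(Rdd2).collect()
[(1, ('Nicolas', (24, 7)))]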
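
To make the matching rule easier to see without a cluster, here is a minimal plain-Python sketch of inner-join semantics on lists of (key, value) pairs. It is only an illustration, not how Spark implements join: the inner_join helper and the sample data (the 'Anna' row and the extra dates) are hypothetical, and Spark does not guarantee the order of the collected result.

```python
from collections import defaultdict

def inner_join(left, right):
    """Plain-Python sketch of inner-join semantics on (key, value) pairs."""
    # Group the right-hand values by key so each lookup is cheap.
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    # Emit one (key, (left_value, right_value)) pair per matching combination.
    # Keys present on only one side produce nothing.
    return [(key, (lv, rv))
            for key, lv in left
            for rv in right_by_key.get(key, [])]

# Hypothetical sample data: Id 2 has no date, Id 3 has no name.
names = [(1, 'Nicolas'), (2, 'Anna')]
dates = [(1, (24, 7)), (1, (3, 12)), (3, (1, 1))]

result = inner_join(names, dates)
print(result)
```

In this sketch Id 1 appears twice in the output because it has two dates, while Ids 2 and 3 are dropped because each appears on one side only.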






