RDD.collect()
Return a list that contains all the elements in this RDD.
New in version 0.7.0.
Returns
list
a list containing all the elements
See also
RDD.toLocalIterator()
pyspark.sql.DataFrame.collect()
Notes
This method should only be used if the resulting list is expected to be small, because all the data is loaded into the driver’s memory.
Examples
>>> sc.range(5).collect()
[0, 1, 2, 3, 4]
>>> sc.parallelize(["x", "y", "z"]).collect()
['x', 'y', 'z']