What is the difference between Hadoop and Spark?
A simple answer:
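Hadoop implements the MapReduce programming model, which spreads work across a cluster of machines (virtual or physical), and it also provides distributed storage (HDFS) for the data. Spark is a separate processing engine that can run on top of a Hadoop cluster but does not use Hadoop's MapReduce: it keeps intermediate data in memory where possible and ships with libraries for many tasks (for example, MLlib provides machine learning and statistical functions).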
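To make the MapReduce idea concrete, here is a minimal single-machine sketch in plain Python of the three steps (map, shuffle, reduce) applied to a word count. It does not use the Hadoop or Spark APIs, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for illustration. On a real cluster, each step would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values of each key into a single result (here, a sum)."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases on one machine (illustrative helper)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


counts = word_count(["spark and hadoop", "hadoop scales out"])
print(counts)
```

Roughly speaking, Hadoop MapReduce writes the output of each step to disk before the next one starts, while Spark tries to keep such intermediate results in memory, which is one reason it is often faster for jobs that make several passes over the same data.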