For those wanting to work with Big Data, it isn't enough to simply know a programming language and a small-scale library. Once your data reaches many gigabytes, if not terabytes, in size, working with it becomes cumbersome: a single computer can only run so fast and store so much. At that point, you'll want to look into tooling built for massive amounts of data. One of the tools worth considering is Apache Spark. In this post, we'll look at what Spark is, what we can do with it, and why you might use it.
by Joseph Woolf