Surendra Pratap Singh

Consistent Hashing? What the heck is that…..

Hashing is one of the main concepts we are introduced to when we start out as programmers. Be it 'data structures' or the simple 'object' notion, hashing has a role to play everywhere. But when it comes to Big Data, like everything else, the hashing mechanism is also exposed to some challenges which we generally don’t...

by Surendra Pratap Singh

27-Feb-2015

Big Data

Spark 1O3 – Understanding Spark Internals

In this post, I will present a technical “deep-dive” into Spark internals, including RDD and Shared Variables. If you want to know more about Spark and Spark setup on a single node, please refer to the previous posts of the Spark series, including Spark 1O1 and Spark 1O2. Resilient Distributed Datasets (RDD) - An RDD is the primary abstraction...

by Surendra Pratap Singh

13-Feb-2015

Big Data

Prediction Analysis using Knime

Prediction Analysis is the practice of extracting information from existing data sets in order to determine patterns and predict future outcomes and trends. There are various analytic and machine learning tools available in the market for predictive analysis. This post includes an introduction to Knime followed by a sample use case of...

by Surendra Pratap Singh

05-Feb-2015

Big Data

Spark 1o2 – “Hello World”

This is the second blog of the Spark series. This blog post includes the setup of a Spark environment followed by a small word count program. The idea behind the blog is to get hands-on with Spark setup and with running a simple program on Spark. If you want to know more about Spark history and its comparison with Hadoop, please refer to Spark 1o1. ...

by Surendra Pratap Singh

21-Jan-2015

Blogs

Tips for writing a blog

Learn how to write a caption