Digital Creed

‘Spark is becoming more important than Hadoop’

Spark can run in the Hadoop ecosystem, or it can run in its own stand-alone environment. Over 25% of Spark projects today run outside of Hadoop, and the percentage is rising. Moshe Kranc, Chief Technology Officer, Ness Software Engineering Services (SES), talks about the big data trends for 2016 and Spark’s role relative to Hadoop. The views and opinions expressed in this article are entirely those of Moshe Kranc.

— Vinita Gupta Malu

Q. What are the big data trends that you observe in 2016?

Moshe: I’ve observed the following trends:

 

Q. What, in your view, will be the big data challenges in 2016, and how can they be tackled?

Moshe: If there is one common thread that links 95% of Ness’s big data customers, it is FUD: fear, uncertainty and doubt. The field is full of competing or overlapping products, each of which claims to be the Holy Grail for big data. The big boys (Oracle, IBM, SAP, Teradata, etc.) all want to steer you towards their (costly) offerings. The upstarts (Cloudera, Hortonworks, Databricks, DataStax, etc.) all talk about use cases with dramatic savings, but do not tell you about the use cases where their products fail miserably. Depending on the vertical, there are dozens of one-stop shops that offer to take your data and come back with vertical-specific insights.

Left to evaluate this cacophony of conflicting voices is your organization, with little experience in the big data minefield. No wonder we read so often about big data failures. The real culprit is not Hadoop – the real culprit is the silo approach that drove the company to make a choice without the benefit of outside advice or experience. The only way to cut through the hype around a product is to try it yourself, and/or talk to someone you trust who is using it. Another option is to partner with a company like Ness Software Engineering Services that has seen a broad range of big data projects and technologies, and has a proven track record of success.

Q. Will Spark overtake Hadoop?

Moshe: Hadoop as a concept revolutionized the world of data processing and ushered in the era of big data. But Hadoop as a product ecosystem is certainly showing its age, and for many use cases it has been upstaged by more modern technologies. Therefore, Hadoop is not necessarily the safe choice for your big data use case. In the long run, Hadoop may lose out to newer products like Spark and Cassandra, which had the benefit of learning from Hadoop’s growing pains.

Q. What are the best practices for migrating a project from Hadoop to Spark?

————————————————————————————————————————
