Technology — 09 May 2016

Tying Together the Disjointed Big Data Development Process

Big data has moved beyond hype into production. What IT has not yet figured out is how to manage the big data development team and infrastructure efficiently. This mismatch drains productivity and keeps organizations from reaping the benefits of big data analysis.

Hydrosphere addresses this problem in the same way that Agile, DevOps, Docker, and Ansible addressed it for traditional development projects. It builds out, tests, and configures the entire big data platform and analytics code with one overarching tool. This tears down the wall between big data architects and data scientists. For example, it lets data scientists work in a sandbox on live Spark data sets instead of spreadsheet subsets and facsimiles. It lets them write and test Scala and Python code and release it into the build using the blue-green approach. This code is carried along with the virtual machines, networks, storage, and clusters as the big data engineer lays them down. This approach imposes on the entire team the discipline of writing code and abstracting everything so that it can be built in an automated fashion. The end result is continuous delivery for analytics.
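For readers unfamiliar with the blue-green approach mentioned above, the following is a minimal sketch of the general idea, not Hydrosphere's implementation or API. Two environments exist side by side; a new version is deployed to the idle one and becomes live only if a health check passes. The class name `BlueGreenRouter`, its methods, and the version strings are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BlueGreenRouter:
    """Hypothetical router that tracks which of two environments is live."""

    # Version currently deployed in each environment (empty string = nothing deployed).
    environments: Dict[str, str] = field(
        default_factory=lambda: {"blue": "", "green": ""}
    )
    live: str = "blue"

    @property
    def idle(self) -> str:
        """Name of the environment that is not serving traffic."""
        return "green" if self.live == "blue" else "blue"

    def release(self, version: str, health_check: Callable[[str], bool]) -> bool:
        """Deploy to the idle environment; switch traffic only if it is healthy."""
        target = self.idle
        self.environments[target] = version
        if not health_check(version):
            return False  # live environment is left untouched
        self.live = target
        return True

    def rollback(self) -> None:
        """Point traffic back at the other environment."""
        self.live = self.idle

    def live_version(self) -> str:
        """Version currently serving traffic."""
        return self.environments[self.live]


if __name__ == "__main__":
    router = BlueGreenRouter()
    router.environments["blue"] = "analytics-job-v1"  # assumed starting state
    switched = router.release("analytics-job-v2", health_check=lambda v: True)
    print(switched, router.live, router.live_version())
```

The point of the pattern is that a failed release never disturbs the live environment, and a bad release that slips through can be undone by switching traffic back.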

Big data projects have built-in complications that make them more challenging to manage than other types of software development. The software on which they are based is new and therefore changes rapidly. The very idea of streaming, unstructured data is hard to grasp for traditional row-and-column Java programmers. The esoteric algorithms, math, and statistics that power analytics are incomprehensible to most people other than data scientists.

Hydrosphere rises to this challenge by giving the project team automated tools that bring its many players together. The software delivers measurable improvements in the build and release cycle. It reduces the time it takes to build the ecosystem by up to a factor of 10. It shortens release cycles by a factor of 5 by allowing continuous integration and delivery. In sum, it cuts the time it takes for product managers, product planners, and data scientists to turn ideas they have sketched out on the whiteboard into a working solution. That’s the message that business managers want to hear.

About Hydrosphere

Hydrosphere is a Silicon Valley-based startup. Founded in 2016, Hydrosphere is the brainchild of big data architects, scientists, programmers, and startup veterans who understand firsthand the difficulties faced by big data projects and have committed to developing tools to fix them.


For further information please contact:

Stepan Pushkarev, Co-founder and CTO

125 University Avenue
Suite 290
Palo Alto, California 94301-1280


