As the world becomes more automated thanks to artificial intelligence and machine learning, data collection and analysis are more complex than ever before. At SequenceIQ we have a clear roadmap for answering the growing big data problems organizations face.

For organizations that want to combine web, mobile, spatial, sensor and other historical or streaming-based large data sets and discover relationships in massive amounts of multidimensional data, we offer an easy and intuitive solution with our services and product stack.


Cloudbreak is our open source, cost-effective way to start and run multiple instances and versions of Hadoop clusters in the cloud, in Docker containers or on bare metal. It is a cloud- and infrastructure-agnostic Hadoop as-a-Service API that provides automatic scaling, secure multi-tenancy and full cloud lifecycle management.

With Cloudbreak you are one click away from your on-demand Hadoop cluster.

Read the documentation or start using Cloudbreak now.


Cloudbreak lets you choose your preferred cloud provider and pricing model. The API translates each call into the corresponding calls for the different cloud vendors: one common API, provision everywhere.

Elastic and scalable

The Cloudbreak API can provision an arbitrary number of Hadoop nodes. The API does the hard work: it spins up the cluster and configures the networks and the selected Hadoop services. As the workload changes, the API allows you to add or remove nodes on the fly.


Blueprints are your Hadoop cluster stack definition and component layout. We support multiple blueprints of different Hadoop distributions.

Note: Cloudbreak is cloud and Hadoop provider agnostic. We encourage our customers to deploy the platform on their preferred cloud provider. We believe that vendor lock-in stifles innovation, so we support different Hadoop distributions: Apache Hadoop and Hortonworks Data Platform. Cloudbreak is HDP Certified and YARN Ready.

For technical details of Cloudbreak, the API documentation and code examples, please visit our API.


Built on YARN features, cloud and VM resource management and cluster metrics, Periscope lets you associate SLA policies with applications and brings QoS to a multi-tenant Hadoop YARN cluster.

Once connected to a cluster (no pre-installation is required), it aggregates the available resources and automatically enforces the applied SLA policies.

Read the documentation or start using Periscope now.

Auto scaling

Periscope lets you re-prioritize running jobs, allocate resources to applications and autoscale clusters. It supports multiple cloud providers through Cloudbreak.

YARN ready

Periscope monitors the application progress, the number of YARN containers/resources and their allocation, queue depths and the number of nodes and their health.

SLA policies

Periscope lets you configure SLA policies such as cluster upscaling and downscaling, application re-ordering and predicted execution time. There is no need for up-front capacity planning.
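To make the idea of a scaling policy more concrete, here is a minimal sketch in Python. It is not the Periscope API: the `ScalingPolicy` class, its fields and the `desired_node_count` function are hypothetical names chosen for illustration, and the rule itself (scale on the number of pending containers, within fixed bounds) is only an assumed example of what such a policy could look like.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical SLA policy keyed on pending YARN containers."""
    scale_up_above: int     # add nodes when pending containers exceed this
    scale_down_at: int      # remove nodes when pending containers are at or below this
    step: int               # nodes added or removed per decision
    min_nodes: int          # never shrink the cluster below this size
    max_nodes: int          # never grow the cluster above this size


def desired_node_count(policy: ScalingPolicy, current_nodes: int,
                       pending_containers: int) -> int:
    """Return the node count the policy asks for, clamped to its bounds."""
    if pending_containers > policy.scale_up_above:
        target = current_nodes + policy.step
    elif pending_containers <= policy.scale_down_at:
        target = current_nodes - policy.step
    else:
        target = current_nodes
    return max(policy.min_nodes, min(policy.max_nodes, target))


policy = ScalingPolicy(scale_up_above=20, scale_down_at=0,
                       step=2, min_nodes=3, max_nodes=10)
busy_target = desired_node_count(policy, current_nodes=4, pending_containers=50)
idle_target = desired_node_count(policy, current_nodes=4, pending_containers=0)
```

In this sketch a busy four-node cluster is asked to grow by one step, while an idle one shrinks only as far as the configured minimum allows.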

For technical details of Periscope, the API documentation and code examples, please visit our API.


The idea behind Banzai Pipeline is that big data and actionable insights are not limited to expert organizations and departments but can be provided to all parts of a business in a self-service manner. Big Data development has never been easier.

Banzai Pipeline is a RESTful application development platform for building big data applications running on Hadoop YARN. The API allows developers to build data and job pipelines without any knowledge of the underlying big data technologies. It abstracts the frameworks and supports simple, secure execution of the steps in a pipeline, either sequentially or in parallel. Pipelines can be scheduled, reused as templates and run in batch or streaming mode.

Data and Job Pipeline

Build, reuse, link and run data and job pipelines using pre-developed building blocks (ML algorithms, batch and streaming jobs, ETL, data patterns). A pipeline is a collection of procedural steps, interactions, inputs and outputs: the steps needed to describe a big data business process. The platform supports multiple running pipeline instances, A/B testing, training and evaluation. Pipeline instances can be scheduled, triggered by external events or started manually.
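The notion of a pipeline as a series of steps, some of which run in parallel, can be sketched in a few lines of Python. This is an illustration only and does not reflect the Banzai Pipeline API: `Parallel`, `run_pipeline` and the three example steps are hypothetical names, and the steps run in local threads rather than on a Hadoop cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List

Step = Callable[[Any], Any]


class Parallel:
    """Hypothetical stage that runs several steps on the same input concurrently."""

    def __init__(self, *steps: Step) -> None:
        self.steps = steps

    def __call__(self, data: Any) -> List[Any]:
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(step, data) for step in self.steps]
            # Results are collected in the order the steps were declared.
            return [future.result() for future in futures]


def run_pipeline(stages: List[Step], data: Any) -> Any:
    """Run stages one after another, feeding each output into the next stage."""
    for stage in stages:
        data = stage(data)
    return data


def clean(records: List[str]) -> List[str]:
    """Trim whitespace, lowercase, and drop empty records."""
    return [r.strip().lower() for r in records if r.strip()]


def count(records: List[str]) -> int:
    return len(records)


def distinct(records: List[str]) -> List[str]:
    return sorted(set(records))


# One sequential step followed by two steps executed in parallel.
result = run_pipeline([clean, Parallel(count, distinct)], [" A", "b ", "", "a"])
```

Because each stage is just a callable, the same stage list can be reused as a template and run against different inputs.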

Actionable Insights

Set up rules, associate actions with individual rules and raise notifications on different output channels. As data streams through the platform, it is evaluated against the preconfigured rule and action pairs, and the platform reacts at millisecond latency. Gaining insights and acting on rules triggered by big data becomes straightforward, even at large scale.
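A rule and action pair can be pictured as a condition plus a callback that is evaluated for every event in the stream. The sketch below is an assumption-laden illustration in Python, not the platform's actual interface: `Rule`, `process_stream` and the sensor events are all hypothetical, and the "notification channel" is simply an in-memory list.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List

Event = Dict[str, Any]


@dataclass
class Rule:
    """Hypothetical rule: when condition(event) is true, action(event) is called."""
    name: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], None]


def process_stream(events: Iterable[Event], rules: List[Rule]) -> int:
    """Check every event against every rule and return how many actions fired."""
    fired = 0
    for event in events:
        for rule in rules:
            if rule.condition(event):
                rule.action(event)
                fired += 1
    return fired


alerts: List[str] = []  # stands in for a notification channel

rules = [
    Rule(
        name="overheat",
        condition=lambda e: e.get("temperature", 0) > 90,
        action=lambda e: alerts.append(f"{e['sensor']} overheating"),
    )
]

events = [
    {"sensor": "s1", "temperature": 95},
    {"sensor": "s2", "temperature": 40},
]
fired = process_stream(events, rules)
```

Here only the first event exceeds the threshold, so a single alert is recorded.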

Data Connectors

We support and build a large set of data connectors from open source standards to proprietary ones, and provide an SDK to build custom connectors.

For technical details of Banzai Pipeline, the API documentation and code examples, please visit our API.