Mesos Introduction

Mesos support is part of TECHNICAL PREVIEW. It may not be suitable for production use.

At a high level, Cloudbreak deployment on Mesos is similar to Cloudbreak implementations on other cloud providers: HDP clusters are provisioned through Ambari with the help of blueprints, and Ambari server and agents run in Docker containers. However, there are a few major differences that you should consider before you start working with Cloudbreak on Mesos.

Differences with Other Cloud Provider Implementations

The Mesos integration doesn't start new instances and doesn't build new infrastructure on a cloud provider.

Cloudbreak expects a "bring your own Mesos" infrastructure, which means that you have to deploy Mesos first and then configure access to the existing Mesos deployment in Cloudbreak.

On other cloud providers, Cloudbreak first builds the infrastructure where Hadoop components are later deployed through Ambari. This involves creating or reusing the networking layer (virtual networks, subnets, and so on), provisioning new virtual machines in these networks from pre-existing cloud images, and starting docker containers on these VMs (nodes). However, the Mesos integration was designed not to include these steps, because in most cases users already have their own Mesos infrastructure on which they would like to deploy their cluster.

A Mesos credential in the Cloudbreak UI provides access to the Marathon API.

Marathon is a standard application scheduling framework for services in Mesos. Cloudbreak uses Marathon API to communicate with Mesos, so you need to deploy Marathon on the Mesos cluster and, when setting up access to Mesos in Cloudbreak, specify a Marathon API endpoint. Basic authentication and TLS on the Marathon API are not supported in the technical preview.
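
Before entering the endpoint in Cloudbreak, it can help to confirm that the Marathon API answers over plain HTTP. The following is a minimal sketch, assuming curl is installed and that your Marathon version exposes its /v2/info resource; the function names and the example address are illustrative and not part of Cloudbreak.

```shell
# Build the URL of Marathon's info resource from a base endpoint.
marathon_info_url() {
    # ${1%/} drops one trailing slash so the path is not doubled.
    printf '%s/v2/info\n' "${1%/}"
}

# Succeeds only if the endpoint answers with an HTTP success status.
check_marathon() {
    curl -fsS --max-time 5 "$(marathon_info_url "$1")" > /dev/null
}

# Usage (example address, replace with your own):
#   check_marathon http://172.16.252.31:8080 && echo "Marathon API reachable"
```

If the check fails from the machine where Cloudbreak runs, Cloudbreak will most likely not be able to use the endpoint either.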

A Mesos template in the Cloudbreak UI means resource constraints instead of new resources.

Through the Cloudbreak UI, you can create Cloudbreak templates that define the virtual machines in a cluster's hostgroup that will be provisioned through the cloud provider API. You can create such templates for Mesos and you can link the VMs to a hostgroup, but for Mesos these templates mean resource constraints that will be requested through the Marathon API, not resources that will be created.

For example, consider these two scenarios:
- An AWS template with an instance type of m4.large and four attached 50 GB magnetic volumes will create a VM with these specs when Cloudbreak builds the cluster infrastructure.
- A Mesos template with 2 CPU cores and 4 GB of memory means that Cloudbreak will ask the Marathon API to schedule the Ambari container on a node where these resources are available.
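
To make the Mesos scenario concrete, the sketch below prints a Marathon app definition that asks for 2 CPUs and 4096 MB of memory with host networking. It only illustrates what a resource constraint looks like at the Marathon level. It is not the payload Cloudbreak actually sends, the app id and image name are hypothetical placeholders, and the field layout assumes a Marathon release that accepts the network setting inside the docker section.

```shell
# Print a Marathon app definition with the given resource constraints.
# $1 = app id, $2 = CPU cores, $3 = memory in MB, $4 = Docker image
render_app_json() {
    cat <<EOF
{
  "id": "$1",
  "cpus": $2,
  "mem": $3,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": { "image": "$4", "network": "HOST" }
  }
}
EOF
}

# Usage (hypothetical id and image; MARATHON is your API endpoint):
#   render_app_json /ambari-server 2 4096 example/ambari:latest |
#     curl -fsS -X POST -H 'Content-Type: application/json' -d @- "$MARATHON/v2/apps"
```

Nothing is created by such a request. Marathon only launches the task on a node that can supply the requested CPU and memory.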

On Mesos, Cloudbreak doesn't start a gateway instance.

On other cloud providers, Cloudbreak deploys a gateway VM for every new cluster. The gateway VM runs a few containers, such as Ambari server, and, most importantly, it runs an Nginx server. All communication between Cloudbreak Deployer and a cluster deployed by Cloudbreak happens through this Nginx instance. This is done through a two-way TLS channel where the Nginx server is responsible for the TLS termination. Communication inside the cluster (for example, between Ambari server and agents) is not encrypted, but all communication from outside is secure. This allows Cloudbreak to be deployed outside of the private network of the cluster. The Mesos integration doesn't have a solution like this, so all communication between Cloudbreak and the Mesos cluster happens through an unencrypted channel. For this reason, on Mesos, Cloudbreak should be deployed inside the same private network (or in the same Mesos cluster) where the clusters will be deployed.

Limitations of the Technical Preview

No support for Consul or other custom DNS solution.

Unlike on other cloud providers where Cloudbreak uses Consul, on Mesos, Cloudbreak does not provide a custom DNS solution. In this technical preview, containers are deployed with net=host, so Mesos nodes must be set up manually to resolve hostnames to IP addresses and, through reverse DNS, IP addresses back to hostnames. To do this manually, add the required entries to the /etc/hosts file on each node in the cluster.

For example, consider this scenario:
- There are five nodes in the Mesos cluster: node1, node2, node3, node4, and node5, with IP addresses ranging from 10.0.0.1 to 10.0.0.5.
- The /etc/hosts file on node1 should contain these entries that match IP addresses with hostnames:

    10.0.0.2 node2
    10.0.0.3 node3
    10.0.0.4 node4
    10.0.0.5 node5
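
The entries above can also be generated instead of typed by hand. This is a small sketch that assumes the naming and addressing scheme of the example (nodeN at 10.0.0.N); the function name is illustrative.

```shell
# Print /etc/hosts entries for every node except the local one.
# $1 = index of the local node, $2 = total number of nodes
hosts_entries() {
    i=1
    while [ "$i" -le "$2" ]; do
        # Skip the node the entries are generated for.
        if [ "$i" -ne "$1" ]; then
            printf '10.0.0.%d node%d\n' "$i" "$i"
        fi
        i=$((i + 1))
    done
}

# Usage on node1 of a five-node cluster (appends to the hosts file):
#   hosts_entries 1 5 | sudo tee -a /etc/hosts
```

Running the function with a different first argument on each node gives every node the entries for its peers.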

Cloudbreak must be able to resolve the addresses of the Mesos slaves.

To make API requests (for example, to create a cluster), Cloudbreak must communicate with the Ambari server deployed in the Mesos cluster. After Cloudbreak instructs Marathon to deploy the Ambari server container somewhere in the Mesos cluster, it asks Marathon for the address of the node where the Ambari server was deployed and then tries to communicate with the Ambari server through that address. For example, consider the following scenario, where the Mesos cluster has five registered nodes: node1, node2, node3, node4, and node5:

Considering that there is no gateway node and so the communication between Cloudbreak and the clusters is unencrypted, we suggest that you deploy Cloudbreak in the same private network as the clusters. If Cloudbreak is not in the same network as the clusters, add the addresses with a reachable IP to the /etc/hosts file on the machine where Cloudbreak is deployed.

Storage management needs to be improved

This is one of the two biggest limitations of the current Mesos integration. The current integration doesn't offer volume management, which means that data is stored inside Docker containers. This solution has a few problems that will be addressed in future releases:

IP-per-task is not supported yet

The second big limitation of the current Mesos integration is the lack of IP-per-task support. IP-per-task means that every task of an app (all the containers) deployed through Marathon will get their own network interface and an IP address. This feature is already available in Marathon but does not work in combination with Docker containers. In our current Mesos integration, containers are deployed with net=host, which means that to avoid port collisions only one container can be deployed per Mesos host; this is the case even with multiple clusters.

Recipes are not supported

Recipes (script extensions to an HDP cluster installation, supported by Cloudbreak) are not supported in the Mesos integration. Recipes are dependent on Consul's HTTP API, and the Mesos integration does not support Consul.

Cloudbreak Deployer

Before configuring Cloudbreak Deployer, you should know that:

Cloudbreak Deployer Installation

Install Cloudbreak Deployer

First, install the Cloudbreak Deployer manually on a VM inside your Mesos cluster's private network.

If you have your own installed VM, check the Initialize your Profile section below before starting the provisioning.

Open the cloudbreak-deployment directory:

    cd cloudbreak-deployment

This directory contains configuration files and the supporting binaries for Cloudbreak Deployer.

Initialize your Profile

Next, initialize Cloudbreak Deployer by creating a Profile file with the following content:

    export UAA_DEFAULT_SECRET='[SECRET]'
    export UAA_DEFAULT_USER_PW='[PASSWORD]'
    export PUBLIC_IP='[PUBLIC_IP]'

The PUBLIC_IP is mandatory, because it is used to access the Cloudbreak UI.
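
Before starting the deployer, you may want to confirm that the Profile really defines all three values. The following is a minimal sketch of such a check; the function name is hypothetical and the check is not part of cbd.

```shell
# Succeed only if the given Profile sets all three variables to non-empty values.
# $1 = path to the Profile (use ./Profile, not Profile, for a file in the current directory)
check_profile() {
    (
        # Run in a subshell so the caller's environment is left untouched.
        unset UAA_DEFAULT_SECRET UAA_DEFAULT_USER_PW PUBLIC_IP
        . "$1" || exit 1
        [ -n "${UAA_DEFAULT_SECRET:-}" ] &&
            [ -n "${UAA_DEFAULT_USER_PW:-}" ] &&
            [ -n "${PUBLIC_IP:-}" ]
    )
}

# Usage:
#   check_profile ./Profile || echo "Profile is incomplete" >&2
```

The check only tests that the values are present. It cannot tell whether the placeholders were replaced with real values.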

Start Cloudbreak Deployer

To start the Cloudbreak application, use the following command:

    cbd start

This will start all the Docker containers and initialize the application.

The first time you start the Cloudbreak app, the process will take longer than usual because all the necessary Docker images have to be downloaded.

The cbd start command includes the cbd generate command which applies the following steps:

Validate that Cloudbreak Deployer Has Started

After the cbd start command finishes, check the following:

   cbd doctor

If you need to run cbd update, refer to Cloudbreak Deployer Update. Most of the cbd commands require root permissions.

   cbd logs cloudbreak

You should see a message like this in the log: Started CloudbreakApplication in 36.823 seconds. Cloudbreak normally takes less than a minute to start.
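
The log check can be scripted. This sketch only assumes the message text quoted above; the function name is illustrative.

```shell
# Read log text on stdin and succeed once the startup message appears.
cloudbreak_started() {
    grep -q 'Started CloudbreakApplication in'
}

# Usage:
#   cbd logs cloudbreak | cloudbreak_started && echo "Cloudbreak is up"
```

If the log command keeps streaming output, the pipeline waits until the message shows up, so you may want to interrupt it manually if startup takes much longer than a minute.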

Provisioning Prerequisites

A working Mesos cluster with Marathon

Provisioning a new Mesos cluster is outside the scope of Cloudbreak, so it needs an existing, working Mesos cluster on which it can start HDP clusters. Marathon must also be installed, because Cloudbreak uses its API to schedule Docker containers.

Hostnames must be resolvable inside the Mesos cluster and also by Cloudbreak

Cloudbreak does not deploy a custom DNS solution as it does on other cloud providers, where Consul is used to provide addresses for every node. Containers are deployed with net=host, and Mesos nodes must be set up manually so that they can resolve each other's hostnames to IP addresses and, through reverse DNS, IP addresses back to hostnames. This is a requirement of Hadoop. It is usually accomplished by setting up the /etc/hosts file on each node in the cluster, but it can also be provided by some DNS servers, such as Amazon's default DNS server in a virtual network.

Example:

    10.0.0.2 node2
    10.0.0.3 node3
    10.0.0.4 node4
    10.0.0.5 node5
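
Whichever mechanism you use, both directions of the lookup can be verified on each node. The following is a minimal sketch, assuming a Linux system where getent is available; the function names are illustrative.

```shell
# Resolver used by check_pair; getent consults /etc/hosts and DNS as configured on the node.
lookup() {
    getent hosts "$1"
}

# Succeed only if the name resolves to the IP and the IP resolves back to the name.
# $1 = expected IP address, $2 = hostname
check_pair() {
    ip=$1
    name=$2
    # Forward: the first field of the answer must be the expected IP.
    fwd=$(lookup "$name" | awk 'NR == 1 { print $1 }')
    [ "$fwd" = "$ip" ] || { echo "forward lookup failed: $name" >&2; return 1; }
    # Reverse: the expected name must appear among the names returned for the IP.
    lookup "$ip" |
        awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' ||
        { echo "reverse lookup failed: $ip" >&2; return 1; }
}

# Usage:
#   check_pair 10.0.0.2 node2 && echo "node2 resolves in both directions"
```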

Docker must be installed on Mesos slave nodes and Docker containerizer must be enabled

To be able to use the Docker containerizer, Docker must be installed on all the Mesos slave nodes. To install Docker, follow the instructions in the Docker documentation.

After Docker is installed, enable it for Mesos by adding the Docker containerizer to each Mesos slave configuration. To do so, add docker,mesos to the file /etc/mesos-slave/containerizers on each of the slave nodes (or start mesos-slave with the --containerizers=mesos,docker flag, or set the environment variable MESOS_CONTAINERIZERS="mesos,docker"). You may also want to increase the executor registration timeout to 10 minutes by adding 10mins to /etc/mesos-slave/executor_registration_timeout, which allows time for pulling large Docker images.
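
The file-based variant of this configuration can be sketched as follows. The function name is illustrative, and the directory is passed as a parameter so that the function can be tried against a scratch directory first.

```shell
# Write the Docker containerizer settings into a mesos-slave config directory.
# $1 = config directory (/etc/mesos-slave on the slave nodes)
configure_slave() {
    mkdir -p "$1" || return 1
    # Enable both containerizers, using the value given above.
    printf 'docker,mesos\n' > "$1/containerizers" || return 1
    # Give the executor up to 10 minutes to register while images are pulled.
    printf '10mins\n' > "$1/executor_registration_timeout"
}

# Usage (as root on each slave node):
#   configure_slave /etc/mesos-slave
```

The mesos-slave process has to be restarted afterwards to pick up the new settings. How to restart it depends on how Mesos was installed on your nodes.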

Provisioning via Browser

You can log into the Cloudbreak application at https://<PUBLIC_IP>.

The main goal of the Cloudbreak UI is to make it easy to create clusters on your own cloud provider or on your existing Mesos cluster. This section describes the Mesos setup. If you'd like to use a different cloud provider, check out its manual.

This document explains the four steps that need to be followed to create Cloudbreak clusters from the UI:

IMPORTANT Make sure that you have sufficient quota (CPU, memory) in your Mesos cluster for the requested cluster size.

Setting up Marathon credentials

Cloudbreak connects to your Marathon API through so-called Credentials and then uses the API to schedule containers on your Mesos cluster. Credentials can be configured on the manage credentials panel of the Cloudbreak Dashboard.

To create a new Marathon credential, follow these steps:

  1. Fill out the new credential Name
    • Only lowercase alphanumeric characters are allowed (min 5, max 100 characters)
  2. Add an optional description
  3. Specify the endpoint of your Marathon API in this format: http://<marathon-address>:<port>. Example: http://172.16.252.31:8080.

Public in account means that all the users belonging to your account will be able to use this credential to create clusters, but cannot delete it.

Authentication and HTTPS to a Marathon API are not yet supported by Cloudbreak.
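
The rules above can be checked before the form is submitted. This is a minimal sketch, not the validation Cloudbreak itself performs: the patterns are an interpretation of the rules as stated here (lowercase letters and digits only, 5 to 100 characters, plain http with an explicit port), and the function names are hypothetical.

```shell
# Succeed if the name consists of 5 to 100 lowercase letters or digits.
valid_credential_name() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{5,100}$'
}

# Succeed if the endpoint has the form http://<marathon-address>:<port>.
valid_marathon_endpoint() {
    printf '%s\n' "$1" | grep -Eq '^http://[^/:]+:[0-9]+$'
}

# Usage:
#   valid_credential_name marathon1 && valid_marathon_endpoint http://172.16.252.31:8080
```

An https:// endpoint is rejected on purpose, because HTTPS to the Marathon API is not supported.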

Resource constraints

After your Marathon API is linked to Cloudbreak you can start creating resource constraint templates that describe the resources requested through the Marathon API when starting an Ambari container.

When you create a resource constraint template, Cloudbreak does not make any requests to Marathon. Resources are only requested after the create cluster button is clicked and Cloudbreak starts to orchestrate containers. These templates are saved to Cloudbreak's database and can be reused with multiple clusters to describe the same resource constraints.

A typical setup is to combine multiple templates in a cluster for the different types of nodes. For example, you may want to request more memory for Spark nodes.

The resource constraint templates can be configured on the manage templates panel of the Cloudbreak Dashboard under the Mesos tab. You can specify the memory, CPU, and disk needed by the nodes in a hostgroup. If Public in account is checked, all the users belonging to your account will be able to use this resource to create clusters, but cannot delete it.

Defining Cluster Services

Blueprints

Blueprints are your declarative definition of a Hadoop cluster. These are the same blueprints that are used by Ambari.

You can use the 3 default blueprints predefined in Cloudbreak or create your own. Blueprints can be added from a file or a URL (an example blueprint), or the whole JSON can be written in the JSON text box.

The host groups in the JSON will be mapped to a set of instances when starting the cluster, and the services and components will be installed on the corresponding nodes. Blueprints can be modified later from the Ambari UI.

NOTE: It is not necessary to define all the configuration in the blueprint. If a configuration is missing, Ambari will fill that with a default value.

If Public in account is checked, all the users belonging to your account will be able to use this blueprint to create clusters, but cannot delete or modify it.


A blueprint can be exported from a running Ambari cluster and reused in Cloudbreak with slight modifications. There is no automatic way to modify an exported blueprint and make it instantly usable in Cloudbreak; the modifications have to be made manually. When a blueprint is exported, some configurations are hardcoded (for example, domain names and memory settings) and won't be applicable to the Cloudbreak cluster.
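
One way to obtain such an export is through the Ambari REST API, which can return a cluster's configuration in blueprint format. The following is a minimal sketch: the host, cluster name, user, and output file are hypothetical placeholders, and the function name is illustrative.

```shell
# Build the Ambari REST URL that returns a cluster's configuration as a blueprint.
# $1 = Ambari base URL, $2 = cluster name
blueprint_export_url() {
    # ${1%/} drops one trailing slash so the path is not doubled.
    printf '%s/api/v1/clusters/%s?format=blueprint\n' "${1%/}" "$2"
}

# Usage (curl prompts for the password of the given Ambari user):
#   curl -fsS -u admin "$(blueprint_export_url http://ambari.example.com:8080 mycluster)" > exported-blueprint.json
```

The exported file still needs the manual review described above before it is added to Cloudbreak.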

Cluster deployment

After all the cluster resources are configured you can deploy a new HDP cluster.

Here is a basic flow for cluster creation on Cloudbreak's Web UI:

Configure Cluster tab

Choose Blueprint tab

Review and Launch tab

You can check the progress on the Cloudbreak Web UI if you open the new cluster's Event History. It is available if you click on the cluster's name.

Advanced options

The following advanced options are available when deploying a new cluster:

Validate blueprint: This is selected by default. When it is selected, Cloudbreak validates the Ambari blueprint.

Custom Image: If you enable this, you can override the default image used for provisioning.

Config recommendation strategy: The strategy for how configuration recommendations will be applied. Recommended configurations are gathered from the response of the stack advisor.

Cluster termination

You can terminate running or stopped clusters with the terminate button in the cluster details.

IMPORTANT Always use Cloudbreak to terminate the cluster instead of deleting the containers through the Marathon API. Deleting them first would cause inconsistencies between Cloudbreak's database and the real state, and that could lead to errors.
