AWS Images

Pre-built cloud images for AWS are available with Cloudbreak Deployer pre-installed. To use one, go to your AWS Management Console and launch the latest Cloudbreak Deployer image in your region.

As an alternative to using the pre-built cloud images for AWS, you can install Cloudbreak Deployer on your own VM. For more information, see the installation instructions.

Prerequisites

Ports

Make sure that you have opened the following ports on your security group:

Cloudbreak Deployer AWS Image Details

VM Requirements

When selecting an instance type, consider these minimum and recommended requirements:

To learn about all requirements, see System Requirements.

Cloudbreak Deployer Setup on AWS

Before getting started with Cloudbreak Deployer, you should know that:

In the previous step, you should have already set up a VM with Cloudbreak Deployer, either by using the AWS cloud images or by installing Cloudbreak Deployer manually on your own VM.

Now you need to connect to the previously created cbd VM.

Cloudbreak Deployment Directory

To navigate to the cloudbreak-deployment directory, run:

cd /var/lib/cloudbreak-deployment/

This directory contains configuration files and the supporting binaries for Cloudbreak Deployer.

Initialize Your Profile

First, initialize cbd by creating a Profile file:

cbd init

This creates a Profile file in the current directory. Open the Profile file and check the PUBLIC_IP value. PUBLIC_IP is mandatory because it is used to access the Cloudbreak UI. In some cases the cbd tool can guess it. If cbd cannot determine the IP address during initialization, set the appropriate value yourself.
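As a minimal sketch of the PUBLIC_IP check, the Python helper below scans Profile text for export lines and reports whether PUBLIC_IP parses as an IP address. It assumes the Profile consists of `export NAME=value` lines, as in the examples in this document. The function names and the sample address are hypothetical and are not part of cbd.

```python
import ipaddress
import re

# Matches lines such as: export PUBLIC_IP=203.0.113.10
EXPORT_LINE = re.compile(r"^\s*export\s+([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def read_profile_exports(profile_text):
    """Collect `export NAME=value` lines from Profile text into a dict."""
    exports = {}
    for line in profile_text.splitlines():
        match = EXPORT_LINE.match(line)
        if match:
            name, value = match.groups()
            # Drop surrounding whitespace and optional quotes around the value.
            exports[name] = value.strip().strip("\"'")
    return exports


def public_ip_is_set(profile_text):
    """Return True when PUBLIC_IP is present and parses as an IP address."""
    value = read_profile_exports(profile_text).get("PUBLIC_IP", "")
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True
```

If the check fails, add or correct the PUBLIC_IP line in the Profile by hand before starting cbd.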

Start Cloudbreak Deployer

To start the Cloudbreak application use the following command:

cbd start

This will start all the Docker containers and initialize the application.

The first time you start the Cloudbreak app, the process will take longer than usual because all the necessary Docker images must be downloaded.

The cbd start command includes the cbd generate command which applies the following steps:

Validate that Cloudbreak Deployer Has Started

After the cbd start command finishes, check the following:

Configure Role-based Credentials

There are two ways to create AWS credentials in Cloudbreak:

Key-based: This requires your AWS access key and secret key pair. Cloudbreak will use these keys to launch the resources. For starters, this is a simpler option that does not require additional configuration. You will provide the keys later when you provision an HDP cluster.

Role-based: This requires a valid IAM role with "AssumeRole" policy. Cloudbreak will assume this role to get temporary access and the access/secret key pair.

To configure role-based credentials, start your instance with an "AssumeRole" policy. For more information, see Using Instance Profiles and Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances.

Alternatively, you can set the AWS keys of an IAM user with an "AssumeRole" policy in the Profile file:

export AWS_ACCESS_KEY_ID=AKIA**************W7SA
export AWS_SECRET_ACCESS_KEY=RWCT4Cs8******************/*skiOkWD

If you want to use an instance profile, do not set these variables. If you want to use Cloudbreak with Role ARNs instead of keys, make sure that the instance profile role can assume roles on AWS.

Optional Configurations

You can perform the following optional configurations:

Set Custom Tags

In order to differentiate launched instances, we give you the option to use custom tags on your AWS resources deployed by Cloudbreak. You can use the tagging mechanism with the following variables.

If you want just one custom tag on your CloudFormation resources, set this variable:

export CB_AWS_DEFAULT_CF_TAG=whatever

Then the name of the tag will be CloudbreakId and the value will be whatever.

If you need more specific tagging, set this variable:

export CB_AWS_CUSTOM_CF_TAGS=myveryspecifictag:veryspecific

Then the name of the tag will be myveryspecifictag and the value will be veryspecific. You can specify several tags as a comma-separated list, for example: tag1:value1,tag2:value2,tag3:value3.
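The name:value list format can be sketched in Python as follows. This is an illustrative helper pair, not part of Cloudbreak. It assumes that tag names and values are non-empty and contain neither colons nor commas, since the format shown above has no escaping.

```python
def format_custom_tags(tags):
    """Build a CB_AWS_CUSTOM_CF_TAGS value from a dict of tag names to values."""
    parts = []
    for name, value in tags.items():
        for text in (name, value):
            # Assumption: the separators cannot appear inside a name or value.
            if not text or ":" in text or "," in text:
                raise ValueError(f"unsupported tag text: {text!r}")
        parts.append(f"{name}:{value}")
    return ",".join(parts)


def parse_custom_tags(raw):
    """Split a name:value,name:value string back into a dict."""
    tags = {}
    for item in raw.split(","):
        name, separator, value = item.partition(":")
        if not separator or not name:
            raise ValueError(f"expected name:value, got {item!r}")
        tags[name] = value
    return tags
```

The result of format_custom_tags can be pasted after `export CB_AWS_CUSTOM_CF_TAGS=` in the Profile.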

Cluster Provisioning Prerequisites

IAM Role Setup

If you want to use your AWS access key and secret access key to authenticate to Amazon, use key-based authentication. In that case you do not need to set up an IAM role.

Cloudbreak works by connecting your AWS account through so-called Credentials, and then uses these credentials to create resources on your behalf.

IMPORTANT Cloudbreak deployment uses two different AWS accounts for two different purposes:

These accounts are usually the same when the end user is also the person who deployed the Cloudbreak server, but keeping them separate allows Cloudbreak to act as a SaaS project as well if needed.

Credentials use IAM Roles to give access to the third party to act on behalf of the end user without giving full access to your resources. This IAM Role will be assumed later by an IAM user.

AWS IAM Policy that grants permission to assume a role

You cannot assume a role with the root account, so you need to create an IAM user with an attached inline policy and then set its access key and secret access key in the Profile file (see Configure Role-based Credentials above).

The sts-assume-role IAM user policy must grant permission to assume roles on all resources. The following policy allows sts:AssumeRole on every resource:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1400068149000",
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
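To sanity-check a policy document before attaching it, a small Python sketch like the one below can confirm that some Allow statement grants sts:AssumeRole. It is a simplified illustration: it only inspects Effect and Action, and it deliberately ignores Resource, Condition, NotAction and explicit Deny statements. The function names are hypothetical.

```python
def _as_list(value):
    """IAM accepts a single string or a list here; normalize to a list."""
    return [value] if isinstance(value, (str, dict)) else list(value)


def allows_assume_role(policy):
    """Return True if any Allow statement lists sts:AssumeRole (or a wildcard)."""
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        # Action names are compared case-insensitively in this sketch.
        if any(a.lower() in ("sts:assumerole", "sts:*", "*") for a in actions):
            return True
    return False
```

Load the policy with json.load and pass the resulting dict to allows_assume_role.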

To connect your (end user) AWS account with a credential in Cloudbreak you'll have to create an IAM role on your AWS account that is configured to allow the third-party account to access and create resources on your behalf. The easiest way to do this is with cbd commands (but it can also be done manually from the AWS Console):

cbd aws generate-role  - Generates an AWS IAM role for Cloudbreak provisioning on AWS
cbd aws show-role      - Show assumers and policies for an AWS role
cbd aws delete-role    - Deletes an AWS IAM role, removes all inline policies

The generate-role command creates a role that is assumable by the Cloudbreak Deployer AWS account and has a broad policy setup. By default, the role is named cbreak-deployer. If you'd like to create a role with a different name, or multiple roles, add this line to your Profile:

export AWS_ROLE_NAME=my-cloudbreak-role

You can check the generated role on your AWS console, under IAM roles.

Generate a New SSH Key

All the instances created by Cloudbreak are configured to allow key-based SSH, so you'll need to provide an SSH public key that can be used later to SSH onto the instances in the clusters you'll create with Cloudbreak. You can use one of your existing keys or you can generate a new one.

To generate a new SSH keypair:

ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
# Creates a new ssh key, using the provided email as a label
# Generating public/private rsa key pair.
# Enter file in which to save the key (/Users/you/.ssh/id_rsa): [Press enter]
You'll be asked to enter a passphrase, but you can leave it empty.

# Enter passphrase (empty for no passphrase): [Type a passphrase]
# Enter same passphrase again: [Type passphrase again]

After you enter a passphrase, the keypair is generated. The output should look similar to the following:

# Your identification has been saved in /Users/you/.ssh/id_rsa.
# Your public key has been saved in /Users/you/.ssh/id_rsa.pub.
# The key fingerprint is:
# 01:0f:f4:3b:ca:85:sd:17:sd:7d:sd:68:9d:sd:a2:sd your_email@example.com

Later you'll need to pass the .pub file's contents to Cloudbreak and use the private part to SSH to the instances.
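Before pasting the .pub contents into Cloudbreak, it can help to confirm that the line really is an OpenSSH public key. The Python sketch below relies on the fact that the base64 field of such a key begins with a length-prefixed copy of the key type (for example ssh-rsa). The function name is hypothetical, and the check is structural only: it does not verify the key material itself.

```python
import base64
import binascii
import struct


def is_openssh_public_key(line):
    """Check the `type base64 [comment]` layout of an OpenSSH public key line."""
    parts = line.strip().split()
    if len(parts) < 2:
        return False
    key_type, encoded = parts[0], parts[1]
    try:
        expected = key_type.encode("ascii")
        blob = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError):
        return False
    if len(blob) < 4:
        return False
    # The blob starts with a big-endian uint32 length followed by the key type.
    (length,) = struct.unpack(">I", blob[:4])
    return blob[4:4 + length] == expected
```

A masked or truncated key, such as one with asterisks in the base64 part, is rejected by this check.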

Cluster Provisioning via Browser

You can log into the Cloudbreak application at https://<Public_IP>/.

The main goal of the Cloudbreak UI is to easily create clusters on your own cloud provider account. This description details the AWS setup - if you'd like to use a different cloud provider check out its manual.

This document explains the four steps that need to be followed to create Cloudbreak clusters from the UI:

IMPORTANT Make sure that you have sufficient quota (CPU, network, etc.) for the requested cluster size.

Setting up AWS Credentials

Cloudbreak works by connecting your AWS account through so-called Credentials, and then uses these credentials to create resources on your behalf. The credentials can be configured on the manage credentials panel on the Cloudbreak Dashboard.

To create a new AWS credential follow these steps:

  1. Select the credential type. For instance, select Role Based.
  2. Fill out the new credential Name.
    • Only lowercase alphanumeric characters are allowed (min 5, max 100 characters).
  3. Copy your AWS IAM role's Amazon Resource Name (ARN) to the IAM Role ARN field.
  4. Copy your SSH public key to the SSH public key field.
    • The SSH public key must be in OpenSSH format. Its matching private key can be used later to SSH onto every instance of every cluster you'll create with this credential.
    • The SSH username for the EC2 instances is cloudbreak.

Any other parameter is optional here.

Public in account means that all the users belonging to your account will be able to use this credential to create clusters, but cannot delete it.

Infrastructure Templates

After your AWS account is linked to Cloudbreak you can start creating resource templates that describe your clusters' infrastructure:

When you create one of the above resources, Cloudbreak does not make any requests to AWS. Resources are only created on AWS after the create cluster button has been pushed. These templates are saved to Cloudbreak's database and can be reused with multiple clusters to describe the infrastructure.

Templates

Templates describe the instances of your cluster - the instance type and the attached volumes. A typical setup is to combine multiple templates in a cluster for the different types of nodes. For example you may want to attach multiple large disks to the datanodes or have memory optimized instances for Spark nodes.

The instance templates can be configured on the manage templates panel on the Cloudbreak Dashboard.

There are some optional configurations here as well:

Networks

Your clusters can be created in their own Virtual Private Cloud (VPC) or in one of your already existing VPCs. If you choose an existing VPC it is possible to create a new subnet within the VPC or use an already existing one. The subnet's IP range must be defined in the Subnet (CIDR) field using the general CIDR notation.

Default AWS Network

If you don't want to create or use your custom VPC, you can use the default-aws-network for all your Cloudbreak clusters. It will create a new VPC with a 10.0.0.0/16 subnet every time a cluster is created.

Custom AWS Network

If you'd like to deploy a cluster to a custom VPC you'll have to create a new network template on the manage networks panel.

You have the following options:

You can configure the Subnet Identifier and the Internet Gateway Identifier (IGW) of your VPC.

IMPORTANT: Subnet CIDRs cannot overlap each other in a VPC, so you have to create a different network template for each cluster.

To create a new subnet within the VPC, provide the ID of the subnet which is in the existing VPC and your cluster will be launched into that subnet. For example you can create 3 different clusters with 3 different network templates for multiple subnets 10.0.0.0/24, 10.0.1.0/24, 10.0.2.0/24 with the same VPC and IGW identifiers.

IMPORTANT: Make sure the subnet defined here doesn't overlap with any of your already deployed subnets in the VPC, because the validation only happens after the cluster creation starts.

If you use an existing subnet, make sure you have enough room within your network space for the new instances.
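Because overlap validation only happens once cluster creation has started, it may be worth checking a planned CIDR locally first. The following Python sketch uses the standard ipaddress module. The function name is hypothetical, and the list of existing subnets is assumed to be gathered by hand from the AWS Console.

```python
import ipaddress


def find_overlaps(new_cidr, existing_cidrs):
    """Return the existing CIDR blocks that overlap with the planned one."""
    new_network = ipaddress.ip_network(new_cidr)
    return [
        cidr
        for cidr in existing_cidrs
        if new_network.overlaps(ipaddress.ip_network(cidr))
    ]
```

An empty result means the planned subnet does not collide with any of the listed subnets. For example, 10.0.1.0/24 does not overlap 10.0.0.0/24 or 10.0.2.0/24, whereas 10.0.0.128/25 overlaps 10.0.0.0/24.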

If Public in account is checked all the users belonging to your account will be able to use this network template to create clusters, but cannot delete it.

NOTE: The VPCs, IGWs and subnets are created on AWS only after the cluster provisioning starts with the selected network template.

Security groups

Security group templates are very similar to the security groups on the AWS Console. They describe the allowed inbound traffic to the instances in the cluster. Currently only one security group template can be selected for a Cloudbreak cluster and all the instances have a public IP address so all the instances in the cluster will belong to the same security group. This may change in a later release.

Default Security Group

You can also use the two pre-defined security groups in Cloudbreak.

only-ssh-and-ssl: all ports are locked down except for SSH and the selected Ambari Server HTTPS (you can't access Hadoop services outside of the VPC):

Custom Security Group

You can define your own security group by adding all the ports, protocols and CIDR ranges you'd like to use. The rules defined here don't need to contain the internal rules. Those are automatically added by Cloudbreak to the security group on AWS.

Hadoop services: Ambari (8080), Consul (8500), NN (50070), RM Web (8088), Scheduler (8030RM), IPC (8050RM), Job history server (19888), HBase master (60000), HBase master web (60010), HBase RS (16020), HBase RS info (60030), Falcon (15000), Storm (8744), Hive metastore (9083), Hive server (10000), Hive server HTTP (10001), Accumulo master (9999), Accumulo Tserver (9997), Atlas (21000), KNOX (8443), Oozie (11000), Spark HS (18080), NM Web (8042), Zeppelin WebSocket (9996), Zeppelin UI (9995), Kibana (3080), Elasticsearch (9200)

IMPORTANT Ports 443 and 22 need to be open in every security group, otherwise Cloudbreak won't be able to communicate with the provisioned cluster.
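The requirement on ports 22 and 443 can be checked against a planned rule set before it is entered in the UI. The Python sketch below assumes a hypothetical rule shape of (protocol, first_port, last_port) tuples. This is not Cloudbreak's internal format, only a convenient way to write the rules down.

```python
REQUIRED_PORTS = (22, 443)


def missing_required_ports(rules):
    """Return the required ports that no TCP rule in the list covers."""
    missing = []
    for port in REQUIRED_PORTS:
        covered = any(
            protocol == "tcp" and first <= port <= last
            for protocol, first, last in rules
        )
        if not covered:
            missing.append(port)
    return missing
```

For instance, a rule set that opens only SSH (22) and Ambari (8080) would be reported as missing port 443.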

Use existing security group

Use this kind of security group if you have an existing security group and you'd like to apply the same rules to each host group in a cluster.

If Public in account is checked all the users belonging to your account will be able to use this security group template to create clusters, but cannot delete it.

NOTE: The security groups are created on AWS only after the cluster provisioning starts with the selected security group template.

Defining Cluster Services

Blueprints

Blueprints are your declarative definition of a Hadoop cluster. These are the same blueprints that are used by Ambari.

You can use the 3 default blueprints pre-defined in Cloudbreak or you can create your own. Blueprints can be added from a file or a URL (an example blueprint), or the whole JSON can be written in the JSON text box.

The host groups in the JSON will be mapped to a set of instances when starting the cluster. Besides this the services and components will also be installed on the corresponding nodes. Blueprints can be modified later from the Ambari UI.

NOTE: It is not necessary to define all the configuration in the blueprint. If a configuration is missing, Ambari will fill that with a default value.

If Public in account is checked all the users belonging to your account will be able to use this blueprint to create clusters, but cannot delete or modify it.

A blueprint can be exported from a running Ambari cluster and reused in Cloudbreak with slight modifications. There is no automatic way to modify an exported blueprint and make it instantly usable in Cloudbreak. The modifications have to be done manually. When the blueprint is exported, some configurations are hardcoded (for example domain names and memory configurations) that won't be applicable to the Cloudbreak cluster.

Cluster Deployment

After all the cluster resources are configured you can deploy a new HDP cluster.

Here is a basic flow for cluster creation on Cloudbreak Web UI:

Configure Cluster tab

Setup Network and Security tab

Choose Blueprint tab

Review and Launch tab

Cloudbreak uses CloudFormation to create the resources - you can check out the resources created by Cloudbreak on the AWS Console CloudFormation page.

Besides these you can check the progress on the Cloudbreak Web UI itself if you open the new cluster's Event History.

Advanced Options

The following advanced options are available when deploying a new cluster:

Ambari Username: This user will be the admin user in Ambari. You can log in to the Ambari UI with this username.

Ambari Password: The password associated with the Ambari username. This password is also the default for all required passwords that are not specified in the blueprint, for example the Hive DB password.

Availability Zone: You can restrict the instances to a specific availability zone. This may be useful if you're using reserved instances.

Use dedicated instances: You can use dedicated instances on EC2.

Minimum cluster size: The provisioning strategy in case the cloud provider cannot allocate all the requested nodes.

Validate blueprint: This is selected by default. When selected, Cloudbreak validates the Ambari blueprint.

Custom Image: If you enable this, you can override the default image used for provisioning.

Config recommendation strategy: The strategy for applying configuration recommendations. The recommended configurations are gathered from the response of the stack advisor.

Instance Profile: The cluster will be able to communicate with the AWS API without any additional configuration.

Hostgroup Configuration: During host group configuration, different security groups can be used per host group.

Configure Ambari Database: If you have an existing database (such as RDS), you can reuse it.

Cluster Termination

You can terminate running or stopped clusters with the terminate button in the cluster details.

IMPORTANT: Always use Cloudbreak to terminate the cluster. If that fails for some reason, try to delete the CloudFormation stack first. Instances are started in an Auto Scaling Group so they may be restarted if you terminate an instance manually!

Sometimes Cloudbreak cannot synchronize its state with the cluster state at the cloud provider and the cluster can't be terminated. In this case the Forced termination option can help to terminate the cluster on the Cloudbreak side. If this happens:

  1. Check the related resources in AWS CloudFormation.
  2. If needed, manually remove the resources from there.

Interactive mode / Cloudbreak Shell

The goal with the Cloudbreak Shell (Cloudbreak CLI) was to provide an interactive command line tool which supports:

Start Cloudbreak Shell

To start the Cloudbreak CLI use the following commands:

   cd cloudbreak-deployment
   cbd start
   cbd util cloudbreak-shell

The first time, this will take a while because all the necessary Docker images need to be downloaded.

This launches the Cloudbreak shell inside a Docker container, after which it is ready to use.

IMPORTANT You have to copy all the files that you would like to use in the shell into the cbd working directory. For example, if your cbd working directory is ~/cloudbreak-deployment, copy your blueprint JSON, public SSH key file, etc. there. You can then refer to these files by name from the shell.

Autocomplete and Hints

Cloudbreak Shell helps you with hint messages from the very beginning, for example:

cloudbreak-shell>hint
Hint: Add a blueprint with the 'blueprint create' command or select an existing one with 'blueprint select'
cloudbreak-shell>

Beyond this you can use the autocompletion (double-TAB) as well:

cloudbreak-shell>credential create --
credential create --AWS          credential create --AZURE        credential create --EC2          credential create --GCP          credential create --OPENSTACK

Cluster Provisioning via CLI

Setting up AWS Credential

Cloudbreak works by connecting your AWS account through so-called Credentials, and then uses these credentials to create resources on your behalf. For example, credentials can be configured with the following command:

credential create --AWS --name my-aws-credential --description "sample description" --roleArn arn:aws:iam::***********:role/userrole --sshKeyString "ssh-rsa AAAAB****etc"

NOTE: Cloudbreak does not set your cloud user details. It works with the concept of IAM on Amazon (or the equivalent on other cloud providers). You should already have a valid IAM role. For further details, see IAM Role Setup above.

Alternatives to provide SSH Key:

You can check whether the credential was created successfully:

credential list

You can switch between your existing credentials:

credential select --name my-aws-credential

Infrastructure Templates

After your AWS account is linked to Cloudbreak you can start creating resource templates that describe your clusters' infrastructure:

When you create one of the above resources, Cloudbreak does not make any requests to AWS. Resources are only created on AWS after the cluster create command has been applied. These templates are saved to Cloudbreak's database and can be reused with multiple clusters to describe the infrastructure.

Templates

Templates describe the instances of your cluster - the instance type and the attached volumes. A typical setup is to combine multiple templates in a cluster for the different types of nodes. For example you may want to attach multiple large disks to the datanodes or have memory optimized instances for Spark nodes.

A template can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack). Templates can be configured with the following command for example:

template create --AWS --name my-aws-template --description "sample description" --instanceType m4.large --volumeSize 100 --volumeCount 2

Another available option is --publicInAccount. If it is true, all the users belonging to your account will be able to use this template to create clusters, but cannot delete it.

You can check whether the template was created successfully:

template list

Networks

Your clusters can be created in their own Virtual Private Cloud (VPC) or in one of your already existing VPCs. If you choose an existing VPC it is possible to create a new subnet within the VPC or use an already existing one. The subnet's IP range must be defined in the Subnet (CIDR) field using the general CIDR notation.

Default AWS Network

If you don't want to create or use your custom VPC, you can use the default-aws-network for all your Cloudbreak clusters. It will create a new VPC with a 10.0.0.0/16 subnet every time a cluster is created.

Custom AWS Network

If you'd like to deploy a cluster to a custom VPC, you'll have to create a new network template. To create a new subnet within the VPC, provide the ID of the subnet which is in the existing VPC.

A network also can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack).

IMPORTANT Subnet CIDRs cannot overlap each other in a VPC, so you have to create a different network template for each cluster. For example, you can create 3 different clusters with 3 different network templates for the subnets 10.0.0.0/24, 10.0.1.0/24 and 10.0.2.0/24 with the same VPC and IGW identifiers.

network create --AWS --name my-aws-network --subnet 10.0.0.0/16

Other available options:

--vpcID your existing vpc on amazon

--internetGatewayID your amazon internet gateway of the given VPC

--publicInAccount If it is true, all the users belonging to your account will be able to use this network to create clusters, but cannot delete it.

You can check whether the network was created successfully:

network list
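The three-subnet example above can be sketched in Python: the helper carves non-overlapping subnets out of a VPC range and prints one network create command per cluster, using only the options listed in this section. The function name, the generated template names and the VPC and IGW identifiers in the usage note are hypothetical placeholders.

```python
import ipaddress


def network_create_commands(vpc_cidr, vpc_id, igw_id, count, prefix=24):
    """Build `network create` commands for `count` non-overlapping subnets."""
    subnets = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=prefix)
    commands = []
    # zip stops after `count` subnets, or earlier if the VPC range runs out.
    for index, subnet in zip(range(1, count + 1), subnets):
        commands.append(
            f"network create --AWS --name my-aws-network-{index} "
            f"--subnet {subnet} --vpcID {vpc_id} --internetGatewayID {igw_id}"
        )
    return commands
```

Calling network_create_commands("10.0.0.0/16", "vpc-12345678", "igw-12345678", 3) yields commands for 10.0.0.0/24, 10.0.1.0/24 and 10.0.2.0/24, which can be pasted into the Cloudbreak Shell one by one.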

Defining Cluster Services

Blueprints

Blueprints are your declarative definition of a Hadoop cluster. These are the same blueprints that are used by Ambari.

You can use the 3 default blueprints pre-defined in Cloudbreak or you can create your own. Blueprints can be added from a file or a URL (an example blueprint).

The host groups in the JSON will be mapped to a set of instances when starting the cluster. Besides this the services and components will also be installed on the corresponding nodes. Blueprints can be modified later from the Ambari UI.

NOTE: It is not necessary to define all the configuration in the blueprint. If a configuration is missing, Ambari will fill that with a default value.

blueprint create --name my-blueprint --description "sample description" --file <the path of the blueprint>

Other available options:

--url the url of the blueprint

--publicInAccount If it is true, all the users belonging to your account will be able to use this blueprint to create clusters, but cannot delete it.

You can check whether the blueprint was created successfully:

blueprint list

A blueprint can be exported from a running Ambari cluster and reused in Cloudbreak with slight modifications. There is no automatic way to modify an exported blueprint and make it instantly usable in Cloudbreak. The modifications have to be done manually. When the blueprint is exported, some configurations are hardcoded (for example domain names and memory configurations) that won't be applicable to the Cloudbreak cluster.

Metadata Show

You can check the stack metadata with:

stack metadata --name myawsstack --instancegroup master

Other available options:

--id In this case you can select a stack with id.

--outputType In this case you can modify the output format of the command (RAW or JSON).

Cluster Deployment

After all the cluster resources are configured you can deploy a new HDP cluster. The following sub-sections show you a basic flow for cluster creation with Cloudbreak Shell.

Select Credential

Select one of your previously created AWS credentials:

credential select --name my-aws-credential

Select Blueprint

Select one of your previously created blueprints that fits your needs:

blueprint select --name multi-node-hdfs-yarn

Configure Instance Groups

You must configure instance groups before provisioning. An instance group defines a group of nodes with a specified template. Usually we create instance groups for the host groups in the blueprint. Only one host group can be specified for the Ambari server. If you want to install the Ambari server on a separate node, extend your blueprint with a new host group that contains only one service, HDFS_CLIENT, and select this host group for the Ambari server. Note: this host group cannot be scaled, so it is not advised to select a 'slave' host group for this purpose.

instancegroup configure --instanceGroup master --nodecount 1 --templateName minviable-aws --securityGroupName all-services-port --ambariServer true
instancegroup configure --instanceGroup slave_1 --nodecount 1 --templateName minviable-aws --securityGroupName all-services-port --ambariServer false

Other available option:

--templateId Id of the template
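The rule that only one host group may carry the Ambari server can be checked before the instancegroup configure commands are issued. The Python sketch below assumes a hypothetical in-memory description of the planned groups (a dict of group name to settings). It is not a Cloudbreak API.

```python
def ambari_server_group(instance_groups):
    """Return the single group flagged as Ambari server, or raise ValueError."""
    servers = [
        name
        for name, settings in instance_groups.items()
        if settings.get("ambariServer")
    ]
    if len(servers) != 1:
        raise ValueError(
            f"expected exactly one Ambari server group, found {len(servers)}"
        )
    return servers[0]
```

For the two commands shown above, a plan such as {"master": {"ambariServer": True}, "slave_1": {"ambariServer": False}} passes the check and returns master.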

Select Network

Select one of your previously created networks that fits your needs, or a default one:

network select --name default-aws-network

Create Stack / Create Cloud Infrastructure

A stack is the running cloud infrastructure that is created based on the resources configured earlier (credential, instance groups, network, security group). As with the API or UI, the new cluster will use your templates and launch your cloud stack through CloudFormation. Use the following command to create a stack to be used with your Hadoop cluster:

stack create --AWS --name myawsstack --region us-east-1

The infrastructure is created asynchronously. The state of the stack can be checked with the stack show command. If it reports AVAILABLE, the virtual machines and the corresponding infrastructure are running at the cloud provider.

Other available options:

--wait In this case the create command will return only after the process has finished.

--instanceProfileStrategy Strategy for seamless S3 connection (CREATE, USE_EXISTING).

--instanceProfile If you selected the USE_EXISTING strategy, define the Instance Profile role which will be assigned to the instances.

Create a Hadoop Cluster / Cloud Provisioning

You are almost done! One more command and your Hadoop cluster is starting! Cloud provisioning is done once the cluster is up and running. The new cluster will use your selected blueprint and install your custom Hadoop cluster with the selected components and services.

cluster create --description "my first cluster"

Other available option is --wait - in this case the create command will return only after the process has finished.

You are done! You can check the progress of infrastructure creation and provisioning in several ways. For example, you can inspect the cluster from the Cloudbreak Shell with:

cluster show

Stop Cluster

You can stop your existing stack and its cluster if you want to suspend work on it.

Select a stack for example with its name:

stack select --name my-stack

Other available option to define a stack is its --id.

Always stop the cluster first and then the stack. Apply the following commands to stop the previously selected stack:

cluster stop
stack stop

Restart Cluster

Select the stack that you would like to restart, then apply:

stack start

After the stack has successfully restarted, you can restart the related cluster as well:

cluster start

Upscale Cluster

If you need more instances in your infrastructure, you can upscale your selected stack:

stack node --ADD --instanceGroup host_group_slave_1 --adjustment 6

Another available option is --withClusterUpScale, which also triggers a cluster upscale after the stack upscale. Alternatively, you can upscale the related cluster separately:

cluster node --ADD --hostgroup host_group_slave_1 --adjustment 6

Downscale Cluster

You can also reduce the number of instances in your infrastructure. After you have selected your stack, run:

cluster node --REMOVE  --hostgroup host_group_slave_1 --adjustment -2

Another available option is --withStackDownScale, which also triggers a stack downscale after the cluster downscale. Alternatively, you can downscale the related stack separately:

stack node --REMOVE  --instanceGroup host_group_slave_1 --adjustment -2
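In the examples above, upscaling pairs --ADD with a positive adjustment and downscaling pairs --REMOVE with a negative one. A small Python sketch of that convention, with a hypothetical helper name, derives the flag from the sign so the two cannot disagree:

```python
def cluster_node_command(hostgroup, adjustment):
    """Build a `cluster node` command whose flag matches the adjustment sign."""
    if adjustment == 0:
        raise ValueError("adjustment must be non-zero")
    action = "--ADD" if adjustment > 0 else "--REMOVE"
    return f"cluster node {action} --hostgroup {hostgroup} --adjustment {adjustment}"
```

The same idea applies to stack node, which takes --instanceGroup in place of --hostgroup.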

Cluster Termination

You can terminate running or stopped clusters with

stack delete --name myawsstack

Other available option is --wait - in this case the terminate command will return only after the process has finished.

IMPORTANT: Always use Cloudbreak to terminate the cluster. If that fails for some reason, try to delete the CloudFormation stack first. Instances are started in an Auto Scaling Group so they may be restarted if you terminate an instance manually!

Sometimes Cloudbreak cannot synchronize its state with the cluster state at the cloud provider and the cluster can't be terminated. In this case the Forced termination option on the Cloudbreak Web UI can help to terminate the cluster on the Cloudbreak side. If this happens:

  1. Check the related resources in AWS CloudFormation.
  2. If needed, manually remove the resources from there.

Silent Mode

With Cloudbreak Shell you can execute script files as well. A script file contains shell commands and can be executed with the script command of the Cloudbreak Shell:

script <your script file>

or with the cbd util cloudbreak-shell-quiet command

cbd util cloudbreak-shell-quiet < example.sh

IMPORTANT: You have to copy all the files that you would like to use in the shell into the cbd working directory. For example, if your cbd working directory is ~/cloudbreak-deployment, copy your script file there.

Example

The following example creates a Hadoop cluster with the hdp-small-default blueprint on m4.xlarge instances with two attached disks of size 100 each, on the default-aws-network network, using the all-services-port security group. Copy your SSH public key file into your cbd working directory with the name id_rsa.pub, and replace the parts marked with <...> with your AWS credential details.

credential create --AWS --description description --name my-aws-credential --roleArn <arn role> --sshKeyPath id_rsa.pub
credential select --name my-aws-credential
template create --AWS --name awstemplate --description aws-template --instanceType m4.xlarge --volumeSize 100 --volumeCount 2
blueprint select --name hdp-small-default
instancegroup configure --instanceGroup host_group_master_1 --nodecount 1 --templateName awstemplate --securityGroupName all-services-port --ambariServer true
instancegroup configure --instanceGroup host_group_master_2 --nodecount 1 --templateName awstemplate --securityGroupName all-services-port --ambariServer false
instancegroup configure --instanceGroup host_group_master_3 --nodecount 1 --templateName awstemplate --securityGroupName all-services-port --ambariServer false
instancegroup configure --instanceGroup host_group_client_1  --nodecount 1 --templateName awstemplate --securityGroupName all-services-port --ambariServer false
instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName awstemplate --securityGroupName all-services-port --ambariServer false
network select --name default-aws-network
stack create --AWS --name my-first-stack --region us-east-1 --wait true
cluster create --description "My first cluster" --wait true
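The repeated instancegroup configure lines in the script above differ only in group name, node count and the Ambari server flag, so they can be generated. The Python sketch below is one hedged way to do that. The function name is hypothetical, and the output is plain text meant to be saved into a script file for Silent Mode.

```python
def instancegroup_lines(node_counts, template, security_group, ambari_group):
    """Emit one `instancegroup configure` line per host group."""
    lines = []
    for group, count in node_counts.items():
        # Only the chosen group runs the Ambari server.
        is_server = "true" if group == ambari_group else "false"
        lines.append(
            f"instancegroup configure --instanceGroup {group} --nodecount {count} "
            f"--templateName {template} --securityGroupName {security_group} "
            f"--ambariServer {is_server}"
        )
    return lines
```

Passing the five host groups of the example with their node counts, the template awstemplate, the security group all-services-port and host_group_master_1 as the Ambari group reproduces the five lines of the script.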

Congratulations! Your cluster should now be up and running this way as well. To learn more about Cloudbreak and provisioning, we have some interesting insights for you.
