Installation of Cassandra Database for vCloud Director 9.5

With vCloud Director 9.5 it is possible to display performance metrics for VMs, such as CPU or memory usage, directly in the new Tenant Portal. However, this requires a Cassandra database cluster to store these performance metrics.

There are already guides on how to set up a Cassandra database cluster for vCloud Director, but the ones I found were partly outdated, incomplete, or not suitable for CentOS 7, which I want to use as the base operating system. The VMware documentation about installing Cassandra for vCloud Director 9.5 doesn’t go into much detail either and serves only as a rough orientation. So I decided to write my own blog post.

Let’s start with the basic information

Cassandra database clusters are a world of their own and I’ve only dealt with them superficially. As far as I understand it, you have to configure a replication factor of at least 3 to allow a node to fail. A document of the vCloud Architecture Toolkit for Service Providers includes some more information on this. As an initial setup, VMware specifies that you need at least 4 database nodes, i.e. 4 virtual machines, and 2 of these 4 nodes must be “seed nodes”.
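To make the replication factor a bit more tangible, here is a rough sketch of the arithmetic behind it. It assumes that clients read and write at consistency level QUORUM, which is my assumption and not something taken from the VMware documentation. The function names are made up for this example.

```shell
# Quorum size for a replication factor RF: floor(RF / 2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Replicas that may be down while a quorum can still be reached.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# quorum 3              prints 2
# tolerated_failures 3  prints 1
# tolerated_failures 2  prints 0
```

Under that assumption, a replication factor of 2 cannot lose any replica, while a replication factor of 3 can lose one.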

I see this performance data as a nice-to-have for our customers, so I didn’t go into redundancy and replication factor any further. In addition, I have not used SSL encryption between client and node, as these performance metrics are not critical information for me.

You can also find a lot of different information about hardware sizing. For me, only the following points were relevant: many CPUs are important for performance, and a Cassandra database writes everything to memory and flushes the information to disk periodically. Therefore, the memory should not be too small either.
If you have several thousand virtual machines and many customers, a node should have at least 12 vCPUs and probably between 32 and 48 GB RAM. But Cassandra is also designed to scale, so you can easily add more nodes to better distribute the read and write load.

In my setup I deployed 4 nodes (VMs) for startup and each node has 8 vCPUs, 16 GB memory and 1 TB of disk capacity.

The installation of the Cassandra database

I use CentOS 7 as my base operating system. So I assume that there are already 4 CentOS VMs.

First we need to install the Java Runtime Environment and the package for Java Native Access.


yum install -y java-1.8.0-openjdk jna

Then we add the JAVA_HOME variable to our environment variables.


echo 'JAVA_HOME="/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64/jre"' >> /etc/environment
source /etc/environment

If you install a newer Java version you have to change the path.
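Since the directory name changes with every OpenJDK update, the path can also be derived from the installed java binary instead of typing it by hand. This is only a sketch: it assumes that the java command is a chain of symlinks ending in a bin/java file, and the helper name is made up. JAVA_HOME is the directory that contains bin/java, not the binary itself.

```shell
# Derive JAVA_HOME from the path of a java binary: resolve all symlinks,
# then strip the trailing /bin/java from the resolved path.
java_home_from_binary() {
    local resolved
    resolved=$(readlink -f "$1") || return 1
    printf '%s\n' "${resolved%/bin/java}"
}

# Example call (not executed here):
#   java_home_from_binary "$(command -v java)"
```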

After that, we create a repository file for the Cassandra repository:


vi /etc/yum.repos.d/cassandra.repo

With the following content:


[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
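If you have to repeat this on all 4 nodes, the file can also be written without an editor. A small sketch with a here-document; the function name is made up, and the content is exactly the one shown above.

```shell
# Write the Cassandra repo definition to the given file
# (default: /etc/yum.repos.d/cassandra.repo).
write_cassandra_repo() {
    local target=${1:-/etc/yum.repos.d/cassandra.repo}
    cat > "$target" <<'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
EOF
}

# Example call as root (not executed here):
#   write_cassandra_repo
```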

And finally we can install the Cassandra database.


yum update
yum install -y cassandra cassandra-tools

The configuration of Cassandra

First we open the firewall ports so that the nodes can communicate with each other and vCloud Director can reach the nodes.


firewall-cmd --zone public --add-port 7000/tcp --add-port 7001/tcp --add-port 7199/tcp --add-port 9042/tcp --add-port 9160/tcp --add-port 9142/tcp --permanent
firewall-cmd --reload
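To double-check that no port was forgotten on one of the nodes, the list of open ports can be compared with the expected ones. This is a sketch with a made-up helper; it assumes the space-separated format that firewall-cmd --list-ports prints (for example "7000/tcp 9042/tcp").

```shell
# Ports opened by the firewall-cmd call above.
REQUIRED_PORTS="7000 7001 7199 9042 9142 9160"

# Print every required port that is missing from the given port list.
missing_ports() {
    local open=" $1 " port
    for port in $REQUIRED_PORTS; do
        case "$open" in
            *" ${port}/tcp "*) ;;       # port is open, nothing to report
            *) echo "$port" ;;          # port is missing
        esac
    done
}

# Example call (not executed here):
#   missing_ports "$(firewall-cmd --zone=public --list-ports)"
```

If the function prints nothing, all six ports are open in the public zone.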

For Cassandra to work, we need to modify the Cassandra configuration file.


vi /etc/cassandra/default.conf/cassandra.yaml

And the following values must be changed:


cluster_name: 'vCD Performance Metrics Database'
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer

- seeds: "10.192.1.11,10.192.1.12"

listen_address: 10.192.1.11
rpc_address: 10.192.1.11

In my setup the first 2 nodes are the seed nodes (10.192.1.11 and 10.192.1.12). Except for the last two config options (listen_address and rpc_address), the settings must be the same on all nodes.
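Because only the two address options differ per node, the changes can be scripted. The following is a sketch and not a tested rollout script: the function name is made up, it needs GNU sed (as shipped with CentOS 7), and it assumes that the six keys appear uncommented in the file, as they do in the stock cassandra.yaml. Make a backup of the file first.

```shell
# Apply the settings from above to a cassandra.yaml file.
#   $1 = path to cassandra.yaml, $2 = IP address of this node
configure_node() {
    local file=$1 node_ip=$2
    sed -i \
        -e "s|^cluster_name:.*|cluster_name: 'vCD Performance Metrics Database'|" \
        -e "s|^authenticator:.*|authenticator: PasswordAuthenticator|" \
        -e "s|^authorizer:.*|authorizer: CassandraAuthorizer|" \
        -e "s|^\([[:space:]]*- seeds:\).*|\1 \"10.192.1.11,10.192.1.12\"|" \
        -e "s|^listen_address:.*|listen_address: ${node_ip}|" \
        -e "s|^rpc_address:.*|rpc_address: ${node_ip}|" \
        "$file"
}

# Example call on the third node (not executed here):
#   configure_node /etc/cassandra/default.conf/cassandra.yaml 10.192.1.13
```

The seeds line keeps its original indentation, because it is nested inside the seed_provider section of the file.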

Starting the database cluster

Cassandra must now be started on each node so that vCloud Director can use the database.


service cassandra start

Of course, we also want the service to start automatically at system boot.


chkconfig cassandra on

Once all services have been started on all CentOS 7 nodes, we can check if the cluster is working properly with the following command.


nodetool status

The output should list all 4 nodes, each with the status UN (Up, Normal).
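To check this from a script rather than by eye, the nodes in state UN can be counted. A minimal sketch with a made-up helper; it assumes that each node line of the nodetool status output starts with the two-letter state, such as UN for Up and Normal.

```shell
# Count the nodes that nodetool status reports as UN (Up, Normal).
# Reads the nodetool output on stdin.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Example call (not executed here), expecting 4 nodes:
#   [ "$(nodetool status | count_up_normal)" -eq 4 ] && echo "cluster is complete"
```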

Additional database configurations

By default, Cassandra has a built-in superuser named “cassandra” with the password “cassandra”. Of course we want to change that.

Therefore we first connect to the database cluster (from any node):


cqlsh 10.192.1.11 -u cassandra -p cassandra

And create a new user with a secure password and assign superuser rights to this user:


CREATE ROLE vcloud WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'newSuperP@ssw0rd!';

After that, we log out (with the command “quit”) and log in again with the new superuser:


cqlsh 10.192.1.11 -u vcloud -p 'newSuperP@ssw0rd!'

Then we change the password of the “cassandra” user to any value (we will never need it again) and remove the superuser rights from this user:


ALTER ROLE cassandra WITH PASSWORD = 's0methingRe@llyC0mplexXx' AND SUPERUSER = false;
quit

That’s it. vCloud Director can now use the Cassandra database for storing performance metrics.

In my next blog post I will show the configuration of performance metrics in vCloud Director 9.5 and how we can use this new database.
