Step-by-Step Procedure for Installing Cassandra

Cassandra, or Apache Cassandra, is a distributed database system that manages large amounts of structured data across many commodity servers, providing a highly available service with no single point of failure.

The steps for installing Cassandra are as follows:

Pre-Installation Setup

Before installing Cassandra, set up a Linux environment with Secure Shell (SSH). The steps to set up the Linux environment are as follows:

User Creation

To begin, create a separate user so that the installation is isolated from the rest of the Unix file system (this guide uses a user named hadoop). To create a user, follow these steps:

  • Use the command “su” to switch to the root account.
  • From the root account, use the command “useradd username” to create a user.
  • Use the command “su username” to switch to an existing user account.

Use the following commands to create a user from the Linux terminal:

$ su
password:
# useradd hadoop
# passwd hadoop
New password:
Retype new password:
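
When these steps are scripted, it can help to check whether the account already exists before calling useradd, since useradd reports an error for a user name that is already in use. The following is a minimal sketch, assuming a POSIX shell. The helper name user_exists is hypothetical and is not part of any standard tool.

```shell
# Hypothetical helper: succeeds if the named account exists on this system.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Example (run as root): create the user only when it is missing.
# user_exists hadoop || useradd hadoop
```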

SSH Setup and Key Generation

On a cluster, operations such as starting and stopping daemons and running distributed daemon shell operations require SSH to be set up. To authenticate different Hadoop users, generate a public/private key pair for a Hadoop user and share it with the other users. The following steps generate a key pair using SSH:

  • Generate an RSA key pair.
  • Copy the public key from id_rsa.pub to authorized_keys.
  • Give the owner read and write permissions on the authorized_keys file.

$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys

  • Verify ssh:

$ ssh localhost
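
Because “>>” appends every time it runs, repeating the copy step adds the same key to authorized_keys again. The following is a minimal sketch of one way to avoid that, assuming a POSIX shell with grep available. The helper name install_public_key is hypothetical.

```shell
# Hypothetical helper: append a public key to an authorized_keys file only
# if that exact line is not already present, then restrict the file to
# owner read/write.
install_public_key() {
    pub_key_file=$1
    auth_keys_file=$2
    mkdir -p "$(dirname "$auth_keys_file")"
    touch "$auth_keys_file"
    # -x matches whole lines only, -F treats the key as a fixed string
    if ! grep -qxF -- "$(cat "$pub_key_file")" "$auth_keys_file"; then
        cat "$pub_key_file" >> "$auth_keys_file"
    fi
    chmod 0600 "$auth_keys_file"
}

# Example:
# install_public_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```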

Java Installation

Java is the main prerequisite for Cassandra, so first verify that it is present on the system using the following command:

$ java -version

If Java is installed, the command prints output similar to the following:

java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b13)
Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)
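
In a script, the version check can be automated by extracting the major version from the first line of that output. The following is a minimal sketch, assuming a POSIX shell with sed and cut. The helper name java_major_version is hypothetical. It handles the older "1.7.0_71" style as well as the newer style, such as "11.0.2", that later Java releases print.

```shell
# Hypothetical helper: read `java -version` output on stdin and print the
# major version number (7 for "1.7.0_71", 11 for "11.0.2").
java_major_version() {
    # Take the quoted version string from the first line of input.
    ver=$(sed -n '1s/.*version "\([^"]*\)".*/\1/p')
    case $ver in
        1.*) echo "$ver" | cut -d. -f2 ;;
        *)   echo "$ver" | cut -d. -f1 ;;
    esac
}

# Example: java -version writes to stderr, so redirect it into the pipe.
# java -version 2>&1 | java_major_version
```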

 

If Java is not installed on the system, follow the steps below to install it:

Step 1

Download Java (JDK <latest version>-X64.tar.gz). The file jdk-7u71-linux-x64.tar.gz will then be downloaded onto your system.

Step 2

The downloaded file is usually found in the Downloads folder. Verify it and extract the jdk-7u71-linux-x64.tar.gz file using the following commands:

$ cd Downloads/
$ ls
jdk-7u71-linux-x64.tar.gz
$ tar zxf jdk-7u71-linux-x64.tar.gz
$ ls
jdk1.7.0_71 jdk-7u71-linux-x64.tar.gz

Step 3

Move Java to the location “/usr/local/” to make it available to all users. Switch to root and type the following commands:

$ su
password:
# mv jdk1.7.0_71 /usr/local/
# exit

Step 4

Add the following lines to the ~/.bashrc file to set the PATH and JAVA_HOME variables.

export JAVA_HOME=/usr/local/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin

Then apply the changes to the current session:

$ source ~/.bashrc
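
To confirm that the change took effect, the PATH value can be checked for the JDK bin directory. The following is a minimal sketch, assuming a POSIX shell. The helper name path_contains is hypothetical.

```shell
# Hypothetical helper: succeeds if directory $1 is an entry of the
# colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example:
# path_contains "$JAVA_HOME/bin" "$PATH" && echo "JDK is on the PATH"
```

Wrapping both values in colons means only whole entries match, so “/usr/bin” is not mistaken for “/usr/bin/extra”.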

Step 5

The following commands configure Java alternatives:

# alternatives --install /usr/bin/java java /usr/local/jdk1.7.0_71/bin/java 2
# alternatives --install /usr/bin/javac javac /usr/local/jdk1.7.0_71/bin/javac 2
# alternatives --install /usr/bin/jar jar /usr/local/jdk1.7.0_71/bin/jar 2
# alternatives --set java /usr/local/jdk1.7.0_71/bin/java
# alternatives --set javac /usr/local/jdk1.7.0_71/bin/javac
# alternatives --set jar /usr/local/jdk1.7.0_71/bin/jar

Now run the “java -version” command from the terminal to verify the installation.

Setting the Path

Set the Cassandra path in “~/.bashrc” as shown below:

[hadoop@linux ~]$ gedit ~/.bashrc

export CASSANDRA_HOME=~/cassandra

 

Download Cassandra

Download Cassandra with the command below. Once it is downloaded, extract the archive, create a new folder named cassandra, and move the extracted contents into this folder.

$ wget http://superego.com/apache/cassandra/2.1.2/apache/apache-cassandra-2.1.2-bin.tar.gz
$ mkdir cassandra
$ tar -zxvf apache-cassandra-2.1.2-bin.tar.gz
$ mv apache-cassandra-2.1.2/* cassandra/
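
The extract and move steps can also be combined into one command. The following is a minimal sketch, assuming a tar that supports the --strip-components option (GNU tar does). The helper name extract_into is hypothetical.

```shell
# Hypothetical helper: unpack a .tar.gz archive into a target folder,
# dropping the archive's single top-level directory on the way.
extract_into() {
    archive=$1
    target=$2
    mkdir -p "$target"
    tar -xzf "$archive" -C "$target" --strip-components=1
}

# Example:
# extract_into apache-cassandra-2.1.2-bin.tar.gz ~/cassandra
```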

Configuring Cassandra

Open cassandra.yaml, which is located in the conf directory of the Cassandra installation.

$ gedit cassandra.yaml

Verify that the following settings point to the specified directories:

  • data_file_directories “/var/lib/cassandra/data”
  • commitlog_directory “/var/lib/cassandra/commitlog”
  • saved_caches_directory “/var/lib/cassandra/saved_caches”
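
The single-valued settings above can be read back from the file without opening an editor. The following is a minimal sketch, assuming a POSIX shell with sed. The helper name yaml_scalar is hypothetical. It only handles plain “key: value” lines, so it works for commitlog_directory and saved_caches_directory but not for data_file_directories, which cassandra.yaml stores as a list.

```shell
# Hypothetical helper: print the value of an uncommented top-level
# "key: value" line from a YAML file. Lists and nested keys are not handled.
yaml_scalar() {
    sed -n "s/^$1:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*\$/\1/p" "$2"
}

# Example, run from the Cassandra home directory:
# yaml_scalar commitlog_directory conf/cassandra.yaml
```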

Directories Creation

As the superuser, create the two directories /var/lib/cassandra and /var/log/cassandra, into which Cassandra writes its data and logs.

[root@linux cassandra]# mkdir /var/lib/cassandra
[root@linux cassandra]# mkdir /var/log/cassandra

 

Permissions to folders

Give read-write permissions to the newly created folders.

[root@linux /]# chmod 777 /var/lib/cassandra
[root@linux /]# chmod 777 /var/log/cassandra
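
Before starting the server, it can be useful to confirm that the account that will run Cassandra can actually write to these folders. The following is a minimal sketch, assuming a POSIX shell. The helper name ensure_writable_dir is hypothetical.

```shell
# Hypothetical helper: create the directory if needed and succeed only if
# it exists and is writable by the current user.
ensure_writable_dir() {
    mkdir -p "$1" && [ -d "$1" ] && [ -w "$1" ]
}

# Example:
# for dir in /var/lib/cassandra /var/log/cassandra; do
#     ensure_writable_dir "$dir" || echo "cannot write to $dir" >&2
# done
```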

Start Cassandra

Open a terminal window, go to the Cassandra home directory, and run the following command to start the Cassandra server.

$ cd $CASSANDRA_HOME
$ ./bin/cassandra -f

The -f option tells Cassandra to stay in the foreground instead of running as a background process. If everything goes well, the terminal shows the Cassandra server starting up.
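
The server takes some time to become ready, so a script that depends on it may need to wait. The following is a minimal sketch of a retry loop, assuming a POSIX shell. The helper name wait_until is hypothetical. The commented example assumes that nodetool, which ships in the Cassandra bin directory, returns success once the node is reachable.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the given number of attempts is used up.
wait_until() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Example, run from the Cassandra home directory in a second terminal:
# wait_until 60 ./bin/nodetool status && echo "Cassandra is up"
```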

Programming Environment

To work with Cassandra programmatically, download the following jar files:

  • slf4j-api-1.7.5.jar
  • cassandra-driver-core-2.0.2.jar
  • guava-16.0.1.jar
  • metrics-core-3.0.2.jar
  • netty-3.9.0.Final.jar

Place these files in a separate folder and set the classpath for that folder in “.bashrc”.

[hadoop@linux ~]$ gedit ~/.bashrc

Set the following classpath in the .bashrc file:

export CLASSPATH=$CLASSPATH:/home/hadoop/Cassandra_jars/*
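
A missing jar usually shows up only later as a class-loading error, so it can be worth checking the folder up front. The following is a minimal sketch, assuming a POSIX shell. The helper name missing_jars is hypothetical.

```shell
# Hypothetical helper: print every expected jar that is absent from a folder.
missing_jars() {
    jar_dir=$1
    shift
    for jar in "$@"; do
        [ -f "$jar_dir/$jar" ] || echo "$jar"
    done
}

# Example:
# missing_jars /home/hadoop/Cassandra_jars \
#     slf4j-api-1.7.5.jar cassandra-driver-core-2.0.2.jar guava-16.0.1.jar \
#     metrics-core-3.0.2.jar netty-3.9.0.Final.jar
```

An empty result means that every listed jar was found.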

About the Author:

Vaishnavi Agrawal loves pursuing excellence through writing and has a passion for technology. She has successfully managed and run personal technology magazines and websites. She is based in Bangalore and has 5 years of experience in content writing and blogging. Her work has been published on various sites related to Hadoop Training, Big Data, Business Intelligence, Cloud Computing, IT, SAP, Project Management and more.