Cassandra, or Apache Cassandra, is a distributed database system that manages large amounts of structured data across many commodity servers, providing a highly available service with no single point of failure.
The steps for installing Cassandra are as follows:
Before installing Cassandra, a Linux environment needs to be set up with Secure Shell (SSH). The steps to set up the Linux environment are as follows:
First, create a separate user account so that the installation is isolated from the rest of the Unix file system. The examples below use a user named hadoop. To create the user, follow these steps:
- Use the command “su” to switch to the root account
- Use the command “useradd username” to create a user from the root account
- Use the command “su username” to switch to an existing user account
Use the following commands to create a user in the Linux terminal:
$ su
password:
# useradd hadoop
# passwd hadoop
New password:
Retype new password:
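When the setup is repeated on several machines, it can help to create the account only if it does not exist yet. The following is a minimal sketch, not part of the original procedure; the function names user_exists and create_user_if_missing are illustrative, and the creation step must still be run as root.

```shell
#!/bin/sh
# Return success if the named account already exists on this system.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Create the account only when it is missing (must run as root).
# "hadoop" is the user name used throughout this tutorial.
create_user_if_missing() {
    if user_exists "$1"; then
        echo "user $1 already exists"
    else
        useradd "$1" && echo "user $1 created"
    fi
}

# Usage (as root):
#   create_user_if_missing hadoop
#   passwd hadoop
```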
SSH Setup and Key Generation
On a cluster, operations such as starting and stopping nodes and running distributed daemon shell commands require SSH to be set up. To authenticate the hadoop user across machines, generate a public/private key pair for that user and share the public key. The following commands are used to generate the key pair with SSH and authorize it:
- Generate an RSA key pair
- Copy the public key from id_rsa.pub to authorized_keys
- Give the owner read and write permissions on the authorized_keys file
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
- Verify SSH, for example by running ssh localhost and checking that the login succeeds.
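Going back to the key-authorization step: appending with >> adds a duplicate entry each time the command is repeated. The sketch below shows one way to make that step repeatable. The function name authorize_key and its two arguments are assumptions made for illustration, not part of OpenSSH.

```shell
#!/bin/sh
# Append a public key to authorized_keys unless it is already listed,
# then restrict the file to owner read/write (mode 0600).
# Arguments: $1 = public key file, $2 = .ssh directory (both illustrative).
authorize_key() {
    pub_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" || return 1
    key=$(cat "$pub_file") || return 1
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -e "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    chmod 0600 "$auth_file"
}

# Usage, after running ssh-keygen -t rsa:
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh
```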
Cassandra runs on Java, so first verify that Java is installed on the system using the following command:
$ java -version
If Java is installed, the command prints output similar to the following:
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b13)
Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)
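In a setup script, the version check can be automated by parsing the first line of that output. The following is a rough sketch; the function name java_major_version is hypothetical, and it assumes the version appears in double quotes after the word "version", as in the output above.

```shell
#!/bin/sh
# Extract the major Java version from the first line of `java -version`.
# Legacy releases report "1.7.0_71" (major 7); newer ones report "11.0.2".
java_major_version() {
    # Pull out the quoted version string, e.g. 1.7.0_71
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case $version in
        1.*) printf '%s\n' "$version" | cut -d. -f2 ;;
        *)   printf '%s\n' "$version" | cut -d. -f1 ;;
    esac
}

# Usage (java -version writes to stderr, hence the redirect):
#   java_major_version "$(java -version 2>&1 | head -n 1)"
```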
If Java is not installed on the system, follow the steps below to install it:
Download Java (JDK <latest version> -X64.tar.gz). In this example, the file jdk-7u71-linux-x64.tar.gz is downloaded onto your system.
The downloaded file is usually found in the Downloads folder. Verify it and extract the jdk-7u71-linux-x64.tar.gz file using the commands:
$ cd Downloads/
$ tar zxf jdk-7u71-linux-x64.tar.gz
Move Java to the location “/usr/local/” to make it available to all users. Switch to root and type the command:
# mv jdk1.7.0_71 /usr/local/
Add the following lines to the ~/.bashrc file to set up the PATH and JAVA_HOME variables. There must be no spaces around the = sign.
export JAVA_HOME=/usr/local/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin
Then apply the changes to the current session by reloading the file, for example with source ~/.bashrc.
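If the setup is scripted rather than done in an editor, the lines should be added only once, or ~/.bashrc grows every time the script runs. Here is a minimal sketch of such a helper; the name append_once is an assumption made for illustration.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
append_once() {
    line=$1
    file=$2
    touch "$file" || return 1
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage, with the paths from this tutorial:
#   append_once 'export JAVA_HOME=/usr/local/jdk1.7.0_71' ~/.bashrc
#   append_once 'export PATH=$PATH:$JAVA_HOME/bin' ~/.bashrc
#   . ~/.bashrc    # reload in the current session
```

The single quotes keep $PATH and $JAVA_HOME unexpanded, so they are evaluated each time ~/.bashrc is loaded rather than when the line is written.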
The following commands are used to configure Java alternatives.
# alternatives --install /usr/bin/java java /usr/local/jdk1.7.0_71/bin/java 2
# alternatives --install /usr/bin/javac javac /usr/local/jdk1.7.0_71/bin/javac 2
# alternatives --install /usr/bin/jar jar /usr/local/jdk1.7.0_71/bin/jar 2
# alternatives --set java /usr/local/jdk1.7.0_71/bin/java
# alternatives --set javac /usr/local/jdk1.7.0_71/bin/javac
# alternatives --set jar /usr/local/jdk1.7.0_71/bin/jar
Now run java -version from the terminal again to confirm the installation.
Setting the Path
Set the Cassandra path in “~/.bashrc” as shown below:
[hadoop@linux ~]$ gedit ~/.bashrc
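The lines to add are not shown above. A plausible sketch follows, assuming Cassandra is unpacked into a folder named Cassandra in the user's home directory; that location is an assumption, so adjust it to wherever the folder actually lives. The variable name CASSANDRA_HOME matches the one used later when starting the server.

```shell
# Assumption: Cassandra was unpacked into ~/Cassandra; adjust to your location.
export CASSANDRA_HOME="$HOME/Cassandra"
# Make the launch scripts in Cassandra's bin/ folder available on the PATH.
export PATH="$PATH:$CASSANDRA_HOME/bin"
```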
Download Cassandra; the examples below use the apache-cassandra-2.1.2-bin.tar.gz archive. Once it is downloaded, create a new folder named Cassandra, extract the archive, and move the extracted files into the new folder.
$ mkdir Cassandra
$ tar -zxvf apache-cassandra-2.1.2-bin.tar.gz
$ mv apache-cassandra-2.1.2/* Cassandra/
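The extract-and-move steps can also be combined into one. The sketch below assumes a tar that supports --strip-components (GNU tar and bsdtar do); the function name extract_into is illustrative only.

```shell
#!/bin/sh
# Extract a .tar.gz archive into a target folder, dropping the archive's
# top-level directory so the files land directly inside the target.
# Arguments: $1 = archive path, $2 = target folder.
extract_into() {
    archive=$1
    target=$2
    [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
    mkdir -p "$target" || return 1
    tar -xzf "$archive" -C "$target" --strip-components=1
}

# Usage, with the names from this tutorial:
#   extract_into apache-cassandra-2.1.2-bin.tar.gz Cassandra
```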
Open cassandra.yaml, which is located in the conf directory of Cassandra.
$ gedit cassandra.yaml
Verify that the following settings point to the specified directories:
- data_file_directories “/var/lib/cassandra/data”
- commitlog_directory “/var/lib/cassandra/commitlog”
- saved_caches_directory “/var/lib/cassandra/saved_caches”
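This check can be done from the terminal instead of by eye. The following is a small sketch that looks for each of the three keys as an active (uncommented) entry at the start of a line; the function name check_dir_settings is hypothetical, and the check does not validate the paths themselves.

```shell
#!/bin/sh
# Report whether each directory setting appears as an active (uncommented)
# key in the given cassandra.yaml. Returns non-zero if any key is missing.
check_dir_settings() {
    yaml=$1
    status=0
    for key in data_file_directories commitlog_directory saved_caches_directory; do
        if grep -q "^$key:" "$yaml"; then
            echo "$key: set"
        else
            echo "$key: missing or commented out"
            status=1
        fi
    done
    return $status
}

# Usage:
#   check_dir_settings cassandra.yaml
```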
Two directories, /var/lib/cassandra and /var/log/cassandra, in which Cassandra writes its data and logs, must be created by the superuser. The names are lowercase to match the paths in cassandra.yaml, since Linux paths are case-sensitive.
[root@linux cassandra]# mkdir /var/lib/cassandra
[root@linux cassandra]# mkdir /var/log/cassandra
Permissions to folders
Give read-write permissions to the newly created folders.
[root@linux /]# chmod 777 /var/lib/cassandra
[root@linux /]# chmod 777 /var/log/cassandra
Open a terminal window, go to the Cassandra home directory, and run the following commands to start the Cassandra server.
$ cd $CASSANDRA_HOME
$ ./bin/cassandra -f
The -f option tells Cassandra to stay in the foreground instead of running as a background process. If nothing goes wrong, the user can see the Cassandra server starting.
To work with Cassandra programmatically, the user needs to download the required jar files.
These need to be placed in a separate folder, and the classpath needs to be set for this folder in “.bashrc”.
[hadoop@linux ~]$ gedit ~/.bashrc
# Set the following classpath in the .bashrc file.
export CLASSPATH=$CLASSPATH:/home/hadoop/Cassandra_jars/*
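The trailing /* is a Java classpath wildcard: Java itself expands it to the jar files in that folder. To see roughly what it selects, the following sketch prints the jar files of a folder joined with colons. The function name list_jars_as_classpath is an assumption made for illustration.

```shell
#!/bin/sh
# Print the .jar files in a folder joined with ':' (an explicit form of
# what the Java classpath wildcard "folder/*" selects).
list_jars_as_classpath() {
    dir=$1
    cp=""
    for jar in "$dir"/*.jar; do
        [ -e "$jar" ] || continue   # no jars: the glob stays literal, skip it
        cp="${cp:+$cp:}$jar"
    done
    printf '%s\n' "$cp"
}

# Usage, with the folder from this tutorial:
#   list_jars_as_classpath /home/hadoop/Cassandra_jars
```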
About the Author:
Vaishnavi Agrawal loves pursuing excellence through writing and has a passion for technology. She has successfully managed and run personal technology magazines and websites. She is based in Bangalore and has 5 years of experience in content writing and blogging. Her work has been published on various sites related to Hadoop Training, Big Data, Business Intelligence, Cloud Computing, IT, SAP, Project Management and more.