A simple walkthrough to get Hive up and running on CentOS 7 using an existing Hadoop install. If you have not installed Hadoop yet, please see my post on installing Apache Hadoop on CentOS 7 before continuing here.

As always, this is as much documentation for me as it is a tutorial, but corrections and additions are welcome.

1. Download

Download the .tar.gz from the releases page, using the mirror suggested to you.

For instance I’ve used:

cd /opt
wget http://ftp.wayne.edu/apache/hive/stable/apache-hive-1.2.1-bin.tar.gz

Un-gzip and un-tar:

tar zxvf apache-hive-1.2.1-bin.tar.gz

For third-party packages I like to use the naming convention `/opt/<package>/<package>-<version>`, which allows quick switching between version builds by changing the environment variables. Therefore I take this step as well:

mkdir hive/
mv apache-hive-1.2.1-bin/ hive/hive-1.2.1
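The point of keeping one directory per package with a versioned subdirectory per build is that switching builds only means changing one path. A minimal sketch of that idea, assuming a hypothetical helper called pkg_home (my own name, not something shipped with Hive or CentOS):

```shell
# pkg_home is a hypothetical helper, not part of Hive.
# It prints the install path for a package name and version,
# following the /opt/<name>/<name>-<version> layout used in this post.
pkg_home() {
  printf '/opt/%s/%s-%s\n' "$1" "$1" "$2"
}

# Point HIVE_HOME at the 1.2.1 build; changing the version string
# is all it takes to switch to another build unpacked the same way.
HIVE_HOME="$(pkg_home hive 1.2.1)"
export HIVE_HOME
echo "$HIVE_HOME"
```

The helper only builds the path string; it does not check that the directory exists.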

2. Configuration

Add the environment variables:

vim ~/.bashrc

Add at the bottom:

# Setting Hive Environment Variables
export HIVE_HOME=/opt/hive/hive-1.2.1
export PATH=$PATH:$HIVE_HOME/bin
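If you would rather script this than edit the file by hand, something like the following should work. It is only a sketch: add_hive_env is a hypothetical helper name, and it assumes the same install path as above. It appends the two exports once and skips the append if a HIVE_HOME export is already present.

```shell
# add_hive_env is a hypothetical helper name.
# $1 = shell startup file to update (for example ~/.bashrc).
# The variables are appended only if no HIVE_HOME export exists yet,
# so running it twice does not duplicate the lines.
add_hive_env() {
  rc_file="$1"
  if ! grep -q '^export HIVE_HOME=' "$rc_file" 2>/dev/null; then
    cat >> "$rc_file" <<'EOF'

# Setting Hive Environment Variables
export HIVE_HOME=/opt/hive/hive-1.2.1
export PATH=$PATH:$HIVE_HOME/bin
EOF
  fi
}
```

Run it as `add_hive_env ~/.bashrc`, then `. ~/.bashrc` to load the variables into the current shell.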

Reload the environment, then alter the Hive configuration script:

. ~/.bashrc
cd $HIVE_HOME/bin
vi hive-config.sh

Find this line:

# Allow alternate conf dir location.

and add your Hadoop home below it:

export HADOOP_HOME=/opt/hadoop/hadoop-2.7.3
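The same edit can be scripted. This is a rough sketch rather than the official way to configure Hive: set_hadoop_home is a hypothetical helper name, and it assumes the comment line appears in hive-config.sh exactly as shown above. It keeps a .bak copy and is not idempotent, so run it once.

```shell
# set_hadoop_home is a hypothetical helper name.
# $1 = path to hive-config.sh, $2 = Hadoop install directory.
# A backup is saved as <file>.bak, then every line is copied back
# unchanged with the export added right after the
# "Allow alternate conf dir location." comment.
set_hadoop_home() {
  cp "$1" "$1.bak" || return 1
  awk -v line="export HADOOP_HOME=$2" '
    { print }
    /^# Allow alternate conf dir location\./ { print line }
  ' "$1.bak" > "$1"
}
```

For this install the call would be `set_hadoop_home "$HIVE_HOME/bin/hive-config.sh" /opt/hadoop/hadoop-2.7.3`.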

3. Build HDFS Directories

As the user hadoop:

hadoop fs -mkdir -p /tmp
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
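To make this step safe to re-run, the create and chmod calls can be wrapped together. A small sketch, assuming a hypothetical helper named ensure_hdfs_dir; it uses -mkdir -p so an existing directory is not an error, and finishes by listing both directories so you can check the permissions.

```shell
# ensure_hdfs_dir is a hypothetical helper name.
# It creates the HDFS directory (and any missing parents), then
# grants group write permission on it.
ensure_hdfs_dir() {
  hadoop fs -mkdir -p "$1" && hadoop fs -chmod g+w "$1"
}

# Run as the hadoop user, and only where the hadoop command exists.
if command -v hadoop >/dev/null 2>&1; then
  ensure_hdfs_dir /tmp
  ensure_hdfs_dir /user/hive/warehouse
  # -d lists the directories themselves rather than their contents.
  hadoop fs -ls -d /tmp /user/hive/warehouse
fi
```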

You’re all set! You should be able to use both the CLI and the batch processing modes.