
Setting up Hive

As I said earlier, Apache Hive is an open-source data warehouse infrastructure built on top of Hadoop that provides data summarization, querying, and analysis of large datasets stored in Hadoop files. It was originally developed by Facebook, and it provides:

  • Tools to enable easy data extract/transform/load (ETL)
  • A mechanism to impose structure on a variety of data formats
  • Access to files stored either directly in Apache HDFS™ or in other data storage systems such as Apache HBase
  • Query execution via MapReduce

In this post we will see how to set up Hive on top of a Hadoop cluster.

Objective

The objective of this tutorial is to set up Hive and run HiveQL scripts.

Prerequisites

The following are the prerequisites for setting up Hive.

You should have the latest stable build of Hadoop up and running. To install Hadoop, please check my previous blog article on Hadoop Setup.

Setting up Hive:

Procedure

1. Download a stable version of Hive from the Apache download mirrors. For this tutorial we are using Hive 0.12.0; this release works with Hadoop 0.20.x, 1.x, 0.23.x and 2.x.

wget http://apache.osuosl.org/hive/hive-0.12.0/hive-0.12.0.tar.gz

2. Unpack the compressed Hive archive in your home directory:

tar xvzf hive-0.12.0.tar.gz

3. Create a hive directory under /usr/local as the root user and change its ownership to hduser as shown. This is for our convenience, so that each framework, software package and application is kept separate under its own user.

cd /usr/local
mkdir hive
sudo chown -R hduser:hadoop /usr/local/hive

4. Log in as hduser and move the uncompressed hive-0.12.0 directory into the /usr/local/hive folder:

mv hive-0.12.0/ /usr/local/hive
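
Because /usr/local/hive already exists, the mv command places the release inside it, which gives /usr/local/hive/hive-0.12.0. A quick check of that layout might look like the sketch below. Note that check_hive_layout is a hypothetical helper name, not something shipped with Hive, and it only assumes that the release keeps its launcher script at bin/hive.

```shell
# Hypothetical helper: verify that a directory looks like an unpacked Hive release.
# Usage: check_hive_layout /usr/local/hive/hive-0.12.0
check_hive_layout() {
    dir="$1"
    if [ -f "$dir/bin/hive" ]; then
        echo "found Hive launcher in $dir/bin"
    else
        echo "no bin/hive under $dir"
        return 1
    fi
}
```

Running check_hive_layout /usr/local/hive/hive-0.12.0 as hduser should report the launcher if the move went as expected.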

5. Set HIVE_HOME in $HOME/.bashrc so that it is set every time you log in. Open the .bashrc file in a text editor and add the following entries.

export HIVE_HOME='/usr/local/hive/hive-0.12.0'
export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH

7. Reload the .bashrc file so that the new variables take effect in the current shell:

. .bashrc
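
Before starting Hive it can help to confirm that the shell actually picked up the new settings. The function below is a minimal sketch of such a check; check_hive_env is a hypothetical name chosen for illustration, and the function relies only on the HIVE_HOME and PATH variables described in this step.

```shell
# Hypothetical helper: confirm that HIVE_HOME is set and its bin directory is on PATH.
check_hive_env() {
    if [ -z "${HIVE_HOME:-}" ]; then
        echo "HIVE_HOME is not set"
        return 1
    fi
    # Wrap PATH in colons so that the first and last entries are matched as well.
    case ":$PATH:" in
        *":$HIVE_HOME/bin:"*) echo "PATH contains $HIVE_HOME/bin" ;;
        *) echo "PATH is missing $HIVE_HOME/bin"; return 1 ;;
    esac
}
```

Call check_hive_env after reloading .bashrc. If it reports a missing entry, the export lines in .bashrc are the first place to look.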

That takes care of setting up Hive on top of Hadoop. Let's test it:

8. Start hive by executing the following command.

hive

9. Create a table in Hive with the following command. After creating it, check that the table exists.

create table test (field1 string, field2 string);
show tables;
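
The same statements can also be run without opening the interactive prompt, which is closer to how HiveQL scripts are normally executed. The sketch below writes them to a file and passes that file to the Hive command line with its -f option. The file name test.hql and the temporary directory are assumptions made for illustration, and "if not exists" is added so the script does not fail when the test table is already there.

```shell
# Write the test statements to a script file (test.hql is a hypothetical name).
script="${TMPDIR:-/tmp}/test.hql"
cat > "$script" <<'EOF'
create table if not exists test (field1 string, field2 string);
show tables;
EOF

# Run the script non-interactively if the hive command is on PATH.
if command -v hive >/dev/null 2>&1; then
    hive -f "$script"
else
    echo "hive not found on PATH; check HIVE_HOME and PATH in .bashrc"
fi
```

For a single short statement, hive -e 'show tables;' does the same job without a script file.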

10. Show extended details on the table

Describe extended test;

This output tells us that Hive was set up correctly on top of the Hadoop cluster. It's time to learn HiveQL.

