Sunday, June 3, 2018

Apache Sqoop

SQOOP CONFIGURATION

To install and configure Apache Sqoop, the first step is to find the most stable release compatible with the Hadoop version already configured on your system; in my case, Hadoop 2.7.3.
1. Download Sqoop

wget http://www-eu.apache.org/dist/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz

sudo tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
sudo mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha /usr/local/sqoop
sudo chown -R kui:hd /usr/local/sqoop
sudo chmod -R 777 /usr/local/sqoop
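To confirm the move succeeded, a quick sanity check (not part of the original steps) is to list the new directory:

```shell
# The extracted release should now contain bin/, conf/ and lib/ under /usr/local/sqoop
ls /usr/local/sqoop
ls -ld /usr/local/sqoop   # owner and group should match the chown above
```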
2. Set Environment Variables
Edit $HOME/.bashrc (for example with vi) and add the following variables:

export SQOOP_HOME=/usr/local/sqoop
export SQOOP_CONF_DIR=$SQOOP_HOME/conf
export SQOOP_CLASSPATH=$SQOOP_CONF_DIR
export PATH=$SQOOP_HOME/bin:$PATH
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive
export HBASE_HOME=/usr/local/hbase
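After saving .bashrc, reload it in the current shell and confirm the variables took effect (a quick check, not part of the original steps):

```shell
# Reload the profile so the new variables are visible in this shell
source ~/.bashrc

# Verify the variables point where we expect
echo $SQOOP_HOME        # /usr/local/sqoop
echo $SQOOP_CONF_DIR    # /usr/local/sqoop/conf
which sqoop             # should resolve to /usr/local/sqoop/bin/sqoop
```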
3. Configuring Sqoop
The sqoop-env.sh file is not present initially, so create it by copying the template sqoop-env-template.sh:

cd $SQOOP_HOME/conf
cp sqoop-env-template.sh sqoop-env.sh
sudo vi sqoop-env.sh

Then add the following lines to it:
 export HADOOP_COMMON_HOME=/usr/local/hadoop
 export HADOOP_MAPRED_HOME=/usr/local/hadoop
4. Check Sqoop Version
$ sqoop version

5. Create a Folder in /usr/lib
sudo mkdir /usr/lib/sqoop

6. Put the DB Connection Library Into $SQOOP_HOME/lib
Place the mysql-connector library into the Sqoop lib directory so Sqoop can communicate with MySQL:

wget http://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-5.1.45.tar.gz
sudo tar -zxvf mysql-connector-java-5.1.45.tar.gz
sudo cp mysql-connector-java-5.1.45/mysql-connector-java-5.1.45-bin.jar /usr/local/sqoop/lib

Finally, test that Sqoop can connect to MySQL by listing the existing databases.
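A minimal connection test might look like the following sketch (the hostname, port, and user here are assumptions; substitute your own credentials):

```shell
# List the databases visible to the given MySQL user.
# localhost:3306 and the 'root' user are placeholders -- adjust for your setup.
# -P prompts interactively for the password.
sqoop list-databases \
    --connect jdbc:mysql://localhost:3306/ \
    --username root -P
```

If the connector JAR was copied correctly, this should print one database name per line (e.g. information_schema, mysql).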


Now, how to work with Sqoop --- next
Apache Sqoop is the mastermind behind data movement between relational databases and Hadoop. Whether your destination is a Hadoop data warehouse or Apache Hive, you can do:
i- direct ingestion into HDFS
ii- direct ingestion into the Hive data warehouse
iii- movement from HDFS into Hive
In all cases you should receive the intended output.
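The three cases above can be sketched as follows (a hedged example: the database mydb, table employees, and the target path are illustrative assumptions, not values from this setup):

```shell
# i. Direct ingestion into HDFS: pull a table into a target directory
sqoop import \
    --connect jdbc:mysql://localhost:3306/mydb \
    --username root -P \
    --table employees \
    --target-dir /user/kui/employees \
    -m 1

# ii. Direct ingestion into Hive: same import, but create and load a Hive table
sqoop import \
    --connect jdbc:mysql://localhost:3306/mydb \
    --username root -P \
    --table employees \
    --hive-import --hive-table employees \
    -m 1

# iii. From HDFS into Hive: data already sitting in HDFS can be loaded
#      from the Hive shell itself, e.g.:
#      LOAD DATA INPATH '/user/kui/employees' INTO TABLE employees;
```

The -m 1 flag runs a single map task, which keeps the example simple; for large tables you would raise it and supply a --split-by column.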
