1:. Download and extract the Hive binary
hd@ubuntu:~$ wget https://www-eu.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz
hd@ubuntu:~$ tar xvf apache-hive-2.1.1-bin.tar.gz
hd@ubuntu:~$ sudo mv apache-hive-2.1.1-bin /usr/local/hive
hd@ubuntu:~$ cd /usr/local
hd@ubuntu:/usr/local$ sudo chown -R hd:hadoop hive
2:. Create a link between the MySQL connector library and Apache Hive
hduser@ubuntu:~$ cd /usr/local/hive/lib
hduser@ubuntu:/usr/local/hive/lib$ ln -s /usr/share/java/mysql-connector-java.jar mysql-connector-java.jar
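As a quick sanity check, the link can be tested before moving on. This is only a sketch: check_link is a hypothetical helper name, not something shipped with Hive or MySQL. It confirms that the path is a symbolic link and that its target exists, since a dangling link would leave Hive without a JDBC driver for the metastore.

```shell
# check_link is a hypothetical helper, not part of Hive or MySQL.
# It succeeds only if the path is a symlink and the link target exists.
check_link() {
  link_path=$1
  if [ ! -L "$link_path" ]; then
    echo "not a symlink: $link_path"
    return 1
  fi
  # -e follows the link, so it fails when the target is missing
  if [ ! -e "$link_path" ]; then
    echo "dangling symlink: $link_path"
    return 1
  fi
  echo "ok: $link_path"
}
```

Run it against the link just created in /usr/local/hive/lib; a non-zero exit status means the connector jar is not where the link points.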
3:. Create the metastore database and a user for Hive. In this instance I used the same Hadoop user for Hive; however, Hive could have a separate user.
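The original gives no commands for this step, so the following is a sketch under stated assumptions. The database name (metastore), user (hduser) and password (elephant) are taken from the hive-site.xml in step 4; the 'localhost' host part, the broad GRANT, and the helper name metastore_sql are assumptions made for illustration. The function only prints SQL and executes nothing by itself.

```shell
# metastore_sql is a hypothetical helper: it only prints SQL, nothing is executed.
# Defaults mirror the connection values used in hive-site.xml (step 4).
# Assumption: the metastore user connects from localhost.
metastore_sql() {
  db_name=${1:-metastore}
  db_user=${2:-hduser}
  db_pass=${3:-elephant}
  cat <<EOF
CREATE DATABASE IF NOT EXISTS ${db_name};
CREATE USER '${db_user}'@'localhost' IDENTIFIED BY '${db_pass}';
GRANT ALL PRIVILEGES ON ${db_name}.* TO '${db_user}'@'localhost';
EOF
}
```

Assuming a MySQL administrative account named root is available, the output could be piped into the client with: metastore_sql | mysql -u root -p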
4:. Apache hive configuration
Configure $HIVE_HOME/conf/hive-env.sh
Add or update HADOOP_HOME in this file
hduser@ubuntu:~$ cd /usr/local/hive/conf
hduser@ubuntu:/usr/local/hive/conf$ cp hive-env.sh.template hive-env.sh
hduser@ubuntu:/usr/local/hive/conf$ sudo vi hive-env.sh
--------------------------------------------------------------------
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/usr/local/hadoop
--------------------------------------------------------------------
Configure $HIVE_HOME/conf/hive-log4j2.properties
Update log location, default is /tmp.
--------------------------------------------------------------------
property.hive.log.dir = /usr/local/hive/logs/${sys:user.name}
--------------------------------------------------------------------
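The property above belongs in hive-log4j2.properties, not in hive-env.sh; the "command not found" message at the top of the step 5 output suggests the line may also have been pasted into hive-env.sh, where the shell tries to run it as a command. As a sketch, the edit can be scripted. set_hive_log_dir is a hypothetical helper name, and sed -i is assumed to be the GNU version found on Ubuntu.

```shell
# set_hive_log_dir is a hypothetical helper, not part of Hive.
# It replaces an existing property.hive.log.dir line, or appends one if none exists.
set_hive_log_dir() {
  props_file=$1
  log_dir=$2
  # Escape characters that are special in a sed replacement (\, & and the | delimiter)
  escaped=$(printf '%s' "$log_dir" | sed 's/[\\&|]/\\&/g')
  if grep -q '^property\.hive\.log\.dir' "$props_file"; then
    sed -i "s|^property\.hive\.log\.dir.*|property.hive.log.dir = $escaped|" "$props_file"
  else
    printf 'property.hive.log.dir = %s\n' "$log_dir" >> "$props_file"
  fi
}
```

Pass the value in single quotes so the shell does not try to expand ${sys:user.name}, for example: set_hive_log_dir /usr/local/hive/conf/hive-log4j2.properties '/usr/local/hive/logs/${sys:user.name}'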
Configure $HIVE_HOME/conf/hive-site.xml
The first 4 properties are the connection properties for the metastore database.
The next 2 properties ensure the metastore schema is not updated after initialization.
Next we set the metastore Thrift URI and port; this is where HiveServer2 (and other clients) connect to the metastore for information.
hive.support.concurrency and hive.zookeeper.quorum enable concurrency through the ZooKeeper-backed table lock manager.
Most of the remaining properties are explained in their descriptions.
--------------------------------------------------------------------------
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore</value>
<description>the URL of the MySQL database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hduser</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>elephant</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9083</value>
<description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
<property>
<name>hive.support.concurrency</name>
<description>Enable Hive's Table Lock Manager Service</description>
<value>true</value>
</property>
<property>
<name>datanucleus.autoStartMechanism</name>
<value>SchemaTable</value>
</property>
<property>
<name>hive.security.authorization.createtable.owner.grants</name>
<value>ALL</value>
<description>
The privileges automatically granted to the owner whenever a table gets created.
An example like "select,drop" will grant select and drop privilege to the owner
of the table. Note that the default gives the creator of a table no access to the
table (but see HIVE-8067).
</description>
</property>
<property>
<name>hive.warehouse.subdir.inherit.perms</name>
<value>false</value>
<description>
Set this to false if the table directories should be created
with the permissions derived from dfs umask instead of
inheriting the permission of the warehouse or database directory.
</description>
</property>
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
<description>enable or disable the Hive client authorization</description>
</property>
<property>
<name>hive.users.in.admin.role</name>
<value>hd,hduser</value>
<description>
Comma separated list of users who are in admin role for bootstrapping.
More users can be added in ADMIN role later.
</description>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<description>Zookeeper quorum used by Hive's Table Lock Manager</description>
<value>localhost:2181,localhost:2182</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10001</value>
<description>TCP port number to listen on, default 10000</description>
</property>
</configuration>
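Since several later steps depend on values in this file (the metastore port, the HiveServer2 port), it can help to read a value back after editing. The sketch below assumes the one-tag-per-line layout used in the listing above; get_hive_prop is a hypothetical helper name, and it is not a general XML parser.

```shell
# get_hive_prop is a hypothetical helper; it assumes one tag per line,
# as in the hive-site.xml listing above. It prints the <value> that
# belongs to the given <name>, or nothing if the property is absent.
get_hive_prop() {
  conf_file=$1
  prop_name=$2
  awk -v name="$prop_name" '
    index($0, "<name>" name "</name>") { in_prop = 1; next }
    in_prop && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit
    }
    /<\/property>/ { in_prop = 0 }
  ' "$conf_file"
}
```

For example, get_hive_prop /usr/local/hive/conf/hive-site.xml hive.server2.thrift.port should print the port configured above.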
5:. Initialize the metastore schema
hduser@ubuntu:~$ schematool -dbType mysql -initSchema
/usr/local/hive/conf/hive-env.sh: line 51: property.hive.log.dir: command not found
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://localhost/metastore
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hduser
Mon May 28 11:24:16 PDT 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.mysql.sql
Mon May 28 11:24:17 PDT 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Initialization script completed
Mon May 28 11:24:19 PDT 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
schemaTool completed
hduser@ubuntu:~$
6:. Create HDFS directories for Hive
hduser@ubuntu:~$ hdfs dfs -mkdir -p /user/hive
hduser@ubuntu:~$ hdfs dfs -chmod 755 /user/hive
hduser@ubuntu:~$ hdfs dfs -mkdir /user/hive/warehouse
hduser@ubuntu:~$ hdfs dfs -chmod 1777 /user/hive/warehouse
hduser@ubuntu:~$ hdfs dfs -chown -R hduser:hadoop /user/hive
7:. Run HiveServer2 and the metastore
hduser@ubuntu:~$ $HIVE_HOME/bin/hive --service metastore & $HIVE_HOME/bin/hive --service hiveserver2
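Both services take a little while to start listening. A small probe, sketched below, can confirm that the metastore (port 9083) and HiveServer2 (port 10001, per hive-site.xml) are accepting connections before a client is pointed at them. port_open and wait_for_port are hypothetical helper names; the probe assumes bash is installed, because it relies on bash's /dev/tcp redirection.

```shell
# port_open and wait_for_port are hypothetical helpers, not part of Hive.
# port_open relies on bash's /dev/tcp pseudo-device to attempt a TCP connection.
port_open() {
  bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Retry once per second; return 0 as soon as the port accepts a connection,
# or 1 after the given number of attempts (default 30).
wait_for_port() {
  host=$1
  port=$2
  attempts=${3:-30}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if port_open "$host" "$port"; then
      return 0
    fi
    # no need to sleep after the final attempt
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example: wait_for_port localhost 9083 && wait_for_port localhost 10001 && echo "metastore and HiveServer2 are up"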
8:. Run beeline to verify your installation
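The original does not show the beeline invocation, so this is a sketch. The helper below (hs2_jdbc_url, a hypothetical name) builds the HiveServer2 JDBC URL; the default port 10001 comes from hive.server2.thrift.port in hive-site.xml, while localhost and the hduser login are assumptions based on the rest of this setup.

```shell
# hs2_jdbc_url is a hypothetical helper that builds the HiveServer2 JDBC URL.
# The default port 10001 matches hive.server2.thrift.port in hive-site.xml (step 4).
hs2_jdbc_url() {
  host=${1:-localhost}
  port=${2:-10001}
  printf 'jdbc:hive2://%s:%s\n' "$host" "$port"
}
```

A connection attempt could then look like: beeline -u "$(hs2_jdbc_url)" -n hduser -e "show databases;"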
When beeline connects without errors, the Hive data warehouse is all set to roll.