Sunday, June 2, 2013

Installation of HBase in a fully distributed environment

In this post we will see how to install HBase in a fully distributed environment. Before that, we need to look at the components involved in a fully distributed HBase configuration.

HDFS: HDFS is a file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware. It is highly fault-tolerant and offers high throughput, which makes it suitable for applications with large data sets.

HBase Master: HMaster is the implementation of the Master Server. The Master server is responsible for monitoring all RegionServer instances in the cluster, and is the interface for all metadata changes. In a distributed cluster, the Master typically runs on the NameNode.

Region Servers: HRegionServer is the RegionServer implementation. It is responsible for serving and managing regions. In a distributed cluster, a RegionServer runs on a DataNode.

ZooKeeper: A distributed Apache HBase (TM) installation depends on a running ZooKeeper cluster. All participating nodes and clients need to be able to access the running ZooKeeper ensemble. By default, Apache HBase manages a ZooKeeper ensemble for you and starts and stops it as part of the HBase start/stop process. You can also manage the ZooKeeper ensemble independently of HBase and just point HBase at the cluster it should use. To toggle HBase management of ZooKeeper, use the HBASE_MANAGES_ZK variable in conf/hbase-env.sh. This variable, which defaults to true, tells HBase whether to start/stop the ZooKeeper ensemble servers as part of HBase start/stop.

In the coming example we have 2 Ubuntu machine images configured in VMware Player; both are up and running Hadoop. If you have trouble configuring the Hadoop cluster, you can follow this post: http://www.rajkrrsingh.blogspot.in/2013/06/install-and-configure-2-node-hadoop.html

Consider a scenario in which we have one master and 2 slave nodes. On the master, edit the /etc/hosts file as follows:
127.0.0.1 localhost
192.168.92.128  master.hdcluster.com  master
192.168.92.129  regionserver1.hdcluster.com  regionserver1
192.168.92.130  regionserver2.hdcluster.com  regionserver2

#127.0.1.1 ubuntu

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

On the slave with IP address 192.168.92.129, edit the /etc/hosts file as follows:
127.0.0.1 localhost
192.168.92.128  master.hdcluster.com  master
192.168.92.129  regionserver1.hdcluster.com  regionserver1

#127.0.1.1 ubuntu

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

On the slave with IP address 192.168.92.130, edit the /etc/hosts file as follows:
127.0.0.1 localhost
192.168.92.128  master.hdcluster.com  master
192.168.92.130  regionserver2.hdcluster.com  regionserver2

#127.0.1.1 ubuntu

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
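
A wrong or missing hosts entry is an easy mistake to make at this step, so a quick check can help. The following is only a minimal sketch: the function names parse_hosts and missing_hosts are made up for illustration, and the sample text simply repeats the master's entries from above. On a real node you would pass in the contents of /etc/hosts instead.

```python
def parse_hosts(text):
    """Map every hostname and alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first mapping seen
    return mapping


def missing_hosts(text, expected):
    """Return the expected names that are absent or mapped to another IP."""
    mapping = parse_hosts(text)
    return sorted(name for name, ip in expected.items() if mapping.get(name) != ip)


# Sample text copied from the master's hosts file shown in this post.
MASTER_HOSTS = """\
127.0.0.1 localhost
192.168.92.128  master.hdcluster.com  master
192.168.92.129  regionserver1.hdcluster.com  regionserver1
192.168.92.130  regionserver2.hdcluster.com  regionserver2
#127.0.1.1 ubuntu
"""

EXPECTED = {
    "master": "192.168.92.128",
    "regionserver1": "192.168.92.129",
    "regionserver2": "192.168.92.130",
}

if __name__ == "__main__":
    # An empty list means every expected name resolves to the expected IP.
    print(missing_hosts(MASTER_HOSTS, EXPECTED))
```

An empty result means all three names are present with the expected addresses; any name in the returned list needs to be added or corrected.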

Download the HBase binaries on the master machine and extract them to the home folder.
Edit the conf/hbase-env.sh file (inside the extracted HBase folder) as follows:
export JAVA_HOME=/usr/lib/jvm/java-6-oracle
export HBASE_MANAGES_ZK=true
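
To confirm what hbase-env.sh actually exports, a small reader like the one below can be used. It is a rough sketch under simple assumptions: it only understands plain `export NAME=value` lines, the helper names read_exports and hbase_manages_zk are made up for illustration, and the fallback to true mirrors the default described earlier in this post.

```python
import re

# Matches uncommented lines of the form: export NAME=value
EXPORT_RE = re.compile(r"^\s*export\s+([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def read_exports(text):
    """Collect simple 'export NAME=value' assignments from shell text."""
    exports = {}
    for line in text.splitlines():
        match = EXPORT_RE.match(line)
        if match:
            # Remove surrounding whitespace and optional quotes from the value.
            exports[match.group(1)] = match.group(2).strip().strip("\"'")
    return exports


def hbase_manages_zk(text):
    """True when HBase is set to manage ZooKeeper (assumed default: true)."""
    return read_exports(text).get("HBASE_MANAGES_ZK", "true").lower() == "true"


# Sample text mirroring the two lines set in this post.
SAMPLE_ENV = """\
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-6-oracle
export HBASE_MANAGES_ZK=true
"""

if __name__ == "__main__":
    print(read_exports(SAMPLE_ENV))
    print(hbase_manages_zk(SAMPLE_ENV))
```

Commented-out export lines are ignored, so a line that is still prefixed with # does not count as set.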


Now edit conf/hbase-site.xml as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 * Copyright 2010 The Apache Software Foundation
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
 <property> 
      <name>hbase.master</name> 
      <value>192.168.92.128:60000</value> 
 </property> 
 <property>
  <name>hbase.rootdir</name>
  <value>hdfs://master:54310/user/hbase</value>
 </property>

 <property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
 </property>

 <property>
  <name>hbase.zookeeper.quorum</name>
  <value>master,regionserver1,regionserver2</value>
 </property>

 <property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/home/rajkrrsingh/zookeeperdatadir</value>
 </property>

 <property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2222</value>
 </property>
</configuration>
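
HBase silently ignores property names it does not recognize, so a misspelled name or an out-of-range port in hbase-site.xml can go unnoticed until the cluster fails to come up. The sketch below is one possible sanity check, not an official HBase tool: the function names and the REQUIRED list are assumptions chosen for this setup, and the sample input is deliberately broken to show what gets reported.

```python
import xml.etree.ElementTree as ET

# Properties this distributed setup relies on (an assumption for this sketch).
REQUIRED = ("hbase.rootdir", "hbase.cluster.distributed", "hbase.zookeeper.quorum")


def load_properties(xml_text):
    """Return a name -> value dict from hbase-site.xml style text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def check_site(xml_text):
    """List missing required properties and ports outside 1..65535."""
    props = load_properties(xml_text)
    problems = [f"missing property: {key}" for key in REQUIRED if key not in props]
    for name, value in props.items():
        port = None
        if name.endswith("clientPort"):
            port = value
        elif name == "hbase.master" and ":" in value:
            port = value.rsplit(":", 1)[1]
        if port is not None and not (port.isdigit() and 0 < int(port) < 65536):
            problems.append(f"invalid port in {name}: {value}")
    return problems


# Deliberately broken sample: misspelled quorum property and an impossible port.
BROKEN = """<configuration>
 <property><name>hbase.master</name><value>192.168.92.128:90000</value></property>
 <property><name>hbase.rootdir</name><value>hdfs://master:54310/user/hbase</value></property>
 <property><name>hbase.cluster.distributed</name><value>true</value></property>
 <property><name>hbase.zookeeper.qourum</name><value>master,regionserver1,regionserver2</value></property>
 <property><name>hbase.zookeeper.property.clientPort</name><value>2222</value></property>
</configuration>"""

if __name__ == "__main__":
    for problem in check_site(BROKEN):
        print(problem)
```

To check a real file, read conf/hbase-site.xml into a string and pass it to check_site; an empty list means none of these particular problems were found.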

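Copy the HBase folder to the region servers. Also make sure conf/regionservers lists regionserver1 and regionserver2, one per line, because start-hbase.sh reads that file to decide where to start the RegionServers. That completes our cluster configuration; you can start the cluster by running bin/start-hbase.sh on the master.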
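
The copy and start steps can be scripted. The sketch below only builds and prints the commands rather than running them, so it is safe to try anywhere. The HBase folder path /home/rajkrrsingh/hbase is an assumption (the post only says the archive is extracted to the home folder), and the helper names are made up for illustration; scp -r and bin/start-hbase.sh are the standard tools.

```python
import posixpath
import shlex


def copy_commands(hbase_dir, hosts, user=None):
    """Build one 'scp -r' command per host, copying into the same parent folder."""
    parent = posixpath.dirname(hbase_dir.rstrip("/")) + "/"
    commands = []
    for host in hosts:
        target = f"{user}@{host}" if user else host
        commands.append(["scp", "-r", hbase_dir, f"{target}:{parent}"])
    return commands


def start_command(hbase_dir):
    """Build the command that starts the cluster from the master."""
    return [posixpath.join(hbase_dir, "bin", "start-hbase.sh")]


if __name__ == "__main__":
    hbase_dir = "/home/rajkrrsingh/hbase"  # assumed location of the extracted folder
    for cmd in copy_commands(hbase_dir, ["regionserver1", "regionserver2"]):
        print(shlex.join(cmd))
    print(shlex.join(start_command(hbase_dir)))
```

Once the printed commands look right, they can be pasted into a shell on the master or passed to subprocess.run. Passwordless SSH from the master to the region servers makes both the copy and the start step go smoothly.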