Setting Up a Big Data Processing Cluster (Hadoop, Spark, HBase)
Setting up the Hadoop cluster
Configure /etc/hosts on every machine so that all nodes can reach each other by hostname (a sample is sketched after the environment block below).
1. Create the hadoop user and set up the Java environment:
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME
export PATH
export CLASSPATH
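For reference, a minimal sketch of the matching host and user setup, run as root; the IP addresses here are placeholders for your own network:
# /etc/hosts entries on every machine (IPs are hypothetical)
192.168.1.10 master
192.168.1.11 secondMaster
# create the hadoop user with a home directory and set its password
useradd -m hadoop
passwd hadoop
The JAVA_HOME/PATH/CLASSPATH lines above can be appended to the hadoop user's ~/.bashrc so they survive re-login.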
This completes the environment configuration.
2. Create the user's home directory.
3. Configure passwordless SSH login:
cd /home/hadoop
ssh-keygen -t rsa
Press Enter at every prompt; this generates a hidden .ssh directory.
cd .ssh
You can inspect the generated files with ls, then:
cp id_rsa.pub authorized_keys
Now test it, e.g. with ssh localhost.
4. Copy authorized_keys to the other nodes:
scp authorized_keys secondMaster:/home/hadoop/.ssh/
You will be prompted for a password here; the hadoop account's password is fine. Then, on the remote node:
chmod 644 authorized_keys
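With more than one worker, the same copy can be scripted. A minimal sketch; node2 and node3 are hypothetical hostnames standing in for your own /etc/hosts entries:
# push the public key to every worker node in the list
for host in secondMaster node2 node3; do
  scp authorized_keys $host:/home/hadoop/.ssh/
done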
Test it with ssh secondMaster.
5. Cluster configuration
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
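fs.defaultFS is the URI that HDFS clients use to reach the NameNode. As a quick sanity check once HDFS is running (the start-up steps come further down), listing the file system root against this URI should succeed:
# list the HDFS root via the configured NameNode address
./bin/hdfs dfs -ls hdfs://master:9000/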
hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
</configuration>
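The two jobhistory properties above point at the MapReduce JobHistory server, which is not launched by start-dfs.sh or start-yarn.sh. A sketch for starting it manually, assuming the standard Hadoop 2.x directory layout:
# start the JobHistory server; its web UI then answers on master:19888
./sbin/mr-jobhistory-daemon.sh start historyserver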
yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
</configuration>
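yarn.resourcemanager.webapp.address exposes the ResourceManager web UI at http://master:8088. Once YARN is up (see the start-up steps below), the registered NodeManagers can also be checked from the command line:
# list the NodeManagers known to the ResourceManager
./bin/yarn node -list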
Configure the slaves file with the hostnames of the worker nodes.
Switch to the hadoop user:
su hadoop
Create the working directories:
mkdir tmp
mkdir name
mkdir data
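The name and data directories must match dfs.namenode.name.dir and dfs.datanode.data.dir in hdfs-site.xml above. An equivalent sketch with explicit paths, assuming the same layout on every node:
# create the NameNode metadata, DataNode block-storage and temp directories
mkdir -p /home/hadoop/name /home/hadoop/data /home/hadoop/tmp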
Copy the hadoop directory to the other nodes:
scp -r ./hadoop secondMaster:/home/hadoop
Format the distributed file system:
cd hadoop
./bin/hdfs namenode -format
Start Hadoop:
./sbin/start-dfs.sh
At this point the processes running on the master host are: NameNode, SecondaryNameNode.
./sbin/start-yarn.sh
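To confirm which daemons are actually running on a node, the JDK's jps tool lists the local Java processes:
# print the Java processes (daemon names) running on this host
jps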
Now the processes running on the master host are: NameNode, SecondaryNameNode, ResourceManager. The HDFS web UI is available at http://master:50070/dfshealth.html#tab-overview.
Setting up the Spark cluster
1. Edit the configuration file spark-env.sh and append the following line:
export SPARK_DIST_CLASSPATH=$(/home/hadoop/hadoop-2.6.1/bin/hadoop classpath)
Here /home/hadoop/hadoop-2.6.1 is the Hadoop installation directory.
2. Next, edit the conf/slaves file and list the worker nodes:
secondMaster
Then copy the entire configured Spark directory to the other nodes, e.g. secondMaster:
scp -r spark-1.6.0-bin-hadoop2.4/ secondMaster:/home/hadoop/
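SPARK_DIST_CLASSPATH tells Spark where to find the Hadoop jars so that it can read from HDFS. You can preview what the line above evaluates to by running the classpath command directly; it prints a colon-separated list of jar and directory paths:
# show the Hadoop classpath that Spark will inherit
/home/hadoop/hadoop-2.6.1/bin/hadoop classpath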
Setting up the HBase cluster
1. Configure hbase-site.xml:
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,secondMaster</value>
<description>Comma-separated list of servers in the ZooKeeper quorum.
</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
<description>Property from ZooKeeper config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed ZooKeeper
true: fully-distributed with unmanaged ZooKeeper Quorum (see hbase-env.sh)
</description>
</property>
</configuration>
2. Configure JAVA_HOME and HBASE_HEAPSIZE in hbase-env.sh:
# export JAVA_HOME=/usr/java/jdk1.6.0/
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
# export HBASE_HEAPSIZE=1G
export HBASE_HEAPSIZE=4G
3. Configure the regionservers file with the hosts that will run a RegionServer:
secondMaster
4. Create the zookeeper directory:
su hadoop
cd
mkdir zookeeper
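Since hbase.zookeeper.quorum lists both master and secondMaster, the ZooKeeper data directory from hbase.zookeeper.property.dataDir must exist on both nodes. A closing sketch, assuming HBase is unpacked under the hadoop user's home on every node:
# create the ZooKeeper data directory on the other quorum member
ssh secondMaster 'mkdir -p /home/hadoop/zookeeper'
# start HBase; this launches the HMaster, the RegionServers and, if
# HBASE_MANAGES_ZK is left at its default, the managed ZooKeeper quorum
./bin/start-hbase.sh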