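1. Preparation before installation:
Before starting the Hadoop pseudo-distributed installation, check that the virtual machine has been configured as follows:
1.1 Set the hostname:
Temporary change: hostname <hostname>
Permanent change: vi /etc/sysconfig/network
HOSTNAME=<hostname>
1.2 Map the hostname to its IP address
vi /etc/hosts
Add a line of the form: <ip> <hostname> # hostname -i prints the IP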
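The mapping in step 1.2 can also be added from a script. The following is a minimal sketch, not part of the original notes: the helper name add_host_entry is hypothetical, the IP and hostname are the ones that appear later in this walkthrough, and the demo writes to a scratch file so the real /etc/hosts is left untouched.

```shell
# Append "<ip> <hostname>" to a hosts file unless the hostname is already mapped.
# add_host_entry is a hypothetical helper name used only for this sketch.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # -F treats the name as a fixed string, -w matches it as a whole word.
    if grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
        echo "$name is already present in $hosts_file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Demo against a scratch file instead of the real /etc/hosts.
demo_hosts=$(mktemp)
printf '127.0.0.1 localhost\n' > "$demo_hosts"
add_host_entry "$demo_hosts" 192.168.137.201 rzdatahadoop002
add_host_entry "$demo_hosts" 192.168.137.201 rzdatahadoop002   # second call adds nothing
cat "$demo_hosts"
```

On the real machine the same function would be run as root with /etc/hosts as the first argument.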
1.3 Make sure the virtual machine has network access
2. Create a hadoop user to manage Hadoop:
Create the user:
[root@rzdatahadoop002 software]# useradd hadoop
[root@rzdatahadoop002 software]# id hadoop
uid=515(hadoop) gid=515(hadoop) groups=515(hadoop)
Grant the user sudo privileges:
[root@rzdatahadoop002 software]# vi /etc/sudoers
hadoop ALL=(root) NOPASSWD:ALL
3. Install Java
For the installation steps, see the notes on the compilation process.
Check the Java location:
[root@rzdatahadoop002 jdk1.8.0_45]# which java
/usr/java/jdk1.8.0_45/bin/java
4. Unpack the compiled Hadoop build
Move:
Move the hadoop-2.8.1.tar.gz that was built in /opt/sourcecode to the /opt/software directory
mv hadoop-2.8.1.tar.gz /opt/software
Unpack the tarball:
[root@rzdatahadoop002 software]# tar -xzvf hadoop-2.8.1.tar.gz
Change the owner and group, and create a symbolic link:
[root@rzdatahadoop002 software]# chown -R hadoop:hadoop hadoop-2.8.1
[root@rzdatahadoop002 software]# ln -s /opt/software/hadoop-2.8.1 hadoop
[root@rzdatahadoop002 software]# chown -R hadoop:hadoop hadoop
Note: if the symbolic link is created first, chown has to be run three times.
Inspect:
[root@rzdatahadoop002 software]# cd hadoop
[root@rzdatahadoop002 hadoop]# rm -f *.txt
[root@rzdatahadoop002 hadoop]# ll
total 28
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 bin
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 etc
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 include
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 lib
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 libexec
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 sbin
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 share
Notes:
bin: commands
etc: configuration files
sbin: scripts that start and stop the Hadoop daemons
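5. Configuration files for the Hadoop daemons:
Switch to the hadoop user (for example with su - hadoop):
Then go to the hadoop directory under etc; listing it shows the following main configuration files:
- hadoop-env.sh : Hadoop environment settings
- core-site.xml : Hadoop core configuration
- hdfs-site.xml : configuration for the HDFS service -> starts daemons
- mapred-site.xml : configuration for MapReduce computation; only used when a job jar is run
- yarn-site.xml : configuration for the YARN service -> starts daemons
- slaves: hostnames of the machines in the cluster
6. Configure SSH for the hadoop user: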
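The original notes leave this step empty. A common approach, and the one described in the official Hadoop single-node guide, is passphrase-less key login to the local machine, because start-dfs.sh reaches each daemon host over ssh. The sketch below assumes that approach; setup_passwordless_ssh is a hypothetical helper name, and the demo uses a scratch directory rather than the real ~/.ssh.

```shell
# Create a passphrase-less RSA key pair in the given directory and authorize it
# for logins to the same account. The function name is hypothetical.
setup_passwordless_ssh() {
    ssh_dir=$1
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    # Generate the key only once; -N '' sets an empty passphrase.
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa" || return 1
    fi
    cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    # With the default StrictModes setting, sshd rejects key files that
    # other users can write to.
    chmod 600 "$ssh_dir/authorized_keys"
}

# Demo in a scratch directory; the hadoop user would pass "$HOME/.ssh" instead.
demo_ssh_dir=$(mktemp -d)
if command -v ssh-keygen >/dev/null 2>&1; then
    setup_passwordless_ssh "$demo_ssh_dir" && ls "$demo_ssh_dir"
else
    echo "ssh-keygen not found; skipping the demo"
fi
```

After running it against the real ~/.ssh, ssh localhost should log in without asking for a password.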
7. Configure the HDFS-related files:
Configure the core file:
[hadoop@rzdatahadoop002 hadoop]$ vi core-site.xml # the configuration files are edited in the hadoop directory under etc
Configure the HDFS file:
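The contents of the two files are not shown in the original notes. As a sketch only, the following writes the minimal pseudo-distributed settings from the official Hadoop single-node guide: fs.defaultFS in core-site.xml and a replication factor of 1 in hdfs-site.xml. The helper name write_hdfs_config and port 9000 are assumptions, and the demo writes to a scratch directory rather than the real etc/hadoop.

```shell
# Write a minimal core-site.xml and hdfs-site.xml into the given directory.
# write_hdfs_config is a hypothetical helper; port 9000 is an assumption.
write_hdfs_config() {
    conf_dir=$1
    namenode_uri=$2

    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>$namenode_uri</value>
    </property>
</configuration>
EOF

    # A single-node setup has one DataNode, so each block can have only one replica.
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
EOF
}

# Demo in a scratch directory instead of /opt/software/hadoop/etc/hadoop.
demo_conf=$(mktemp -d)
write_hdfs_config "$demo_conf" hdfs://localhost:9000
cat "$demo_conf/core-site.xml"
```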
8. Format the NameNode (the metadata node)
[hadoop@rzdatahadoop002 hadoop]$ bin/hdfs namenode -format # run from the Hadoop home directory; from inside bin, use ./hdfs instead
17/12/13 22:22:04 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
17/12/13 22:22:04 INFO namenode.FSImageFormatProtobuf: Saving image file /tmp/hadoop-hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
17/12/13 22:22:04 INFO namenode.FSImageFormatProtobuf: Image file /tmp/hadoop-hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds.
17/12/13 22:22:04 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/12/13 22:22:04 INFO util.ExitUtil: Exiting with status 0
17/12/13 22:22:04 INFO namenode.NameNode: SHUTDOWN_MSG:
/**************************************************
SHUTDOWN_MSG: Shutting down NameNode at rzdatahadoop002/192.168.137.201
**************************************************/
9. Preparation before startup, and starting HDFS:
Preparation (point the Hadoop environment at Java):
[hadoop@rzdatahadoop002 sbin]$ vi ../etc/hadoop/hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_45 # without JAVA_HOME set in the Hadoop environment, the HDFS services cannot start
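The same edit can be scripted instead of made in vi. The sketch below is an illustration under assumptions: set_java_home is a hypothetical helper name, and the demo works on a scratch file that imitates the relevant lines of hadoop-env.sh rather than on the real file.

```shell
# Set JAVA_HOME in a hadoop-env.sh style file: rewrite the existing export line,
# or append one if none exists. set_java_home is a hypothetical helper name.
set_java_home() {
    env_file=$1
    java_home=$2
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        # '|' is the sed delimiter because the path itself contains '/'.
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$env_file.tmp" &&
            mv "$env_file.tmp" "$env_file"
    else
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}

# Demo on a scratch file that mimics the relevant lines of hadoop-env.sh.
demo_env=$(mktemp)
printf '%s\n' '# The java implementation to use.' 'export JAVA_HOME=${JAVA_HOME}' > "$demo_env"
set_java_home "$demo_env" /usr/java/jdk1.8.0_45
cat "$demo_env"
```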
Start:
[hadoop@rzdatahadoop002 sbin]$ ./start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-rzdatahadoop002.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-rzdatahadoop002.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-rzdatahadoop002.out
10. Changes:
Change localhost in the HDFS configuration to the machine's IP:
[hadoop@rzdatahadoop002 bin]$ ../sbin/stop-dfs.sh
[hadoop@rzdatahadoop002 bin]$ vi ../etc/hadoop/core-site.xml
- After the change, reformat the NameNode (this erases the existing HDFS metadata), then restart HDFS
Make the DataNode start under the hostname:
[hadoop@rzdatahadoop002 hadoop]$ vi slaves
rzdatahadoop002
Make the SecondaryNameNode start under the hostname:
[hadoop@rzdatahadoop002 hadoop]$ vi hdfs-site.xml
- Restart the HDFS service
[hadoop@rzdatahadoop002 sbin]$ ./stop-dfs.sh
[hadoop@rzdatahadoop002 sbin]$ ./start-dfs.sh
Starting namenodes on [rzdatahadoop002]
rzdatahadoop002: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-rzdatahadoop002.out
rzdatahadoop002: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-rzdatahadoop002.out
Starting secondary namenodes [rzdatahadoop002]
rzdatahadoop002: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-rzdatahadoop002.out
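The edited file contents for step 10 are not shown in the original notes. The sketch below gives one plausible set of values that would produce the startup output above, where all three daemons start under the hostname. It uses real Hadoop 2.x property names; 50090 and 50091 are the default SecondaryNameNode ports in Hadoop 2.x, while port 9000 and the helper name write_hostname_config are assumptions. The demo writes to a scratch directory.

```shell
# Write core-site.xml, slaves and hdfs-site.xml so that every HDFS daemon is
# addressed by hostname. write_hostname_config is a hypothetical helper name.
write_hostname_config() {
    conf_dir=$1
    host=$2

    # NameNode address: the hostname replaces localhost (port 9000 is assumed).
    cat > "$conf_dir/core-site.xml" <<EOF
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://$host:9000</value>
    </property>
</configuration>
EOF

    # DataNode list: one hostname per line.
    printf '%s\n' "$host" > "$conf_dir/slaves"

    # SecondaryNameNode addresses: the hostname replaces the 0.0.0.0 default.
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>$host:50090</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>$host:50091</value>
    </property>
</configuration>
EOF
}

# Demo in a scratch directory instead of the real etc/hadoop.
demo_host_conf=$(mktemp -d)
write_hostname_config "$demo_host_conf" rzdatahadoop002
cat "$demo_host_conf/slaves"
```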
11. YARN deployment:
Edit mapred-site.xml:
[hadoop@zydatahadoop001 hadoop]$ cd /opt/software/hadoop/etc/hadoop/
[hadoop@zydatahadoop001 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@zydatahadoop001 hadoop]$ vi mapred-site.xml
Edit yarn-site.xml:
[root@rzdatahadoop002 hadoop]# vi yarn-site.xml
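The original notes do not show what goes into these two files. The sketch below writes the settings from the official Hadoop single-node guide: mapreduce.framework.name set to yarn, and the mapreduce_shuffle auxiliary service for the NodeManager. The helper name write_yarn_config is hypothetical, and the demo writes to a scratch directory rather than the real etc/hadoop.

```shell
# Write a minimal mapred-site.xml and yarn-site.xml into the given directory.
# write_yarn_config is a hypothetical helper name used only for this sketch.
write_yarn_config() {
    conf_dir=$1

    # Run MapReduce jobs on YARN instead of the local job runner.
    cat > "$conf_dir/mapred-site.xml" <<EOF
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
EOF

    # The shuffle service lets reducers fetch map output from the NodeManager.
    cat > "$conf_dir/yarn-site.xml" <<EOF
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
EOF
}

# Demo in a scratch directory instead of /opt/software/hadoop/etc/hadoop.
demo_yarn_conf=$(mktemp -d)
write_yarn_config "$demo_yarn_conf"
cat "$demo_yarn_conf/yarn-site.xml"
```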
Start the services:
[hadoop@zydatahadoop001 hadoop]$ cd /opt/software/hadoop
[hadoop@zydatahadoop001 hadoop]$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/software/hadoop-2.8.1/logs/yarn-hadoop-resourcemanager-zydatahadoop001.out
zydatahadoop001: starting nodemanager, logging to /opt/software/hadoop-2.8.1/logs/yarn-hadoop-nodemanager-zydatahadoop001.out
[hadoop@zydatahadoop001 hadoop]$ jps
24439 ResourceManager
24840 Jps
24073 SecondaryNameNode
24539 NodeManager
23788 NameNode
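A quick way to read a jps listing is to compare it against the daemons a pseudo-distributed node is expected to run. The sketch below does that. The helper name missing_daemons and the expected list are assumptions, and the sample input is the listing above, which contains no DataNode line.

```shell
# Read jps output on stdin and print the expected daemons that are absent.
# missing_daemons and the expected list are assumptions made for this sketch.
missing_daemons() {
    jps_output=$(cat)
    missing=""
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <main class>"; the second field is compared exactly,
        # so "NameNode" does not match "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            missing="$missing $daemon"
        fi
    done
    echo "${missing# }"
}

# Sample input copied from the jps listing above.
sample='24439 ResourceManager
24840 Jps
24073 SecondaryNameNode
24539 NodeManager
23788 NameNode'
result=$(printf '%s\n' "$sample" | missing_daemons)
echo "missing: ${result:-none}"
```

For the sample this prints "missing: DataNode". On a live machine the input would come from jps itself, as in jps | missing_daemons.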
Check port 8088:
[hadoop@zydatahadoop001 hadoop]$ netstat -nlp|grep 8088
(Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.)
tcp 0 0 :::8088 :::* LISTEN 24439/java
- Open the web UI: http://192.168.137.200:8088/
MR job test:
MapReduce: written in Java, built from a map function and a reduce function
[hadoop@rzdatahadoop002 hadoop]$ find ./ -name "*example*"
./share/hadoop/mapreduce/lib-examples
./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar
./share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.8.1-sources.jar
./share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.8.1-test-sources.jar
./lib/native/examples
./etc/hadoop/ssl-client.xml.example
./etc/hadoop/ssl-server.xml.example
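The notes stop at locating the examples jar. One way to exercise it is the bundled pi estimator, sketched below. The HADOOP_HOME default is taken from the paths used earlier in this walkthrough, the arguments 5 and 10 (number of map tasks and samples per map) are arbitrary illustrative values, and the job is skipped when Hadoop is not found at that location.

```shell
# Run the bundled pi example if Hadoop is installed where this walkthrough put it.
HADOOP_HOME="${HADOOP_HOME:-/opt/software/hadoop}"
EXAMPLES_JAR="$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar"

if [ -x "$HADOOP_HOME/bin/hadoop" ] && [ -f "$EXAMPLES_JAR" ]; then
    # Arguments: pi <number of map tasks> <samples per map task>
    "$HADOOP_HOME/bin/hadoop" jar "$EXAMPLES_JAR" pi 5 10
else
    echo "Hadoop not found under $HADOOP_HOME; skipping the example job"
fi
```

While the job runs, it should also appear in the web UI on port 8088.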
Stop YARN:
[hadoop@zydatahadoop001 hadoop]$ sbin/stop-yarn.sh
Source: @若泽大数据