Reposted from:
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
Installing Java
The Hadoop framework is written in Java, so a working Java installation is required.
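The installation commands were lost in extraction; a typical way to install a JDK on Ubuntu and verify it (the exact package name varies by release, so `default-jdk` here is an assumption) is:

```shell
sudo apt-get update
# Install OpenJDK; "default-jdk" resolves to the release's default JDK package
sudo apt-get install default-jdk
# Verify that Java is on the PATH
java -version
```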
Adding a dedicated Hadoop user
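The code block here did not survive extraction; a common way to create a dedicated account (using the `hduser`/`hadoop` names this guide relies on later) is:

```shell
# Create a dedicated group and user for running Hadoop
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
```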
Installing SSH
ssh has two main components:
- ssh : The command we use to connect to remote machines - the client.
- sshd : The daemon that is running on the server and allows clients to connect to the server.
The ssh client is pre-installed on most Linux systems, but in order to start the sshd daemon we need to install the ssh server package first. Use this command to do that:
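The original command was lost; on Ubuntu the usual invocation is:

```shell
# Installs both the ssh client and the sshd server daemon
sudo apt-get install ssh
```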
This will install ssh on our machine. If we get something similar to the following, we can assume it is set up properly:
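The expected output did not survive extraction; the check most guides use is to confirm that both binaries are on the PATH:

```shell
which ssh   # e.g. /usr/bin/ssh
which sshd  # e.g. /usr/sbin/sshd
```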
Create and Setup SSH Certificates
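The commands themselves were lost; the usual pair, consistent with the explanation that follows (the first generates a passwordless key, the second appends it to the authorized keys), is:

```shell
# Generate an RSA key pair with an empty passphrase
ssh-keygen -t rsa -P ""
# Add the new public key to the list of authorized keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
```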
The second command adds the newly created key to the list of authorized keys so that Hadoop can use ssh without prompting for a password.
We can check if ssh works:
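The lost check is almost certainly a local login, which should now succeed without a password prompt:

```shell
ssh localhost
```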
Install Hadoop
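The download commands were lost; assuming Hadoop 2.6.0 (the version named in the reference at the end of this guide), a sketch is:

```shell
# Download and unpack a Hadoop release; the exact version and mirror are assumptions
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar xvzf hadoop-2.6.0.tar.gz
```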
We want to move the Hadoop installation to the /usr/local/hadoop directory using the following command:
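A plausible reconstruction of the lost command (this is the step that fails with a permission error when `hduser` is not allowed to use sudo):

```shell
sudo mv hadoop-2.6.0 /usr/local/hadoop
```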
The error here is nothing to worry about; it suffices to have added hduser to the sudoers group beforehand.
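The command block was lost; on Ubuntu, adding `hduser` to the sudo group is typically done from an account that already has root privileges:

```shell
sudo adduser hduser sudo
```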
Now that hduser has sudo privileges, we can move the Hadoop installation to the /usr/local/hadoop directory without any problem:
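The lost block presumably repeats the move and then hands ownership to `hduser`; a hedged reconstruction:

```shell
sudo mv hadoop-2.6.0 /usr/local/hadoop
# Give hduser ownership of the installation tree
sudo chown -R hduser:hadoop /usr/local/hadoop
```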
Setup Configuration Files
The following files will have to be modified to complete the Hadoop setup:
- ~/.bashrc
- /usr/local/hadoop/etc/hadoop/hadoop-env.sh
- /usr/local/hadoop/etc/hadoop/core-site.xml
- /usr/local/hadoop/etc/hadoop/mapred-site.xml.template
- /usr/local/hadoop/etc/hadoop/hdfs-site.xml
~/.bashrc
Before editing the .bashrc file in our home directory, we need to find the path where Java has been installed, so that we can set the JAVA_HOME environment variable. Use the following command:
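The command was lost; the usual way to locate the active Java installation on Ubuntu is:

```shell
# Lists the installed Java alternatives and their full paths
update-alternatives --config java
```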
Now we can append the following to the end of ~/.bashrc:
Any other editor can be used here as well; for convenience, vim is preferable to vi.
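The exact block was lost; a typical set of exports for a Hadoop 2.x single-node setup follows. The JAVA_HOME path is an assumption taken from a default OpenJDK 7 install; adjust it to the output of the previous command.

```shell
# JAVA_HOME path is an assumption -- use the path reported by update-alternatives
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
```

After saving, reload the file with `source ~/.bashrc` so the variables take effect in the current shell.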
Note that JAVA_HOME should be set to the portion of the path just before '.../bin/':
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
We need to set JAVA_HOME by modifying the hadoop-env.sh file:
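The snippet was lost; the relevant line in hadoop-env.sh looks like the following (the path is again assumed from a default OpenJDK 7 install):

```shell
# In /usr/local/hadoop/etc/hadoop/hadoop-env.sh -- path is an assumption
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
```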
/usr/local/hadoop/etc/hadoop/core-site.xml
The /usr/local/hadoop/etc/hadoop/core-site.xml file contains configuration properties that Hadoop uses when starting up.
This file can be used to override the default settings that Hadoop starts with.
Open the file and enter the following in between the <configuration></configuration> tags:
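The XML was lost in extraction; a minimal single-node configuration, using the Hadoop 2.x default property name and port as assumptions, is:

```xml
<!-- Points HDFS clients at the local NameNode; port 9000 is an assumption -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>
```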
/usr/local/hadoop/etc/hadoop/mapred-site.xml
By default, the /usr/local/hadoop/etc/hadoop/ folder contains
/usr/local/hadoop/etc/hadoop/mapred-site.xml.template
file which has to be renamed/copied with the name mapred-site.xml:
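The lost command is presumably a copy of the template under the expected name:

```shell
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
```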
The mapred-site.xml file is used to specify which framework is being used for MapReduce.
We need to enter the following content in between the <configuration></configuration> tags:
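The XML did not survive extraction; the standard Hadoop 2.x property that tells MapReduce to run on YARN is:

```xml
<!-- Run MapReduce jobs on the YARN framework -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```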
/usr/local/hadoop/etc/hadoop/hdfs-site.xml
The /usr/local/hadoop/etc/hadoop/hdfs-site.xml file needs to be configured for each host in the cluster that is being used.
It is used to specify the directories which will be used as the namenode and the datanode on that host.
Before editing this file, we need to create two directories which will contain the namenode and the datanode for this Hadoop installation.
This can be done using the following commands:
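The commands were lost; a reconstruction using the /usr/local/hadoop_store paths this guide refers to later:

```shell
# Create storage directories for the namenode and datanode
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
# Make hduser the owner of the new storage tree
sudo chown -R hduser:hadoop /usr/local/hadoop_store
```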
Open the file and enter the following content in between the <configuration></configuration> tags:
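The XML was lost; a sketch for a single-node setup, with replication set to 1 (only one datanode exists) and the directories created above (property names follow Hadoop 2.x conventions and are assumptions here):

```xml
<!-- Single node, so only one copy of each block -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
```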
Format the New Hadoop Filesystem
Now the Hadoop file system needs to be formatted so that we can start to use it. The format command must be run by a user with write permission, since it creates a current directory under the /usr/local/hadoop_store/hdfs/namenode folder:
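The command itself was lost but is named in the note that follows:

```shell
hadoop namenode -format
```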
Note that the hadoop namenode -format command should be executed only once, before we start using Hadoop.
If this command is executed again after Hadoop has been used, it will destroy all the data on the Hadoop file system.
Starting Hadoop
Now it’s time to start the newly installed single node cluster.
We can use start-all.sh or (start-dfs.sh and start-yarn.sh)
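The lost block is presumably one of the two invocations mentioned above:

```shell
start-all.sh
# or, equivalently:
# start-dfs.sh
# start-yarn.sh
```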
We can check if it’s really up and running:
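The check was lost; the usual one is jps (shipped with the JDK), which lists the running Java processes and should show daemons such as NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager:

```shell
jps
```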
The output means that we now have a functional instance of Hadoop running on our VPS (Virtual private server).
Another way to check is using netstat:
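The invocation was lost; a common form, filtering the listening sockets down to the Java daemons (the flags show listening TCP/UDP ports with owning process IDs):

```shell
netstat -plten | grep java
```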
Stopping Hadoop
We run stop-all.sh or (stop-dfs.sh and stop-yarn.sh) to stop all the daemons running on our machine:
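The lost block mirrors the start-up step:

```shell
stop-all.sh
# or, equivalently:
# stop-dfs.sh
# stop-yarn.sh
```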
Hadoop Web Interfaces
http://localhost:50070/ - web UI of the NameNode daemon
References
[1] Hadoop 2.6 Installing on Ubuntu 14.04 (Single-Node Cluster) - 2015