Tuesday, November 8, 2016

Setting up Hadoop to run on a Single Node in Ubuntu 15.04

This is tested on hadoop-2.7.3.

This guide improves on the official Hadoop documentation: http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html

Step 1 

Make sure Java is installed

Installation instructions: http://suhothayan.blogspot.com/2010/02/how-to-set-javahome-in-ubuntu.html
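
To confirm the installation, you can check the Java version and JAVA_HOME (a quick sanity check; Hadoop 2.7.x requires Java 7 or later):

$ java -version
$ echo $JAVA_HOME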

Step 2

Install the prerequisites

$ sudo apt-get install ssh
$ sudo apt-get install rsync
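
To verify that both packages are available, you can check their versions:

$ ssh -V
$ rsync --version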

Step 3

Set up Hadoop

$ gedit hadoop-2.7.3/etc/hadoop/core-site.xml

Add the following (replace {user-name} with your system username, e.g. "foo" for /home/foo/):

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.proxyuser.{user-name}.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.{user-name}.hosts</name>
        <value>*</value>
    </property>
</configuration>

$ gedit hadoop-2.7.3/etc/hadoop/hdfs-site.xml 

Add the following:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
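
As a quick sanity check, you can verify that both files are still well-formed XML, for example with xmllint (it may need to be installed first from the libxml2-utils package):

$ xmllint --noout hadoop-2.7.3/etc/hadoop/core-site.xml
$ xmllint --noout hadoop-2.7.3/etc/hadoop/hdfs-site.xml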

Step 4

Run

$ ssh localhost 

If it prompts for a password, run:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys

Try ssh localhost again.
If it still asks for a password, run the following and try again:

$ ssh-keygen -t rsa
# Press Enter at each prompt
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod og-wx ~/.ssh/authorized_keys 
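
To confirm that passwordless login now works, you can make SSH fail instead of prompting:

$ ssh -o BatchMode=yes localhost echo "passwordless ssh OK"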

Step 5

Format the NameNode

$ ./hadoop-2.7.3/bin/hdfs namenode -format

Step 6 (not covered in the Hadoop documentation)

Replace ${JAVA_HOME} with a hardcoded path in hadoop-env.sh

$ gedit hadoop-2.7.3/etc/hadoop/hadoop-env.sh

Edit the file as follows (replace {path} with the directory where your JDK is installed):

# The java implementation to use.
export JAVA_HOME={path}/jdk1.8.0_111
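
If you are unsure of the JDK path on your machine, one way to find the directory of the java binary currently on your PATH is:

$ readlink -f $(which java) | sed "s:/bin/java::"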

Step 7

Start Hadoop 

$ ./hadoop-2.7.3/sbin/start-all.sh

The Hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).

Browse the web interface for the NameNode:

http://localhost:50070/
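
You can also check from the command line (assuming curl is installed) that the NameNode web UI is responding:

$ curl -sf http://localhost:50070/ > /dev/null && echo "NameNode web UI is up"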

Step 8

Check which Hadoop processes are running:

$ jps

Expected output (process IDs will vary):

xxxxx NameNode
xxxxx ResourceManager
xxxxx DataNode
xxxxx NodeManager
xxxxx SecondaryNameNode
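
If any of these daemons is missing, the corresponding log file under the logs directory usually explains why. For example, to inspect the NameNode log (the exact file name includes your username and hostname):

$ tail -n 50 hadoop-2.7.3/logs/hadoop-*-namenode-*.log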

Step 9

Create the HDFS directories required for MapReduce jobs (replace {user-name} with your system username):

$ ./hadoop-2.7.3/bin/hdfs dfs -mkdir /user
$ ./hadoop-2.7.3/bin/hdfs dfs -mkdir /user/{user-name}
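
To verify the setup end to end, you can run the grep example job from the official documentation (the jar name below assumes the hadoop-2.7.3 distribution, and the commands are run from the directory containing hadoop-2.7.3; relative HDFS paths like input resolve to /user/{user-name}, created above):

$ ./hadoop-2.7.3/bin/hdfs dfs -put hadoop-2.7.3/etc/hadoop input
$ ./hadoop-2.7.3/bin/hadoop jar hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'
$ ./hadoop-2.7.3/bin/hdfs dfs -cat output/*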

