How to Install Hadoop in Fully Distributed Mode

Published: 2021-11-12 14:01:37 Source: Yisu Cloud (億速云) Views: 149 Author: Xiaoxin (小新) Category: Cloud Computing

This article walks through installing Hadoop in fully distributed mode. It is shared as a reference for readers who have not set up a multi-node cluster before.

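  Steps for Installing Hadoop in Fully Distributed Mode

  Hadoop Modes

  Standalone mode: simple to install and needs almost no configuration, but is suitable only for debugging.

  Pseudo-distributed mode: runs all five daemons (NameNode, DataNode, JobTracker, TaskTracker, and Secondary NameNode) on a single node to simulate the separate nodes of a distributed cluster.

  Fully distributed mode: a normal Hadoop cluster made up of several nodes, each with its own role.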

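  Installation Environment

  Platform: vmware2

  Operating system: oracle linux 5.6

  Software versions: hadoop-0.20.2 (the release that appears in every transcript below), jdk-6u18

  Cluster layout: 3 nodes, one master node (gc) and two slave nodes (rac1, rac2)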

  Installation Steps

  1.        Download Hadoop and the JDK:

  For example: hadoop-0.20.2

  2.        Configure the hosts file

  On every node (gc, rac1, rac2), edit /etc/hosts so that each host can resolve the others' hostnames to IP addresses.

  [root@gc ~]# cat /etc/hosts

  # Do not remove the following line, or various programs

  # that require network functionality will fail.

  127.0.0.1               localhost.localdomain localhost

  ::1             localhost6.localdomain6 localhost6

  192.168.2.101           rac1.localdomain rac1

  192.168.2.102           rac2.localdomain rac2

  192.168.2.100           gc.localdomain gc
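A quick way to confirm the hosts entries before moving on can be sketched as below. This is a minimal sketch, not part of the original procedure: `resolve_in_hosts` and `check_cluster_hosts` are hypothetical helper names, and the check only inspects a hosts-format file (it does not query DNS).

```shell
#!/bin/sh
# Sketch: confirm that every cluster hostname appears in a hosts-format file.
# resolve_in_hosts and check_cluster_hosts are hypothetical helpers.

# Print the IP that a hosts-format file maps NAME to (first match wins).
# Usage: resolve_in_hosts NAME [HOSTS_FILE]
resolve_in_hosts() {
    name="$1"
    hosts_file="${2:-/etc/hosts}"
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }   # skip comment lines
        {
            # field 1 is the IP, the remaining fields are names and aliases
            for (i = 2; i <= NF; i++)
                if ($i == n) { print $1; exit }
        }' "$hosts_file"
}

# Report each node's mapping; return non-zero if any node is missing.
# Usage: check_cluster_hosts HOSTS_FILE NODE...
check_cluster_hosts() {
    hosts_file="$1"
    shift
    missing=0
    for node in "$@"; do
        ip=$(resolve_in_hosts "$node" "$hosts_file")
        if [ -n "$ip" ]; then
            echo "$node -> $ip"
        else
            echo "$node: not found in $hosts_file" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

On each node this could be run as `check_cluster_hosts /etc/hosts gc rac1 rac2`; a non-zero exit status means at least one name is missing from the file.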

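  3.        Create the Hadoop run account

  Create the account that will run Hadoop on every node.

  [root@gc ~]# groupadd hadoop

  --Note: the author found that the group must be specified here; otherwise SSH trust may fail to be established

  [root@gc ~]# useradd -g hadoop grid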

  [root@gc ~]# id grid

  uid=501(grid) gid=54326(hadoop) groups=54326(hadoop)

  [root@gc ~]# passwd grid

  Changing password for user grid.

  New UNIX password:

  BAD PASSWORD: it is too short

  Retype new UNIX password:

  passwd: all authentication tokens updated successfully.

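  4.        Configure passwordless SSH

  Log in as the Hadoop run account and work in that account's home directory. (The key-generation transcript below was captured with an account named hadoop; the rest of the article uses grid.)

  Perform the same operations on every node.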

  [hadoop@gc ~]$ ssh-keygen -t rsa

  Generating public/private rsa key pair.

  Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):

  Created directory '/home/hadoop/.ssh'.

  Enter passphrase (empty for no passphrase):

  Enter same passphrase again:

  Your identification has been saved in /home/hadoop/.ssh/id_rsa.

  Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.

  The key fingerprint is:

  54:80:fd:77:6b:87:97:ce:0f:32:34:43:d1:d2:c2:0d hadoop@gc.localdomain

  [hadoop@gc ~]$ cd .ssh

  [hadoop@gc .ssh]$ ls

  id_rsa  id_rsa.pub

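  Add every node's public key to the authorized_keys file on every other node; after that the nodes can SSH into one another without a password.

  All of this can be done from a single node (gc).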

  [hadoop@gc .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

  [hadoop@gc .ssh]$ ssh rac1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

  The authenticity of host 'rac1 (192.168.2.101)' can't be established.

  RSA key fingerprint is 19:48:e0:0a:37:e1:2a:d5:ba:c8:7e:1b:37:c6:2f:0e.

  Are you sure you want to continue connecting (yes/no)? yes

  Warning: Permanently added 'rac1,192.168.2.101' (RSA) to the list of known hosts.

  hadoop@rac1's password:

  [hadoop@gc .ssh]$ ssh rac2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

  The authenticity of host 'rac2 (192.168.2.102)' can't be established.

  RSA key fingerprint is 19:48:e0:0a:37:e1:2a:d5:ba:c8:7e:1b:37:c6:2f:0e.

  Are you sure you want to continue connecting (yes/no)? yes

  Warning: Permanently added 'rac2,192.168.2.102' (RSA) to the list of known hosts.

  hadoop@rac2's password:

  [hadoop@gc .ssh]$ scp ~/.ssh/authorized_keys rac1:~/.ssh/authorized_keys

  hadoop@rac1's password:

  authorized_keys                                                                                                            100% 1213     1.2KB/s   00:00   

  [hadoop@gc .ssh]$ scp ~/.ssh/authorized_keys rac2:~/.ssh/authorized_keys

  hadoop@rac2's password:

  authorized_keys                                                                                                            100% 1213     1.2KB/s   00:00   

  [hadoop@gc .ssh]$ ll

  總計 16

  -rw-rw-r-- 1 hadoop hadoop 1213 10-30 09:18 authorized_keys

  -rw------- 1 hadoop hadoop 1675 10-30 09:05 id_rsa

  -rw-r--r-- 1 hadoop hadoop  403 10-30 09:05 id_rsa.pub

  --Test each connection

  [grid@gc .ssh]$ ssh rac1 date

  Sun Nov 18 01:35:39 CST 2012

  [grid@gc .ssh]$ ssh rac2 date

  Tue Oct 30 09:52:46 CST 2012

  --This step is the same as establishing SSH user equivalence when configuring Oracle RAC.
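The listing above shows authorized_keys as group-writable (-rw-rw-r--). When sshd runs with StrictModes enabled, it can refuse key-based logins for files with such permissions, so tightening them is a common precaution. The following is a minimal sketch under that assumption; `secure_ssh_dir` and `check_passwordless` are hypothetical helper names, not commands from the original article.

```shell
#!/bin/sh
# Sketch: tighten SSH file permissions and verify key-based login.
# Both functions are hypothetical helpers.

# Restrict the SSH directory and authorized_keys to the owner only.
# Usage: secure_ssh_dir [SSH_DIR]
secure_ssh_dir() {
    ssh_dir="${1:-$HOME/.ssh}"
    chmod 700 "$ssh_dir" || return 1
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys" || return 1
    fi
}

# Try a non-interactive login to each node. BatchMode=yes makes ssh fail
# instead of prompting, so a password prompt counts as a failure.
# Usage: check_passwordless NODE...
check_passwordless() {
    failed=0
    for node in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true; then
            echo "$node: ok"
        else
            echo "$node: key-based login failed" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

A possible sequence is `secure_ssh_dir` on every node followed by `check_passwordless gc rac1 rac2` from the master. Because BatchMode also suppresses the host-key confirmation prompt, a host that has not yet been added to known_hosts is reported as a failure.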

  5.        Unpack the Hadoop package

  --Unpack and configure the files on one node first

  [grid@gc ~]$ ll

  total 43580

  -rw-r--r-- 1 grid hadoop 44575568 2012-11-19 hadoop-0.20.2.tar.gz

  [grid@gc ~]$ tar xzvf /home/grid/hadoop-0.20.2.tar.gz

  [grid@gc ~]$ ll

  total 43584

  drwxr-xr-x 12 grid hadoop     4096 2010-02-19 hadoop-0.20.2

  -rw-r--r--  1 grid hadoop 44575568 2012-11-19 hadoop-0.20.2.tar.gz

  --Install the JDK on every node

  [root@gc ~]# ./jdk-6u18-linux-x64-rpm.bin

  6.         Configure the Hadoop files

  ■       Configure hadoop-env.sh

  [root@gc conf]# pwd

  /root/hadoop-0.20.2/conf

  --Set the JDK installation path

  [root@gc conf]# vi hadoop-env.sh

  export JAVA_HOME=/usr/java/jdk1.6.0_18

  ■       Configure the NameNode by editing the site files

  --Edit core-site.xml

  [grid@gc conf]$ vi core-site.xml

  <?xml version="1.0"?>

  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <!-- Put site-specific property overrides in this file. -->

  <configuration>

  <property>

  <name>fs.default.name</name>

  <value>hdfs://192.168.2.100:9000</value> <!-- Note: per the author, fully distributed mode requires an IP address here; the same applies below -->

  </property>

  </configuration>

  Note: fs.default.name is the IP address and port of the NameNode.
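Because the three site files share the same layout, they can also be generated rather than edited by hand. The following is a minimal sketch, assuming property names and values contain no characters that need XML escaping; `write_site_xml` is a hypothetical helper, not a Hadoop tool.

```shell
#!/bin/sh
# Sketch: write a minimal Hadoop *-site.xml from name/value pairs.
# write_site_xml is a hypothetical helper; values are written verbatim,
# so they must not contain &, < or >.
# Usage: write_site_xml FILE NAME1 VALUE1 [NAME2 VALUE2 ...]
write_site_xml() {
    out="$1"
    shift
    {
        printf '%s\n' '<?xml version="1.0"?>'
        printf '%s\n' '<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>'
        printf '%s\n' '<configuration>'
        # Consume the remaining arguments two at a time.
        while [ "$#" -ge 2 ]; do
            printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
            shift 2
        done
        printf '%s\n' '</configuration>'
    } > "$out"
}
```

For the file shown above, a call might look like `write_site_xml core-site.xml fs.default.name hdfs://192.168.2.100:9000`. Generating the files from one script keeps the values identical on every node.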

  --Edit hdfs-site.xml

  [grid@gc conf]$ vi hdfs-site.xml

  <?xml version="1.0"?>

  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <!-- Put site-specific property overrides in this file. -->

  <configuration>

  <property>

  <name>dfs.data.dir</name>

  <value>/home/grid/hadoop-0.20.2/data</value> <!-- Note: this directory must already exist and be readable and writable -->

  </property>

  <property>

  <name>dfs.replication</name>

  <value>2</value>

  </property>

  </configuration>

  [Table image not preserved: commonly used configuration parameters in hdfs-site.xml]

  --Edit mapred-site.xml

  [grid@gc conf]$ vi mapred-site.xml

  <?xml version="1.0"?>

  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <!-- Put site-specific property overrides in this file. -->

  <configuration>

  <property>

  <name>mapred.job.tracker</name>

  <value>192.168.2.100:9001</value>

  </property>

  </configuration>

  [Table image not preserved: commonly used configuration parameters in mapred-site.xml]

  ■       Configure the masters and slaves files

  [grid@gc conf]$ vi masters

  gc

  [grid@gc conf]$ vi slaves

  rac1

  rac2

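  ■       Copy Hadoop to each node

  --Copy the configured Hadoop directory from the gc host to each of the other nodes

  --Note: fs.default.name and mapred.job.tracker must keep pointing at the master (192.168.2.100) on every node, so the copied site files need no per-node IP changes

  [grid@gc conf]$ scp -r /home/grid/hadoop-0.20.2 rac1:/home/grid/

  [grid@gc conf]$ scp -r /home/grid/hadoop-0.20.2 rac2:/home/grid/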
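With more than a couple of slaves, the copy can be driven from the slaves file instead of typing one scp per host. The following is a minimal sketch; `distribute_hadoop` is a hypothetical helper, and it defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch: copy the Hadoop directory to every host listed in a slaves file.
# distribute_hadoop is a hypothetical helper.
# Usage: distribute_hadoop SRC_DIR SLAVES_FILE DEST_DIR [run]
# Without the fourth argument "run", the commands are only printed.
distribute_hadoop() {
    src_dir="$1"
    slaves_file="$2"
    dest_dir="$3"
    mode="${4:-dry-run}"
    # Drop blank lines and comment lines, then handle one host per line.
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$slaves_file" |
    while read -r host; do
        if [ "$mode" = "run" ]; then
            # Redirect stdin so scp cannot swallow the remaining host list.
            scp -r "$src_dir" "$host:$dest_dir" </dev/null
        else
            echo "scp -r $src_dir $host:$dest_dir"
        fi
    done
}
```

A dry run such as `distribute_hadoop /home/grid/hadoop-0.20.2 /home/grid/hadoop-0.20.2/conf/slaves /home/grid/` prints one scp line per slave; appending `run` performs the copies.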

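  7.         Format the NameNode

  --The author ran the format on every node; strictly, only the master node (gc), which hosts the NameNode, needs it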

  [grid@rac2 bin]$ pwd

  /home/grid/hadoop-0.20.2/bin

  [grid@gc bin]$ ./hadoop namenode -format

  12/10/31 08:03:31 INFO namenode.NameNode: STARTUP_MSG:

  /************************************************************

  STARTUP_MSG: Starting NameNode

  STARTUP_MSG:   host = gc.localdomain/192.168.2.100

  STARTUP_MSG:   args = [-format]

  STARTUP_MSG:   version = 0.20.2

  STARTUP_MSG:   build = ; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010

  ************************************************************/

  12/10/31 08:03:31 INFO namenode.FSNamesystem: fsOwner=grid,hadoop

  12/10/31 08:03:31 INFO namenode.FSNamesystem: supergroup=supergroup

  12/10/31 08:03:31 INFO namenode.FSNamesystem: isPermissionEnabled=true

  12/10/31 08:03:32 INFO common.Storage: Image file of size 94 saved in 0 seconds.

  12/10/31 08:03:32 INFO common.Storage: Storage directory /tmp/hadoop-grid/dfs/name has been successfully formatted.

  12/10/31 08:03:32 INFO namenode.NameNode: SHUTDOWN_MSG:

  /************************************************************

  SHUTDOWN_MSG: Shutting down NameNode at gc.localdomain/192.168.2.100

  ************************************************************/

  8.         Start Hadoop

  --Start the Hadoop daemons from the master node

  [grid@gc bin]$ pwd

  /home/grid/hadoop-0.20.2/bin

  [grid@gc bin]$ ./start-all.sh

  starting namenode, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-namenode-gc.localdomain.out

  rac2: starting datanode, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-datanode-rac2.localdomain.out

  rac1: starting datanode, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-datanode-rac1.localdomain.out

  The authenticity of host 'gc (192.168.2.100)' can't be established.

  RSA key fingerprint is 8e:47:42:44:bd:e2:28:64:10:40:8e:b5:72:f9:6c:82.

  Are you sure you want to continue connecting (yes/no)? yes

  gc: Warning: Permanently added 'gc,192.168.2.100' (RSA) to the list of known hosts.

  gc: starting secondarynamenode, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-secondarynamenode-gc.localdomain.out

  starting jobtracker, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-jobtracker-gc.localdomain.out

  rac2: starting tasktracker, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-tasktracker-rac2.localdomain.out

  rac1: starting tasktracker, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-tasktracker-rac1.localdomain.out

  9.        Use jps to check that the daemons started

  --Check the processes on the master node

  [grid@gc bin]$ /usr/java/jdk1.6.0_18/bin/jps

  27462 NameNode

  29012 Jps

  27672 JobTracker

  27607 SecondaryNameNode

  --Check the processes on the slave nodes

  [grid@rac1 conf]$ /usr/java/jdk1.6.0_18/bin/jps

  16722 Jps

  16672 TaskTracker

  16577 DataNode

  [grid@rac2 conf]$ /usr/java/jdk1.6.0_18/bin/jps

  31451 DataNode

  31547 TaskTracker

  31608 Jps
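The manual jps inspection above can be turned into a pass/fail check. The following is a minimal sketch, assuming jps prints one "PID Name" pair per line as in the transcripts; `check_daemons` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: verify that the expected daemons appear in jps output.
# check_daemons is a hypothetical helper. It reads jps-style output
# ("PID Name" per line) on standard input.
# Usage: jps | check_daemons NAME...
check_daemons() {
    jps_output=$(cat)
    status=0
    for daemon in "$@"; do
        # Match the second field exactly so NameNode does not also
        # match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon" >&2
            status=1
        fi
    done
    return "$status"
}
```

On the master this might be invoked as `/usr/java/jdk1.6.0_18/bin/jps | check_daemons NameNode SecondaryNameNode JobTracker`, and for a slave as `ssh rac1 /usr/java/jdk1.6.0_18/bin/jps | check_daemons DataNode TaskTracker`.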

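  10.     Problems encountered during installation

  1)        SSH trust could not be established

  When the user was created without specifying a group, as in the steps below, SSH trust could not be established.

  [root@gc ~]# useradd grid

  [root@gc ~]# passwd grid

  Solution:

  Create a new user group and specify it when creating the user.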

  [root@gc ~]# groupadd hadoop

  [root@gc ~]# useradd -g hadoop grid

  [root@gc ~]# id grid

  uid=501(grid) gid=54326(hadoop) groups=54326(hadoop)

  [root@gc ~]# passwd grid

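  2)        No DataNode process on the slave nodes after starting Hadoop

  Symptom:

  After Hadoop is started from the master node, the master's processes are normal, but the slave nodes have no DataNode process.

  --The master node is normal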

  [grid@gc bin]$ /usr/java/jdk1.6.0_18/bin/jps
29843 Jps
29703 JobTracker
29634 SecondaryNameNode
29485 NameNode

--Checking the two slave nodes shows that the DataNode process is missing
[grid@rac1 bin]$ /usr/java/jdk1.6.0_18/bin/jps
5528 Jps
3213 TaskTracker

[grid@rac2 bin]$ /usr/java/jdk1.6.0_18/bin/jps
30518 TaskTracker
30623 Jps

  Cause:

  --Review the output printed when Hadoop was started on the master node, then open the DataNode startup log on the slave node

  [grid@rac2 logs]$ pwd

  /home/grid/hadoop-0.20.2/logs

  [grid@rac1 logs]$ more hadoop-grid-datanode-rac1.localdomain.log

  /************************************************************

  STARTUP_MSG: Starting DataNode

  STARTUP_MSG:   host = rac1.localdomain/192.168.2.101

  STARTUP_MSG:   args = []

  STARTUP_MSG:   version = 0.20.2

  STARTUP_MSG:   build = ; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010

  ************************************************************/

  2012-11-18 07:43:33,513 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: can not create directory: /usr/hadoop-0.20.2/data

  2012-11-18 07:43:33,513 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.

  2012-11-18 07:43:33,571 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

  /************************************************************

  SHUTDOWN_MSG: Shutting down DataNode at rac1.localdomain/192.168.2.101

  ************************************************************/

  --The data directory named in hdfs-site.xml had not been created

  Solution:

  Create the HDFS data directory on every node and update the dfs.data.dir parameter in hdfs-site.xml to match.

  [grid@gc ~]$ mkdir -p /home/grid/hadoop-0.20.2/data

  [grid@gc conf]$ vi hdfs-site.xml

  <?xml version="1.0"?>

  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <!-- Put site-specific property overrides in this file. -->

  <configuration>

  <property>

  <name>dfs.data.dir</name>

  <value>/home/grid/hadoop-0.20.2/data</value> <!-- Note: this directory must already exist and be readable and writable -->

  </property>

  <property>

  <name>dfs.replication</name>

  <value>2</value>

  </property>

  </configuration>

  --Restart Hadoop; the slave processes are now normal

  [grid@gc bin]$ ./stop-all.sh

  [grid@gc bin]$ ./start-all.sh
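To catch this problem before starting the daemons, the data directory can be checked on each node in advance. The following is a minimal sketch; `ensure_data_dir` is a hypothetical helper, and the path passed to it is assumed to be the same value configured as dfs.data.dir.

```shell
#!/bin/sh
# Sketch: create the DataNode directory if needed and confirm that the
# current user can read, write, and enter it.
# ensure_data_dir is a hypothetical helper.
# Usage: ensure_data_dir DIR
ensure_data_dir() {
    dir="$1"
    mkdir -p "$dir" || return 1
    if [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "ok: $dir"
    else
        echo "not usable: $dir" >&2
        return 1
    fi
}
```

Run as the grid user on each node, for example `ensure_data_dir /home/grid/hadoop-0.20.2/data`. A failure here corresponds to the "Invalid directory in dfs.data.dir" warning in the DataNode log above.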

That is everything in "How to Install Hadoop in Fully Distributed Mode". Thank you for reading; hopefully the walkthrough proves useful.
