Install a Hadoop Multinode Cluster Using CDH4 on RHEL/CentOS 6.5


Hadoop is an open source programming framework developed by Apache for processing big data. It uses HDFS (Hadoop Distributed File System) to store data in a distributed manner across all the datanodes in the cluster, and the MapReduce model to process that data.

The Namenode (NN) is the master daemon that controls HDFS, and the Jobtracker (JT) is the master daemon for the MapReduce engine.

In this tutorial I am using two CentOS 6.3 VMs, “master” and “node” (master and node are my hostnames). The “master” IP is 172.21.17.175 and the node IP is “172.21.17.188”. The following instructions also work on RHEL/CentOS 6.x versions.

 hostname

master
 ifconfig|grep 'inet addr'|head -1

inet addr:172.21.17.175  Bcast:172.21.19.255  Mask:255.255.252.0
 hostname

node
 ifconfig|grep 'inet addr'|head -1

inet addr:172.21.17.188  Bcast:172.21.19.255  Mask:255.255.252.0

First, if you do not have DNS set up, make sure that all the cluster hosts are listed in the “/etc/hosts” file (on each node).

 cat /etc/hosts

172.21.17.175 master
172.21.17.188 node
 cat /etc/hosts

172.21.17.197 qabox
172.21.17.176 ansible-ground
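
A quick way to confirm this is to check a hosts file for every cluster host name. The following is a minimal sketch; check_hosts is my own helper name and is not part of any Hadoop or CDH tool.

```shell
# check_hosts FILE HOST...
# Prints "missing: HOST" for every HOST that has no entry in FILE and
# returns non-zero if at least one host is missing.
check_hosts() (
    file=$1
    shift
    missing=0
    for host in "$@"; do
        # Skip comment lines; look for the name in any column after the IP.
        if ! awk -v h="$host" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                END { exit found ? 0 : 1 }' "$file"; then
            echo "missing: $host"
            missing=1
        fi
    done
    exit "$missing"
)
```

For example, `check_hosts /etc/hosts master node` prints nothing and returns success when both names have an entry in the file.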

Installing the Hadoop Multinode Cluster on CentOS

We use the official CDH repository to install CDH4 on all the hosts (Master and Node) in the cluster.

Go to the official CDH download page and grab the CDH4 (i.e. 4.6) version, or use the following wget command to download the repository package and install it.

## on 32-bit System ##

# wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/i386/cloudera-cdh-4-0.i386.rpm
# yum --nogpgcheck localinstall cloudera-cdh-4-0.i386.rpm
## on 64-bit System ##

# wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
# yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm

Before installing the Hadoop multinode cluster, add the Cloudera Public GPG Key to your repository by running one of the following commands, according to your system architecture.

## on 32-bit System ##

# rpm --import http://archive.cloudera.com/cdh4/redhat/6/i386/cdh/RPM-GPG-KEY-cloudera
## on 64-bit System ##

# rpm --import http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
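
If you script this step, the architecture can be taken from `uname -m` instead of being chosen by hand. This is a small sketch that assumes the URL layout shown above; the function names are hypothetical helpers of my own.

```shell
# cdh4_arch [MACHINE]: map a `uname -m` value to the directory name
# used in the Cloudera URLs (i386 or x86_64). Defaults to this machine.
cdh4_arch() {
    machine=${1:-$(uname -m)}
    case "$machine" in
        x86_64)              echo x86_64 ;;
        i386|i486|i586|i686) echo i386 ;;
        *) echo "unsupported architecture: $machine" >&2; return 1 ;;
    esac
}

# cdh4_repo_rpm_url [MACHINE]: URL of the one-click-install repository RPM.
cdh4_repo_rpm_url() {
    arch=$(cdh4_arch "${1:-}") || return 1
    echo "http://archive.cloudera.com/cdh4/one-click-install/redhat/6/$arch/cloudera-cdh-4-0.$arch.rpm"
}

# cdh4_gpg_key_url [MACHINE]: URL of the Cloudera public GPG key.
cdh4_gpg_key_url() {
    arch=$(cdh4_arch "${1:-}") || return 1
    echo "http://archive.cloudera.com/cdh4/redhat/6/$arch/cdh/RPM-GPG-KEY-cloudera"
}
```

With these defined, `wget "$(cdh4_repo_rpm_url)"` and `rpm --import "$(cdh4_gpg_key_url)"` pick the matching URL on either architecture.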

Next, run the following commands to install and set up the JobTracker and NameNode on the Master server.

 yum clean all 
 yum install hadoop-0.20-mapreduce-jobtracker
 yum clean all
 yum install hadoop-hdfs-namenode

Again on the Master server, run the following commands to set up the secondary name node.

 yum clean all 
 yum install hadoop-hdfs-secondarynamenode

Next, set up the tasktracker and datanode on all cluster hosts except the JobTracker, NameNode, and Secondary (or Standby) NameNode hosts (in this case, on node).

 yum clean all
 yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode

You can install the Hadoop client on a separate machine (in this case I installed it on the datanode, but any machine will do).

 yum install hadoop-client

Now that the steps above are done, let's move on to deploying HDFS (to be done on all the nodes).

Copy the default configuration to a new directory under /etc/hadoop (on each node in the cluster).

 cp -r /etc/hadoop/conf.dist /etc/hadoop/conf.my_cluster

Use the alternatives command to set your custom directory as the active configuration, as follows (on each node in the cluster).

 alternatives --verbose --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.my_cluster 50
reading /var/lib/alternatives/hadoop-conf

 alternatives --set hadoop-conf /etc/hadoop/conf.my_cluster

Now open the “core-site.xml” file and update “fs.defaultFS” on each node in the cluster.

 cat /etc/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
 <name>fs.defaultFS</name>
 <value>hdfs://master/</value>
</property>
</configuration>

Next, update “dfs.permissions.superusergroup” in hdfs-site.xml on each node in the cluster.

 cat /etc/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
     <name>dfs.name.dir</name>
     <value>/var/lib/hadoop-hdfs/cache/hdfs/dfs/name</value>
  </property>
  <property>
     <name>dfs.permissions.superusergroup</name>
     <value>hadoop</value>
  </property>
</configuration>

Note: Make sure that the above configuration is present on all the nodes (do it on one node and run scp to copy it to the rest of the nodes).
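
Copying the file to the rest of the nodes can be wrapped in a small loop. This is only a sketch: push_conf and the DRY_RUN variable are names I made up for illustration, and it assumes that scp to each host already works for the current user.

```shell
# push_conf FILE HOST...: copy FILE into /etc/hadoop/conf/ on each HOST.
# With DRY_RUN=1 the scp commands are only printed, which is a safe first run.
push_conf() (
    file=$1
    shift
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp $file $host:/etc/hadoop/conf/"
        else
            # Stop at the first host that cannot be reached.
            scp "$file" "$host:/etc/hadoop/conf/" || exit 1
        fi
    done
)
```

For example, `DRY_RUN=1 push_conf /etc/hadoop/conf/core-site.xml node` shows the command that would be run for the node host in this article.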

Update “dfs.name.dir” or “dfs.namenode.name.dir” in “hdfs-site.xml” on the NameNode (Master), and “dfs.datanode.data.dir” on the DataNode (Node). Change the values as shown below.

 cat /etc/hadoop/conf/hdfs-site.xml
<property>
 <name>dfs.namenode.name.dir</name>
 <value>file:///data/1/dfs/nn,/nfsmount/dfs/nn</value>
</property>
 cat /etc/hadoop/conf/hdfs-site.xml
<property>
 <name>dfs.datanode.data.dir</name>
 <value>file:///data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>

Run the following commands to create the directory structure and manage user permissions on the Namenode (Master) and Datanode (Node) machines.

 mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn
 chmod 700 /data/1/dfs/nn /nfsmount/dfs/nn
  mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
  chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
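
The directory list can also be derived from the configured value rather than typed again, which keeps the created directories in step with hdfs-site.xml. This is a hedged sketch: dirs_from_value and make_dirs are hypothetical helper names, and the ROOT argument exists only so the function can be rehearsed under a scratch directory.

```shell
# dirs_from_value VALUE: split a comma-separated Hadoop directory list
# into one local path per line, dropping an optional file:// prefix.
dirs_from_value() {
    printf '%s\n' "$1" | tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
            -e 's|^file://||' -e '/^$/d'
}

# make_dirs ROOT VALUE: create every directory from VALUE below ROOT.
# Pass an empty ROOT ("") to create the real paths.
make_dirs() (
    root=$1
    dirs_from_value "$2" | while IFS= read -r dir; do
        mkdir -p "$root$dir" || exit 1
    done
)
```

For example, `make_dirs "" 'file:///data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn'` creates the three datanode directories; ownership still has to be set with chown as shown above.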

Format the Namenode (on Master) by issuing the following command.

 sudo -u hdfs hdfs namenode -format

Add the following property to the hdfs-site.xml file on the Master and replace the value as shown.

<property>
  <name>dfs.namenode.http-address</name>
  <value>172.21.17.175:50070</value>
  <description>
    The address and port on which the NameNode UI will listen.
  </description>
</property>

Note: In our case the value should be the IP address of the master VM.
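
To confirm that a property really holds the expected value on a node, it can be read back from the file. The sketch below is a line-based awk reader, not a real XML parser: it assumes each name and value element sits on a single line, as in the listings in this article. hadoop_conf_get is a hypothetical helper name.

```shell
# hadoop_conf_get FILE NAME: print the <value> that follows <name>NAME</name>.
# Returns non-zero if the property is not found.
hadoop_conf_get() {
    awk -v name="$2" '
        # Remember whether the most recent <name> element matches NAME.
        /<name>/ {
            n = $0
            sub(/.*<name>/, "", n)
            sub(/<\/name>.*/, "", n)
            hit = (n == name)
        }
        # The first <value> after a matching <name> is the answer.
        hit && /<value>/ {
            v = $0
            sub(/.*<value>/, "", v)
            sub(/<\/value>.*/, "", v)
            print v
            found = 1
            exit
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}
```

For example, `hadoop_conf_get /etc/hadoop/conf/hdfs-site.xml dfs.namenode.http-address` should print the address configured above when run on the Master.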

Now let's deploy MRv1 (MapReduce version 1). Open the “mapred-site.xml” file and set the following values as shown.

 cp hdfs-site.xml mapred-site.xml
 vi mapred-site.xml
 cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
 <name>mapred.job.tracker</name>
 <value>master:8021</value>
</property>
</configuration>

Next, copy the “mapred-site.xml” file to the node machine using the following scp command.

 scp /etc/hadoop/conf/mapred-site.xml node:/etc/hadoop/conf/
mapred-site.xml                                                                      100%  200     0.2KB/s   00:00

Now configure the local storage directories to be used by the MRv1 daemons. Open the “mapred-site.xml” file again and make the changes shown below for each TaskTracker.

<property>
 <name>mapred.local.dir</name>
 <value>/data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local</value>
</property>

After specifying these directories in the “mapred-site.xml” file, you must create them and assign the correct file permissions to them on each node in your cluster.

mkdir -p /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local
chown -R mapred:hadoop /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local

Now run the following command to start HDFS on every node in the cluster.

 for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done

You need to create /tmp in HDFS with the proper permissions, as shown below.

 sudo -u hdfs hadoop fs -mkdir /tmp
 sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
 sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
 sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
 sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred

Now verify the HDFS file structure.

 sudo -u hdfs hadoop fs -ls -R /

drwxrwxrwt   - hdfs   hadoop          0 2014-05-29 09:58 /tmp
drwxr-xr-x   - hdfs   hadoop          0 2014-05-29 09:59 /var
drwxr-xr-x   - hdfs   hadoop          0 2014-05-29 09:59 /var/lib
drwxr-xr-x   - hdfs   hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs
drwxr-xr-x   - hdfs   hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache
drwxr-xr-x   - mapred hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x   - mapred hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt   - mapred hadoop          0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
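
The listing can also be checked mechanically, for instance to confirm the sticky bit (the trailing t) on /tmp and on the staging directory. This is a small sketch that reads the listing from standard input; has_mode is a hypothetical helper name, and it assumes the path is the last column of each line, as in the output above.

```shell
# has_mode MODE PATH: read `hadoop fs -ls -R /` style output on stdin and
# succeed only if PATH is listed with exactly the permission string MODE.
has_mode() {
    awk -v mode="$1" -v path="$2" '
        $NF == path { seen = 1; ok = ($1 == mode) }
        END { exit (seen && ok) ? 0 : 1 }
    '
}
```

For example, `sudo -u hdfs hadoop fs -ls -R / | has_mode drwxrwxrwt /tmp` returns success only when /tmp is present with mode 1777.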

After you start HDFS and create “/tmp”, but before you start the JobTracker, create the HDFS directory specified by the “mapred.system.dir” parameter (by default ${hadoop.tmp.dir}/mapred/system) and change its owner to mapred.

 sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system
 sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system

To start MapReduce, start the TT (TaskTracker) and JT (JobTracker) services.

 service hadoop-0.20-mapreduce-tasktracker start

Starting Tasktracker:                               [  OK  ]
starting tasktracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-tasktracker-node.out
 service hadoop-0.20-mapreduce-jobtracker start

Starting Jobtracker:                                [  OK  ]

starting jobtracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-jobtracker-master.out

Next, create a home directory for each Hadoop user. It is recommended that you do this on the NameNode; for example:

 sudo -u hdfs hadoop fs -mkdir  /user/<user>
 sudo -u hdfs hadoop fs -chown <user> /user/<user>

Note: where <user> is the Linux username of each user.

Alternatively, you can create the home directory as follows.

 sudo -u hdfs hadoop fs -mkdir /user/$USER
 sudo -u hdfs hadoop fs -chown $USER /user/$USER

Open your browser and type the URL http://ip_address_of_namenode:50070 to access the Namenode.

Open another tab in your browser and type the URL http://ip_address_of_jobtracker:50030 to access the JobTracker.
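
On a server without a browser, the same two pages can be probed from the command line. This is a rough sketch that assumes curl is installed and that the ports are the ones used above; ui_url and check_ui are hypothetical helper names.

```shell
# ui_url DAEMON HOST: build the web UI URL for "namenode" or "jobtracker".
ui_url() {
    case "$1" in
        namenode)   echo "http://$2:50070/" ;;
        jobtracker) echo "http://$2:50030/" ;;
        *) echo "unknown daemon: $1" >&2; return 1 ;;
    esac
}

# check_ui DAEMON HOST: print the HTTP status code returned by the UI,
# giving up after five seconds if the host does not answer.
check_ui() {
    url=$(ui_url "$1" "$2") || return 1
    curl -sS -o /dev/null --max-time 5 -w '%{http_code}\n' "$url"
}
```

For example, `check_ui namenode 172.21.17.175` prints the status code of the NameNode page on the master used in this article.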

This procedure has been successfully tested on RHEL/CentOS 5.X/6.X. If you face any issues with the installation, please comment below and I will help you out with solutions.