Install Hadoop Multinode Cluster Using CDH4 in RHEL/CentOS 6.5
Hadoop is an open source programming framework developed by Apache for processing big data. It uses HDFS (Hadoop Distributed File System) to store data in a distributed manner across all the DataNodes in the cluster, and the MapReduce model to process that data.
NameNode (NN) is the master daemon that controls HDFS, and JobTracker (JT) is the master daemon for the MapReduce engine.
In this tutorial I am using two CentOS 6.3 VMs, "master" and "node" (master and node are my hostnames). The master IP is 172.21.17.175 and the node IP is 172.21.17.188. The following instructions also work on RHEL/CentOS 6.x versions.
On master:
# hostname
master
# ifconfig | grep 'inet addr' | head -1
inet addr:172.21.17.175 Bcast:172.21.19.255 Mask:255.255.252.0

On node:
# hostname
node
# ifconfig | grep 'inet addr' | head -1
inet addr:172.21.17.188 Bcast:172.21.19.255 Mask:255.255.252.0
First, if you do not have DNS set up, make sure that all the cluster hosts are listed in the "/etc/hosts" file on each node.
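# cat /etc/hosts
172.21.17.175 master
172.21.17.188 node

Both master and node need these two entries. Other hosts listed in the file (such as qabox at 172.21.17.197 or ansible-ground at 172.21.17.176 on my node) do not matter to the cluster.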
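The entries can be checked with a small shell function before going further. This is only a sketch: check_hosts is a hypothetical helper written for this tutorial, not part of any package, and it inspects a hosts-format file rather than querying DNS. The host names master and node are the ones used above.

```shell
#!/bin/sh
# check_hosts FILE NAME... : hypothetical helper for this tutorial.
# Prints every NAME that has no entry in FILE (hosts format) and
# returns non-zero if any name is missing.
check_hosts() {
    file=$1
    shift
    missing=0
    for name in "$@"; do
        # Skip comment lines, then look for NAME in any column after the IP.
        if ! awk -v n="$name" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }' "$file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}

# Example, to be run on every node of the cluster:
# check_hosts /etc/hosts master node
```

The function prints nothing when every name is found, so it can be dropped into a pre-install script and tested with a plain if statement.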
Installing Hadoop Multinode Cluster in CentOS
We use the official CDH repository to install CDH4 on all the hosts (master and node) in the cluster.
Go to the official CDH download page and grab the CDH4 (i.e. 4.6) version, or use the following wget command to download the repository package and install it.
## on 32-bit System ##
# wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/i386/cloudera-cdh-4-0.i386.rpm
# yum --nogpgcheck localinstall cloudera-cdh-4-0.i386.rpm

## on 64-bit System ##
# wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
# yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm
Before installing the Hadoop multinode cluster, add the Cloudera Public GPG Key to your repository by running one of the following commands, according to your system architecture.
## on 32-bit System ##
# rpm --import http://archive.cloudera.com/cdh4/redhat/6/i386/cdh/RPM-GPG-KEY-cloudera

## on 64-bit System ##
# rpm --import http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
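When the same steps are scripted for several hosts, the 32-bit or 64-bit choice can be derived from the output of uname -m. The sketch below only builds the URLs already shown above. The function names cdh4_arch, cdh4_key_url and cdh4_rpm_url are hypothetical helpers, and the mapping of i486/i586/i686 to the i386 repository is an assumption about how 32-bit machines report themselves.

```shell
#!/bin/sh
# cdh4_arch MACHINE : map `uname -m` output to the directory name used
# in the repository URLs (i386 or x86_64). Hypothetical helper.
cdh4_arch() {
    case $1 in
        x86_64)              echo x86_64 ;;
        i386|i486|i586|i686) echo i386 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# cdh4_key_url MACHINE : print the GPG key URL for that architecture.
cdh4_key_url() {
    arch=$(cdh4_arch "$1") || return 1
    echo "http://archive.cloudera.com/cdh4/redhat/6/$arch/cdh/RPM-GPG-KEY-cloudera"
}

# cdh4_rpm_url MACHINE : print the one-click-install RPM URL.
cdh4_rpm_url() {
    arch=$(cdh4_arch "$1") || return 1
    echo "http://archive.cloudera.com/cdh4/one-click-install/redhat/6/$arch/cloudera-cdh-4-0.$arch.rpm"
}

# Example (as root):
# rpm --import "$(cdh4_key_url "$(uname -m)")"
```

Keeping the architecture mapping in one function means an unsupported machine type fails early instead of producing a URL that does not exist.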
Next, run the following commands on the master server to install and set up the JobTracker and NameNode.
# yum clean all
# yum install hadoop-0.20-mapreduce-jobtracker

# yum clean all
# yum install hadoop-hdfs-namenode
Again on the master server, run the following commands to set up the Secondary NameNode.
# yum clean all
# yum install hadoop-hdfs-secondarynamenode
Next, set up the TaskTracker and DataNode on all cluster hosts except the JobTracker, NameNode, and Secondary (or Standby) NameNode hosts (in this case, on node).
# yum clean all
# yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode
You can install the Hadoop client on a separate machine (in this case I installed it on the DataNode, but any machine will do).
# yum install hadoop-client
Now that the above steps are done, let's move on to deploying HDFS (to be done on all the nodes).
Copy the default configuration to a new directory under /etc/hadoop (on each node in the cluster).
# cp -r /etc/hadoop/conf.dist /etc/hadoop/conf.my_cluster
Use the alternatives command to point Hadoop at your custom directory, as follows (on each node in the cluster).
# alternatives --verbose --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.my_cluster 50
reading /var/lib/alternatives/hadoop-conf
# alternatives --set hadoop-conf /etc/hadoop/conf.my_cluster
Now open the "core-site.xml" file and update "fs.defaultFS" on each node in the cluster.
# cat /etc/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master/</value>
  </property>
</configuration>
Next, update "dfs.permissions.superusergroup" in hdfs-site.xml on each node in the cluster.
# cat /etc/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/var/lib/hadoop-hdfs/cache/hdfs/dfs/name</value>
  </property>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>
</configuration>
Note: Make sure the above configuration is present on all the nodes (do it on one node and use scp to copy it to the rest of the nodes).
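Copying the files by hand gets tedious once there are several of them, so the scp step can be wrapped in a small loop. The sketch below assumes the files live in /etc/hadoop/conf on both hosts and that you can scp to the target host as the current user. The push_conf function and the DRY_RUN variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# push_conf NODE FILE... : copy configuration FILEs from /etc/hadoop/conf
# on this host to the same directory on NODE.
# Hypothetical helper; set DRY_RUN=1 to print the scp commands instead of
# running them.
push_conf() {
    node=$1
    shift
    for f in "$@"; do
        src=/etc/hadoop/conf/$f
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp $src $node:/etc/hadoop/conf/"
        else
            scp "$src" "$node:/etc/hadoop/conf/" || return 1
        fi
    done
}

# Example, run on master (node is the second host in this tutorial):
# push_conf node core-site.xml hdfs-site.xml
```

Running it once with DRY_RUN=1 shows exactly which commands would be executed before anything is overwritten on the other host.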
Update "dfs.namenode.name.dir" ("dfs.name.dir" is its older name) in "hdfs-site.xml" on the NameNode (master), and "dfs.datanode.data.dir" on the DataNode (node). Change the values as shown.
On master:
# cat /etc/hadoop/conf/hdfs-site.xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/1/dfs/nn,/nfsmount/dfs/nn</value>
</property>

On node:
# cat /etc/hadoop/conf/hdfs-site.xml
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>
Run the commands below to create the directory structure and set ownership and permissions on the NameNode (master) and DataNode (node) machines. Each host only needs the directories it uses, and they must be owned by the hdfs user.
On master:
# mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn
# chmod 700 /data/1/dfs/nn /nfsmount/dfs/nn
# chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn

On node:
# mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
# chown -R hdfs:hdfs /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
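The directories created on disk have to match the comma-separated list in hdfs-site.xml, and it is easy for the two to drift apart. One way to keep them in step is to derive the paths from the configured value itself. This is a sketch only: dirs_from_value is a hypothetical helper, and it assumes the value uses either plain paths or the file:// prefix, as in the listings above.

```shell
#!/bin/sh
# dirs_from_value VALUE : turn a comma-separated directory list, as used in
# dfs.namenode.name.dir or dfs.datanode.data.dir, into one local path per
# line, dropping leading whitespace and an optional file:// prefix.
# Hypothetical helper for this tutorial.
dirs_from_value() {
    printf '%s\n' "$1" | tr ',' '\n' |
        sed -e 's|^[[:space:]]*||' -e 's|^file://||'
}

# Example, run as root on the DataNode:
# for d in $(dirs_from_value "file:///data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn"); do
#     mkdir -p "$d" && chown -R hdfs:hdfs "$d"
# done
```

The example loop relies on the paths containing no spaces, which holds for the directories used in this tutorial.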
Format the NameNode (on master) by issuing the following command.
# sudo -u hdfs hdfs namenode -format
Add the following property to the hdfs-site.xml file on master and replace the value as shown.
<property>
  <name>dfs.namenode.http-address</name>
  <value>172.21.17.175:50070</value>
  <description>
    The address and port on which the NameNode UI will listen.
  </description>
</property>
Note: In our case the value should be the IP address of the master VM.
Now let's deploy MRv1 (MapReduce version 1). Create the "mapred-site.xml" file and set the values as shown. The commands below use relative paths, so they assume the current directory is /etc/hadoop/conf.
# cp hdfs-site.xml mapred-site.xml
# vi mapred-site.xml
# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:8021</value>
  </property>
</configuration>
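Since the same property has to carry the same value on every host, it helps to be able to read a single value back out of a configuration file. The function below is a rough sketch using only tr and sed. get_property is a hypothetical helper, and it assumes the usual Hadoop layout in which each <name> element is followed directly by its <value> and values contain no nested markup. It is not a general XML parser.

```shell
#!/bin/sh
# get_property FILE NAME : print the <value> of property NAME from a
# Hadoop *-site.xml file, or nothing if the property is absent.
# Hypothetical helper; assumes <name> is immediately followed by <value>.
get_property() {
    # Escape dots so that NAME is matched literally by sed.
    name=$(printf '%s' "$2" | sed 's/\./\\./g')
    # Join the file into one line, then split it again after every
    # </property> so that each line holds at most one property.
    tr '\n' ' ' < "$1" |
        sed 's|</property>|&\
|g' |
        sed -n "s|.*<name>[[:space:]]*$name[[:space:]]*</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Example, on master and on node:
# get_property /etc/hadoop/conf/mapred-site.xml mapred.job.tracker
```

For a file matching the listing above, the example prints master:8021. Comparing the output on both hosts is a quick check that the copy step worked.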
Next, copy the "mapred-site.xml" file to the node machine using the following scp command.
# scp /etc/hadoop/conf/mapred-site.xml node:/etc/hadoop/conf/
mapred-site.xml 100% 200 0.2KB/s 00:00
Now configure the local storage directories used by the MRv1 daemons. Open the "mapred-site.xml" file again and make the changes shown below for each TaskTracker.
<property>
  <name>mapred.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local</value>
</property>
After specifying these directories in the "mapred-site.xml" file, you must create the directories and assign the correct file permissions to them on each node in your cluster.
# mkdir -p /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local
# chown -R mapred:hadoop /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local
Now run the following command to start HDFS on every node in the cluster.
# for x in $(cd /etc/init.d; ls hadoop-hdfs-*); do sudo service "$x" start; done
HDFS needs a /tmp directory with the proper permissions, as shown below.
# sudo -u hdfs hadoop fs -mkdir /tmp
# sudo -u hdfs hadoop fs -chmod -R 1777 /tmp

Then create the MapReduce staging directory under /var and give it to the mapred user.
# sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
# sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
# sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
Now verify the HDFS file structure.
# sudo -u hdfs hadoop fs -ls -R /
drwxrwxrwt - hdfs hadoop 0 2014-05-29 09:58 /tmp
drwxr-xr-x - hdfs hadoop 0 2014-05-29 09:59 /var
drwxr-xr-x - hdfs hadoop 0 2014-05-29 09:59 /var/lib
drwxr-xr-x - hdfs hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs
drwxr-xr-x - hdfs hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache
drwxr-xr-x - mapred hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x - mapred hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt - mapred hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
After you start HDFS and create /tmp, but before you start the JobTracker, create the HDFS directory specified by the "mapred.system.dir" parameter (by default ${hadoop.tmp.dir}/mapred/system) and change its owner to mapred. The -p option creates the missing parent directory /tmp/mapred.
# sudo -u hdfs hadoop fs -mkdir -p /tmp/mapred/system
# sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system
To start MapReduce, start the TaskTracker (TT) and JobTracker (JT) services.
On node:
# service hadoop-0.20-mapreduce-tasktracker start
Starting Tasktracker: [ OK ]
starting tasktracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-tasktracker-node.out

On master:
# service hadoop-0.20-mapreduce-jobtracker start
Starting Jobtracker: [ OK ]
starting jobtracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-jobtracker-master.out
Next, create a home directory in HDFS for each Hadoop user. It is recommended that you do this on the NameNode, for example:
# sudo -u hdfs hadoop fs -mkdir -p /user/<user>
# sudo -u hdfs hadoop fs -chown <user> /user/<user>
Note: <user> is the Linux username of each user.
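Alternatively, you can create the home directory as follows. $USER is expanded by your own shell, so this creates the directory for the user who runs the commands.
# sudo -u hdfs hadoop fs -mkdir -p /user/$USER
# sudo -u hdfs hadoop fs -chown $USER /user/$USER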
Open your browser and go to http://ip_address_of_namenode:50070 to access the NameNode.
Open another tab in your browser and go to http://ip_address_of_jobtracker:50030 to access the JobTracker.
This procedure has been successfully tested on RHEL/CentOS 5.X/6.X. If you face any issues with the installation, please comment below and I will help you out with solutions.