Hadoop HA Architecture Installation and Deployment (QJM HA)

Published: 2019-06-26

###################HDFS High Availability Using the Quorum Journal Manager################################

Cluster layout:

db01            db02            db03            db04            db05

namenode        namenode

journalnode     journalnode     journalnode

datanode        datanode        datanode        datanode        datanode

Edit core-site.xml (fs.defaultFS points at the logical nameservice ns1 instead of a single namenode address; fs.trash.interval is the trash retention time in minutes):

[hadoop@db01 hadoop]$ cat core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

   

  Unless required by applicable law or agreed to in writing, software

  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
            <name>fs.defaultFS</name>
            <value>hdfs://ns1</value>
    </property>

        <property>

                <name>hadoop.tmp.dir</name>
                <value>/usr/local/hadoop-2.5.0/data/tmp</value>
        </property>

        <property>

                <name>fs.trash.interval</name>
                <value>7000</value>
        </property>
</configuration>

Edit hdfs-site.xml (this defines the ns1 nameservice, its two namenodes nn1/nn2, the shared JournalNode edits directory, the client failover proxy provider, and ssh fencing):

[hadoop@db01 hadoop]$ cat hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

   

  Unless required by applicable law or agreed to in writing, software

  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

        <property>

               <name>dfs.nameservices</name>
               <value>ns1</value>
        </property>

    <property>

        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>

    <property>

        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>db01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>db02:8020</value>
    </property>

    <property>

        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>db01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>db02:50070</value>
    </property>

    <property>

        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://db01:8485;db02:8485;db03:8485/ns1</value>
    </property>
        <property>
                <name>dfs.journalnode.edits.dir</name>
                <value>/usr/local/hadoop-2.5.0/data/dfs/jn</value>
        </property>

    <property>

        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <property>

        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_dsa</value>
    </property>

</configuration>
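As a quick sanity check (a sketch, assuming the install path used throughout this post), you can ask the HDFS client how it resolves the ns1 nameservice once the file is in place:

```shell
# Sketch: verify how the client resolves the ns1 nameservice.
HADOOP_HOME=/usr/local/hadoop-2.5.0
"$HADOOP_HOME/bin/hdfs" getconf -confKey dfs.ha.namenodes.ns1 \
    || echo "WARN: hdfs client not available here"   # should print nn1,nn2
"$HADOOP_HOME/bin/hdfs" getconf -nnRpcAddresses \
    || echo "WARN: hdfs client not available here"   # should list db01:8020 and db02:8020
```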

Sync the config files to the other nodes:

[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db02:/usr/local/hadoop-2.5.0/etc/hadoop/

core-site.xml                                                                                                                                                           100% 1140     1.1KB/s   00:00   
hdfs-site.xml                                                                                                                                                           100% 2067     2.0KB/s   00:00   
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db03:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                                                                                                                                                           100% 1140     1.1KB/s   00:00   
hdfs-site.xml                                                                                                                                                           100% 2067     2.0KB/s   00:00   
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db04:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                                                                                                                                                           100% 1140     1.1KB/s   00:00   
hdfs-site.xml                                                                                                                                                           100% 2067     2.0KB/s   00:00   
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db05:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                                                                                                                                                           100% 1140     1.1KB/s   00:00   
hdfs-site.xml                     
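The four scp invocations above can be collapsed into one loop; a minimal sketch, assuming the same install path on every node:

```shell
# Sketch: push both HA config files to every other node in one loop.
conf=/usr/local/hadoop-2.5.0/etc/hadoop
hosts="db02 db03 db04 db05"
for h in $hosts; do
    scp "$conf/core-site.xml" "$conf/hdfs-site.xml" "hadoop@$h:$conf/" \
        || echo "WARN: copy to $h failed"
done
```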

Start the cluster:

1) Start the journalnode service on db01, db02 and db03

[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start journalnode

starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
738 Jps
688 JournalNode

[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start journalnode

starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db02.out
[hadoop@db02 hadoop-2.5.0]$ jps
16813 Jps
16763 JournalNode

Start the journalnode on db03 in the same way:

[hadoop@db03 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start journalnode

2) Format the HDFS filesystem (on nn1 only; the journalnodes must already be running)

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs namenode -format

Start the namenode on nn1:

[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode

starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db01.out

3) On nn2, sync the metadata from nn1 (alternatively, you can copy the metadata directory over directly)

[hadoop@db02 hadoop-2.5.0]$ bin/hdfs namenode -bootstrapStandby

4) Start the namenode service on nn2

[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode

starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db02.out

5) Start the datanode service on all five nodes

[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode

starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
1255 DataNode
1001 NameNode
1339 Jps
688 JournalNode

[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode

starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db02.out
[hadoop@db02 hadoop-2.5.0]$ jps
17112 DataNode
17193 Jps
16763 JournalNode
16956 NameNode

[hadoop@db03 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode

starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db03.out
[hadoop@db03 hadoop-2.5.0]$ jps
15813 JournalNode
15995 Jps
15921 DataNode

[hadoop@db04 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode

starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db04.out
[hadoop@db04 hadoop-2.5.0]$ jps
14660 DataNode
14734 Jps

[hadoop@db05 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode

starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db05.out
[hadoop@db05 hadoop-2.5.0]$ jps
22165 DataNode
22239 Jps
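Logging in to each node gets tedious; the same result can be had from db01 over ssh (a sketch, assuming passwordless ssh for the hadoop user). Hadoop also ships sbin/hadoop-daemons.sh (plural), which runs a daemon command on every host listed in the slaves file.

```shell
# Sketch: start the datanode on every node from db01 via ssh.
HADOOP_HOME=/usr/local/hadoop-2.5.0
nodes="db01 db02 db03 db04 db05"
for h in $nodes; do
    ssh "hadoop@$h" "$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode" \
        || echo "WARN: could not start datanode on $h"
done
```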

6) Transition nn1 to the active state

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToActive nn1

17/03/12 03:04:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1

17/03/12 03:06:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 03:06:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$
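The recurring NativeCodeLoader warning in these transcripts only means the native libhadoop bindings were not built for this platform, so HDFS falls back to the pure-Java implementations; it is harmless. If the noise bothers you, one option is to raise that logger's threshold in etc/hadoop/log4j.properties:

```properties
# Silence the harmless native-library warning
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
```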

Test the HDFS filesystem:

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -mkdir -p /user/hadoop/conf

17/03/12 03:14:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@db01 hadoop-2.5.0]$

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -ls -R /
17/03/12 03:14:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:11 /user
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:14 /user/hadoop
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:14 /user/hadoop/conf
[hadoop@db01 hadoop-2.5.0]$
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -put etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml /user/hadoop/conf/
17/03/12 03:14:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -ls -R /
17/03/12 03:15:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:11 /user
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:14 /user/hadoop
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:14 /user/hadoop/conf
-rw-r--r--   3 hadoop supergroup       1140 2017-03-12 03:14 /user/hadoop/conf/core-site.xml
-rw-r--r--   3 hadoop supergroup       2061 2017-03-12 03:14 /user/hadoop/conf/hdfs-site.xml
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -text /user/hadoop/conf/core-site.xml
17/03/12 03:16:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

   

  Unless required by applicable law or agreed to in writing, software

  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
            <name>fs.defaultFS</name>
            <value>hdfs://ns1</value>
    </property>

        <property>

                <name>hadoop.tmp.dir</name>
                <value>/usr/local/hadoop-2.5.0/data/tmp</value>
        </property>

        <property>

                <name>fs.trash.interval</name>
                <value>7000</value>
        </property>
</configuration>

Manually switch the active and standby roles and verify:

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToStandby nn1

17/03/12 03:20:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToActive nn2

17/03/12 03:20:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1

17/03/12 03:21:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 03:21:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -text /user/hadoop/conf/core-site.xml
17/03/12 03:22:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

   

  Unless required by applicable law or agreed to in writing, software

  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
            <name>fs.defaultFS</name>
            <value>hdfs://ns1</value>
    </property>

        <property>

                <name>hadoop.tmp.dir</name>
                <value>/usr/local/hadoop-2.5.0/data/tmp</value>
        </property>

        <property>

                <name>fs.trash.interval</name>
                <value>7000</value>
        </property>
</configuration>

Enabling automatic failover for Hadoop HA with ZooKeeper

Add the following to hdfs-site.xml:

<property>

   <name>dfs.ha.automatic-failover.enabled</name>
   <value>true</value>
</property>

Add the following to core-site.xml:

<property>

   <name>ha.zookeeper.quorum</name>
   <value>db01:2181,db02:2181,db03:2181,db04:2181,db05:2181</value>
</property>

Stop the HDFS cluster and sync the updated config files:

[hadoop@db01 hadoop-2.5.0]$ sbin/stop-dfs.sh

17/03/12 13:07:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [db01 db02]
db01: stopping namenode
db02: stopping namenode
db01: stopping datanode
db02: stopping datanode
db05: stopping datanode
db04: stopping datanode
db03: stopping datanode
Stopping journal nodes [db01 db02 db03]
db01: stopping journalnode
db03: stopping journalnode
db02: stopping journalnode
17/03/12 13:07:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping ZK Failover Controllers on NN hosts [db01 db02]
db01: no zkfc to stop
db02: no zkfc to stop

[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db02:/usr/local/hadoop-2.5.0/etc/hadoop/

core-site.xml                                                                                                                                                           100% 1269     1.2KB/s   00:00   
hdfs-site.xml                                                                                                                                                           100% 2158     2.1KB/s   00:00   
[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db03:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                                                                                                                                                           100% 1269     1.2KB/s   00:00   
hdfs-site.xml                                                                                                                                                           100% 2158     2.1KB/s   00:00   
[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db04:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                                                                                                                                                           100% 1269     1.2KB/s   00:00   
hdfs-site.xml                                                                                                                                                           100% 2158     2.1KB/s   00:00   
[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db05:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                                                                                                                                                           100% 1269     1.2KB/s   00:00   
hdfs-site.xml                                                                                                                                                           100% 2158     2.1KB/s   00:00

Start the ZooKeeper cluster on all five nodes:

[hadoop@db01 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start

JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db01 hadoop-2.5.0]$ jps
10341 Jps
10319 QuorumPeerMain

[hadoop@db02 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start

JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db02 hadoop-2.5.0]$ jps
22296 QuorumPeerMain
22320 Jps

[hadoop@db03 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start

JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db03 hadoop-2.5.0]$ jps
17290 QuorumPeerMain
17325 Jps

[hadoop@db04 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start

JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db04 hadoop-2.5.0]$ jps
15908 Jps
15877 QuorumPeerMain

[hadoop@db05 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start

JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db05 hadoop-2.5.0]$ jps
23412 Jps
23379 QuorumPeerMain
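With all five peers started, it's worth confirming the quorum formed and seeing which node was elected leader; a sketch, again assuming passwordless ssh:

```shell
# Sketch: check every ZooKeeper peer's status; one node should report
# "Mode: leader" and the rest "Mode: follower".
ZK_HOME=/usr/local/zookeeper-3.4.5
for h in db01 db02 db03 db04 db05; do
    echo "== $h =="
    ssh "hadoop@$h" "$ZK_HOME/bin/zkServer.sh status" \
        || echo "WARN: no status from $h"
done
```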

Initialize the HA znode in ZooKeeper from nn1 (the zkCli.sh listings show / before and after the format):

[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs zkfc -formatZK

[zk: localhost:2181(CONNECTED) 3] ls /

[hadoop-ha, zookeeper]

Start the HDFS cluster:

[hadoop@db01 hadoop-2.5.0]$ sbin/start-dfs.sh

17/03/12 13:19:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [db01 db02]
db01: starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db01.out
db02: starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db02.out
db01: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db01.out
db05: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db05.out
db02: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db02.out
db04: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db04.out
db03: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db03.out
Starting journal nodes [db01 db02 db03]
db02: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db02.out
db01: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db01.out
db03: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db03.out
17/03/12 13:19:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [db01 db02]
db02: starting zkfc, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-zkfc-db02.out
db01: starting zkfc, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-zkfc-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
8382 Jps
7931 DataNode
8125 JournalNode
32156 QuorumPeerMain
7816 NameNode
8315 DFSZKFailoverController

Test automatic failover:

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 13:51:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 13:51:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
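The HA state can also be read over HTTP from a namenode's JMX servlet (the FSNamesystem bean carries a tag.HAState field), which is handy for scripting; a sketch using the web UI addresses configured above:

```shell
# Sketch: read the HA state from the namenode's JMX servlet and grep out
# the tag.HAState field.
url='http://db01:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'
curl -s "$url" | grep -o '"tag.HAState" *: *"[a-z]*"' \
    || echo "WARN: could not reach db01:50070"
```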

[hadoop@db02 hadoop-2.5.0]$ jps

22296 QuorumPeerMain
22377 NameNode
22458 DataNode
22775 Jps
22691 DFSZKFailoverController
22553 JournalNode
[hadoop@db02 hadoop-2.5.0]$
[hadoop@db02 hadoop-2.5.0]$

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 14:23:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:23:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$

Kill the active namenode process on db02 (the PID comes from jps on db02):

[hadoop@db02 hadoop-2.5.0]$ kill -9 25121

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1

17/03/12 14:24:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:24:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/12 14:24:51 INFO ipc.Client: Retrying connect to server: db02/192.168.100.232:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From db01/192.168.100.231 to db02:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: 

--------------------------------------------------------------------------------

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 14:28:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:28:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$ jps
16276 Jps
15675 JournalNode
15363 NameNode
15871 DFSZKFailoverController
10319 QuorumPeerMain
15478 DataNode
[hadoop@db01 hadoop-2.5.0]$ kill -9 15363
[hadoop@db01 hadoop-2.5.0]$
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 14:28:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/12 14:28:49 INFO ipc.Client: Retrying connect to server: db01/192.168.100.231:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From db01/192.168.100.231 to db01:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: 
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:28:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
------------------------------------------------------------------------------------------------------

##############################Possible errors during ZK HA automatic failover##############################################################

2017-03-12 13:58:54,210 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN

java.net.ConnectException: Call From db01/192.168.100.231 to db02:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: 
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
        at org.apache.hadoop.ipc.Client.call(Client.java:1415)
        at org.apache.hadoop.ipc.Client.call(Client.java:1364)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:139)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:411)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:606)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:700)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
        at org.apache.hadoop.ipc.Client.call(Client.java:1382)
        ... 11 more

Root cause:

    The fencing configuration pointed at the wrong private-key file: my key is a dsa key, not rsa. Correcting dfs.ha.fencing.ssh.private-key-files fixed it.
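When sshfence misbehaves like this, its prerequisites can be checked by hand; a sketch run from one namenode against the other, assuming the key path configured in hdfs-site.xml above:

```shell
# Sketch: verify the fencing key exists and that passwordless ssh to the
# other namenode works non-interactively (as sshfence requires).
key=/home/hadoop/.ssh/id_dsa
ls -l "$key" || echo "WARN: key file missing"
ssh -i "$key" -o BatchMode=yes hadoop@db02 true \
    && echo "fencing ssh to db02 OK" \
    || echo "WARN: passwordless ssh to db02 failed"
```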

##############################Possible errors during ZK HA automatic failover##############################################################

Reprinted from: http://azqwl.baihongyu.com/
