Cluster plan

role          db01  db02  db03  db04  db05
namenode       x     x
journalnode    x     x     x
datanode       x     x     x     x     x
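The layout above assumes all five hostnames resolve from every node. A typical /etc/hosts fragment would look like the following; only db01's and db02's addresses appear in the logs later, so the last three addresses are illustrative:

```
192.168.100.231 db01
192.168.100.232 db02
192.168.100.233 db03
192.168.100.234 db04
192.168.100.235 db05
```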
Edit core-site.xml:
[hadoop@db01 hadoop]$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0. See accompanying LICENSE file. -->
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop-2.5.0/data/tmp</value>
    </property>
    <property>
        <name>fs.trash.interval</name>
        <value>7000</value>
    </property>
</configuration>

Note that fs.trash.interval is measured in minutes, so 7000 keeps deleted files in the trash for roughly five days.

Edit hdfs-site.xml:
[hadoop@db01 hadoop]$ cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0. See accompanying LICENSE file. -->
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>db01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>db02:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>db01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>db02:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://db01:8485;db02:8485;db03:8485/ns1</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/usr/local/hadoop-2.5.0/data/dfs/jn</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_dsa</value>
    </property>
</configuration>
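Before copying the files out to the other nodes, it can be worth sanity-checking that every HA key the clients need is actually present. The following is a minimal Python sketch (not part of the original procedure) that parses an hdfs-site.xml-style document and checks for the nn1/nn2 RPC addresses and the qjournal URI:

```python
import xml.etree.ElementTree as ET

def load_props(xml_text):
    """Parse a Hadoop *-site.xml string into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

def check_ha_config(props):
    """Return a list of problems found in an HA hdfs-site.xml dict."""
    problems = []
    ns = props.get("dfs.nameservices")
    if not ns:
        return ["dfs.nameservices is missing"]
    nns = props.get("dfs.ha.namenodes.%s" % ns, "").split(",")
    if len(nns) != 2:
        problems.append("expected exactly two namenodes, got %r" % nns)
    for nn in nns:
        key = "dfs.namenode.rpc-address.%s.%s" % (ns, nn)
        if key not in props:
            problems.append("%s is missing" % key)
    shared = props.get("dfs.namenode.shared.edits.dir", "")
    if not shared.startswith("qjournal://"):
        problems.append("shared edits dir is not a qjournal URI")
    return problems

sample = """<configuration>
  <property><name>dfs.nameservices</name><value>ns1</value></property>
  <property><name>dfs.ha.namenodes.ns1</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.ns1.nn1</name><value>db01:8020</value></property>
  <property><name>dfs.namenode.rpc-address.ns1.nn2</name><value>db02:8020</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://db01:8485;db02:8485;db03:8485/ns1</value></property>
</configuration>"""

print(check_ha_config(load_props(sample)))  # an empty list means no problems found
```

Running it against the real file (`load_props(open("etc/hadoop/hdfs-site.xml").read())`) would catch a missing or misspelled property before the cluster is restarted.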
Sync the files to the other nodes:

[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db02:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                 100% 1140   1.1KB/s   00:00
hdfs-site.xml                 100% 2067   2.0KB/s   00:00
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db03:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                 100% 1140   1.1KB/s   00:00
hdfs-site.xml                 100% 2067   2.0KB/s   00:00
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db04:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                 100% 1140   1.1KB/s   00:00
hdfs-site.xml                 100% 2067   2.0KB/s   00:00
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db05:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                 100% 1140   1.1KB/s   00:00
hdfs-site.xml

Start the cluster:
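Typing the same scp four times is error-prone. A small sketch that builds one scp command per target node; the host list and paths mirror the session above, and actually executing each command (e.g. with subprocess.run) is left to the caller since it needs real hosts:

```python
# Build the scp argv for each target node in the cluster plan.
FILES = ["core-site.xml", "hdfs-site.xml"]
DEST = "/usr/local/hadoop-2.5.0/etc/hadoop/"
TARGETS = ["db02", "db03", "db04", "db05"]

def scp_commands(files=FILES, targets=TARGETS, dest=DEST, user="hadoop"):
    """Return one scp argv list per target node."""
    return [["scp"] + files + ["%s@%s:%s" % (user, host, dest)]
            for host in targets]

for cmd in scp_commands():
    print(" ".join(cmd))
```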
1) Start the journalnode service

[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
738 Jps
688 JournalNode

[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db02.out
[hadoop@db02 hadoop-2.5.0]$ jps
16813 Jps
16763 JournalNode

Do the same on db03 so that all three journalnodes are running.

2) Format the HDFS filesystem
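The journalnodes must be up before the format, because the active NameNode writes every edit to the JournalNode quorum and only considers it durable once a strict majority acknowledges it; that is also why an odd count (here three) is used. The rule itself is simple:

```python
def quorum_majority(n_journalnodes):
    """Smallest number of acks that constitutes a strict majority."""
    return n_journalnodes // 2 + 1

def tolerable_failures(n_journalnodes):
    """How many JournalNodes can die while edit writes still succeed."""
    return n_journalnodes - quorum_majority(n_journalnodes)

print(quorum_majority(3), tolerable_failures(3))  # 2 1: three JNs survive one failure
print(quorum_majority(5), tolerable_failures(5))  # 3 2
```

So the db01/db02/db03 quorum in this setup keeps working with any single journalnode down, but not two.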
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs namenode -format
Start the namenode on nn1:
[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db01.out

3) On nn2, sync nn1's metadata (you could also simply cp the metadata directory over)
[hadoop@db02 hadoop-2.5.0]$ bin/hdfs namenode -bootstrapStandby
4) Start the namenode service on nn2
[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db02.out

5) Start all the datanode services

[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
1255 DataNode
1001 NameNode
1339 Jps
688 JournalNode

[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db02.out
[hadoop@db02 hadoop-2.5.0]$ jps
17112 DataNode
17193 Jps
16763 JournalNode
16956 NameNode

[hadoop@db03 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db03.out
[hadoop@db03 hadoop-2.5.0]$ jps
15813 JournalNode
15995 Jps
15921 DataNode

[hadoop@db04 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db04.out
[hadoop@db04 hadoop-2.5.0]$ jps
14660 DataNode
14734 Jps

[hadoop@db05 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db05.out
[hadoop@db05 hadoop-2.5.0]$ jps
22165 DataNode
22239 Jps

6) Switch nn1 to the active state
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToActive nn1
17/03/12 03:04:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
(every hdfs command below prints this same harmless warning; it is omitted from here on)
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
standby

Test the HDFS filesystem:

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -mkdir -p /user/hadoop/conf
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -ls -R /
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:11 /user
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:14 /user/hadoop
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:14 /user/hadoop/conf
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -put etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml /user/hadoop/conf/
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -ls -R /
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:11 /user
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:14 /user/hadoop
drwxr-xr-x   - hadoop supergroup          0 2017-03-12 03:14 /user/hadoop/conf
-rw-r--r--   3 hadoop supergroup       1140 2017-03-12 03:14 /user/hadoop/conf/core-site.xml
-rw-r--r--   3 hadoop supergroup       2061 2017-03-12 03:14 /user/hadoop/conf/hdfs-site.xml
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -text /user/hadoop/conf/core-site.xml
(prints the core-site.xml contents shown earlier)

Manually verify that active and standby can be swapped:

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToStandby nn1
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToActive nn2
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
standby
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -text /user/hadoop/conf/core-site.xml
(still prints the file contents: reads keep working after the switchover)

Use ZooKeeper to add automatic failover to the Hadoop HA setup
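Conceptually, each NameNode runs a ZKFC (ZooKeeper Failover Controller) that competes for an ephemeral lock znode: whichever ZKFC creates it first makes its NameNode active, and when the active node's ZooKeeper session dies, the lock vanishes and the other ZKFC wins the next election. A toy in-memory sketch of that election logic (a simplification, not the real ZKFC protocol):

```python
class ToyElector:
    """In-memory stand-in for the ZKFC's ephemeral-lock election.

    Real ZKFCs create an ephemeral znode under /hadoop-ha/<nameservice>;
    this toy just tracks which candidate currently holds the 'lock'.
    """
    def __init__(self):
        self.holder = None

    def try_become_active(self, candidate):
        """First candidate to ask wins; everyone else stays standby."""
        if self.holder is None:
            self.holder = candidate
        return self.holder == candidate

    def session_died(self, candidate):
        """When the active node's ZK session dies, the lock is released."""
        if self.holder == candidate:
            self.holder = None

elector = ToyElector()
print(elector.try_become_active("nn1"))  # True: nn1 becomes active
print(elector.try_become_active("nn2"))  # False: nn2 stays standby
elector.session_died("nn1")              # nn1's process is killed
print(elector.try_become_active("nn2"))  # True: nn2 takes over
```

The kill -9 tests at the end of this section exercise exactly this behavior on the real cluster.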
In hdfs-site.xml, add:

<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>

In core-site.xml, add:

<property>
    <name>ha.zookeeper.quorum</name>
    <value>db01:2181,db02:2181,db03:2181,db04:2181,db05:2181</value>
</property>

Stop the HDFS cluster and sync the files:
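The ha.zookeeper.quorum value is a comma-separated host:port list that the ZKFCs use to reach the ZooKeeper ensemble. A small sketch of how such a string breaks down (the default-port fallback is an assumption for illustration; 2181 is ZooKeeper's standard client port):

```python
def parse_zk_quorum(value):
    """Split an ha.zookeeper.quorum value into (host, port) pairs."""
    pairs = []
    for entry in value.split(","):
        host, _, port = entry.strip().partition(":")
        pairs.append((host, int(port) if port else 2181))  # 2181: ZK default
    return pairs

quorum = "db01:2181,db02:2181,db03:2181,db04:2181,db05:2181"
print(parse_zk_quorum(quorum))
```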
[hadoop@db01 hadoop-2.5.0]$ sbin/stop-dfs.sh
Stopping namenodes on [db01 db02]
db01: stopping namenode
db02: stopping namenode
db01: stopping datanode
db02: stopping datanode
db05: stopping datanode
db04: stopping datanode
db03: stopping datanode
Stopping journal nodes [db01 db02 db03]
db01: stopping journalnode
db03: stopping journalnode
db02: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [db01 db02]
db01: no zkfc to stop
db02: no zkfc to stop

[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db02:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml                 100% 1269   1.2KB/s   00:00
hdfs-site.xml                 100% 2158   2.1KB/s   00:00
(repeat the scp for db03, db04 and db05)

Start the ZooKeeper cluster:

[hadoop@db01 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db01 hadoop-2.5.0]$ jps
10341 Jps
10319 QuorumPeerMain

Do the same on db02 through db05; jps on each node should then show a QuorumPeerMain process.

Initialize the ZKFC state in ZooKeeper from the hadoop side:
Before formatting, the ZooKeeper root contains only the default znode:

[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs zkfc -formatZK

[zk: localhost:2181(CONNECTED) 3] ls /
[hadoop-ha, zookeeper]

The new /hadoop-ha znode is where each ZKFC keeps its election lock.

Start the HDFS cluster:
[hadoop@db01 hadoop-2.5.0]$ sbin/start-dfs.sh
Starting namenodes on [db01 db02]
db01: starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db01.out
db02: starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db02.out
db01: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db01.out
db05: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db05.out
db02: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db02.out
db04: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db04.out
db03: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db03.out
Starting journal nodes [db01 db02 db03]
db02: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db02.out
db01: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db01.out
db03: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db03.out
Starting ZK Failover Controllers on NN hosts [db01 db02]
db02: starting zkfc, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-zkfc-db02.out
db01: starting zkfc, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-zkfc-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
8382 Jps
7931 DataNode
8125 JournalNode
32156 QuorumPeerMain
7816 NameNode
8315 DFSZKFailoverController

Test the automatic failover:

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
standby
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
active
[hadoop@db02 hadoop-2.5.0]$ jps
22296 QuorumPeerMain
22377 NameNode
22458 DataNode
22775 Jps
22691 DFSZKFailoverController
22553 JournalNode

Kill the active namenode on db02 and watch nn1 take over:

[hadoop@db02 hadoop-2.5.0]$ kill -9 25121
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:24:51 INFO ipc.Client: Retrying connect to server: db02/192.168.100.232:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From db01/192.168.100.231 to db02:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:

nn1 automatically took over as active. After restarting the namenode on db02 (nn2 reports standby again), kill nn1 to confirm failover in the other direction:

[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
standby
[hadoop@db01 hadoop-2.5.0]$ jps
16276 Jps
15675 JournalNode
15363 NameNode
15871 DFSZKFailoverController
10319 QuorumPeerMain
15478 DataNode
[hadoop@db01 hadoop-2.5.0]$ kill -9 15363
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 14:28:49 INFO ipc.Client: Retrying connect to server: db01/192.168.100.231:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From db01/192.168.100.231 to db01:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
active

############################## A possible error during ZK HA automatic failover ##############################
2017-03-12 13:58:54,210 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
java.net.ConnectException: Call From db01/192.168.100.231 to db02:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1415)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:139)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:411)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:606)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:700)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
    at org.apache.hadoop.ipc.Client.call(Client.java:1382)
    ... 11 more

Root cause:
In my case the problem turned out to be the fencing key file in the HDFS config: I had pointed dfs.ha.fencing.ssh.private-key-files at the wrong file, since my key pair is DSA rather than RSA. After switching it to the correct key file, everything worked.
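A quick way to catch this class of mistake is to verify that the configured private-key file exists and to peek at its first line, since the PEM header names the key type. A hedged sketch (the header strings below cover the classic OpenSSH PEM formats; newer OpenSSH versions write a generic "OPENSSH PRIVATE KEY" header regardless of key type):

```python
import os

# Classic PEM headers for OpenSSH private keys.
KEY_HEADERS = {
    "-----BEGIN DSA PRIVATE KEY-----": "dsa",
    "-----BEGIN RSA PRIVATE KEY-----": "rsa",
    "-----BEGIN OPENSSH PRIVATE KEY-----": "openssh",
}

def key_type(path):
    """Return the key type named by the file's first line, or None."""
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        first = f.readline().strip()
    return KEY_HEADERS.get(first)

# On the cluster the configured path was /home/hadoop/.ssh/id_dsa, so
# key_type on it should report "dsa" for sshfence to work as intended.
```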
############################## A possible error during ZK HA automatic failover ##############################