CDH 4.7 with HA Hadoop NameNodes fails to start

Problem description

I have inherited a CDH 4.7 cluster. Last week someone decided it would be a good idea to apply kernel updates and then reboot all nodes at once (bang head here). This of course caused a huge problem: I can't start either the active or the standby NameNode. As the log below shows, the NameNode fast-forwards through the edit log streams and then simply fails. Any idea how I can go about fixing this?

Note that the cluster was never installed with CM (Cloudera Manager), but it has been running for years.

2020-08-16 10:34:23,310 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@16f68d93 expecting start txid #2516829933
2020-08-16 10:34:23,310 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: fast-forwarding stream 'http://hdpdq05a.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A1628828306%3A0%3ACID-d349672d-9bb8-4db1-8fbb-c6e9bf50a3f6,http://hdpdq05b.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A1628828306%3A0%3ACID-d349672d-9bb8-4db1-8fbb-c6e9bf50a3f6' to transaction ID 2516478939
2020-08-16 10:34:23,310 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: fast-forwarding stream 'http://hdpdq05a.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A1628828306%3A0%3ACID-d349672d-9bb8-4db1-8fbb-c6e9bf50a3f6' to transaction ID 2516478939
2020-08-16 10:34:23,322 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Edits file http://hdpdq05a.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A1628828306%3A0%3ACID-d349672d-9bb8-4db1-8fbb-c6e9bf50a3f6,http://hdpdq05b.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A1628828306%3A0%3ACID-d349672d-9bb8-4db1-8fbb-c6e9bf50a3f6 of size 1048576 edits # 407 loaded in 0 seconds
2020-08-16 10:34:23,393 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 530 entries 107888 lookups
2020-08-16 10:34:23,404 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 14078 msecs
2020-08-16 10:34:23,595 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2020-08-16 10:34:23,617 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean
2020-08-16 10:34:23,619 WARN org.apache.hadoop.hdfs.server.common.Util: Path /disk1/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2020-08-16 10:34:23,639 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.FileNotFoundException: File does not exist: /acs/prod/data/B/2020/08/12/1455/out/_logs/history/job_202008120816_0353_1597244146771_mapred_oozie%3Aaction[...]cs-B-wf1%3AA%3Dphase
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:4487)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setBlockTotal(FSNamesystem.java:4458)
