Primary election does not complete after the primary node is killed in a MongoDB cluster

Problem description

I am testing failover scenarios on a MongoDB replica set. When I stop the primary node, my Java application logs show no new primary being elected, and read/write operations are rejected with the following error:

No server chosen by ReadPreferenceServerSelector{readPreference=primary} from cluster description ClusterDescription{type=REPLICA_SET,connectionMode=MULTIPLE,serverDescriptions=[ServerDescription{address=mongo1:30001,type=UNKNOWN,state=CONNECTING,exception={com.mongodb.MongoSocketOpenException: Exception opening socket},caused by {java.net.ConnectException: Connection refused (Connection refused)}},ServerDescription{address=mongo2:30002,type=REPLICA_SET_SECONDARY,state=CONNECTED,ok=true,minWireVersion=0,maxWireVersion=8,maxDocumentSize=16777216,logicalSessionTimeoutMinutes=30,roundTripTimeNanos=3215664,setName='rs0',canonicalAddress=mongo2:30002,hosts=[mongo1:30001],passives=[mongo2:30002,mongo3:30003],arbiters=[],primary='null',tagSet=TagSet{[]},electionId=null,setVersion=1,lastWriteDate=Fri Mar 26 02:08:27 CET 2021,lastUpdateTimeNanos=91832460163658},ServerDescription{address=mongo3:30003,roundTripTimeNanos=3283858,canonicalAddress=mongo3:30003,lastUpdateTimeNanos=91832459878686}]}. Waiting for 30000 ms before timing out
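Independently of the driver, you can confirm from the shell whether an election actually happened. A minimal check, run on one of the surviving members (mongo2 or mongo3):

```javascript
// On a surviving member, print each member's reported replica set state.
// After a successful election, exactly one member should show "PRIMARY";
// if all remaining members show "SECONDARY", no election has completed.
rs.status().members.forEach(function (m) {
    print(m.name + " " + m.stateStr);
});
```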

I am using the following configuration:

var cfg = {
    "_id": "rs0",
    "protocolVersion": 1,
    "version": 1,
    "members": [
        { "_id": 0, "host": "mongo1:30001", "priority": 4 },
        { "_id": 1, "host": "mongo2:30002", "priority": 3 },
        { "_id": 2, "host": "mongo3:30003", "priority": 2 }
    ]
};
rs.initiate(cfg, { force: true });
rs.secondaryOk();
db.getMongo().setReadPref('primary');
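To see the priorities the replica set actually ended up with (rather than the ones written in `cfg`), you can inspect the live configuration from the shell; a minimal sketch, assuming a connection to any member:

```javascript
// Print the priority each member has in the *running* config.
// A member with priority 0 can never be elected primary.
rs.conf().members.forEach(function (m) {
    print(m.host + " priority=" + m.priority);
});
```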

rs.isMaster() returns:

{
    "hosts" : [
        "mongo1:30001"
    ],
    "passives" : [
        "mongo2:30002",
        "mongo3:30003"
    ],
    "setName" : "rs0",
    "setVersion" : 1,
    "ismaster" : true,
    "secondary" : false,
    "primary" : "mongo1:30001",
    "me" : "mongo1:30001",
    "electionId" : ObjectId("7fffffff0000000000000017"),
    "lastWrite" : {
        "opTime" : {
            "ts" : Timestamp(1616719738, 1),
            "t" : NumberLong(23)
        },
        "lastWriteDate" : ISODate("2021-03-26T00:48:58Z"),
        "majorityOpTime" : {
            "ts" : Timestamp(1616719738, 1),
            "t" : NumberLong(23)
        },
        "majorityWriteDate" : ISODate("2021-03-26T00:48:58Z")
    },
    "maxBsonObjectSize" : 16777216,
    "maxMessageSizeBytes" : 48000000,
    "maxWriteBatchSize" : 100000,
    "localTime" : ISODate("2021-03-26T00:49:08.019Z"),
    "logicalSessionTimeoutMinutes" : 30,
    "connectionId" : 28,
    "minWireVersion" : 0,
    "maxWireVersion" : 8,
    "readOnly" : false,
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1616719738, 1),
        "signature" : {
            "hash" : BinData(0, "/+QXGSyYY+M/OXbZ1UixjrDOVz4="),
            "keyId" : NumberLong("6942620613131370499")
        }
    },
    "operationTime" : Timestamp(1616719738, 1)
}

Here I can see that the hosts list contains only the primary, while the passives list contains the secondaries. I don't understand why the cluster setup did not put all nodes under hosts, leaving passives empty. The only relevant information I have found is that secondaries must not have a priority of 0, otherwise they are not considered candidates in a primary election.

Solution

From the docs:

isMaster.passives

An array of strings in the format "[hostname]:[port]" listing all replica set members whose members[n].priority is 0.

This field only appears if there is at least one member with a members[n].priority of 0.

Those nodes somehow ended up with a priority of 0, so they will never attempt to become primary.
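If rs.conf() confirms that the secondaries are running with priority 0, one way to recover is to reconfigure the set with explicit non-zero priorities. A sketch, run on the current primary (the member indexes and values below mirror the intended config above):

```javascript
// Restore the intended priorities so the secondaries become electable again.
var c = rs.conf();
c.members[1].priority = 3;   // mongo2:30002
c.members[2].priority = 2;   // mongo3:30003
rs.reconfig(c);              // must be run on the current primary
```

After the reconfig, rs.isMaster() should list all three members under hosts and the passives field should disappear.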