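Problem description
I installed a MongoDB cluster on Kubernetes using the Bitnami Helm chart: https://github.com/bitnami/charts/tree/master/bitnami/mongodb. You can see my edited production YAML file at https://pastebin.com/41rc3JC1
I have had this running for 2-3 weeks. I have noticed that when I run queries from the CLI, I occasionally get the following error: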
rs0:PRIMARY> db.streamers.find().pretty()
Error: error doing query: failed: network error while attempting to run command 'find' on host 'mongo.acme.com:27017'
I connect to mongo with the following command:
mongo mongodb://<redacted_username>:<redacted_password>@mongo.acme.com:27017
rs.status
{
"set" : "rs0","date" : ISODate("2020-09-16T11:09:40.136Z"),"myState" : 1,"term" : NumberLong(3),"syncingTo" : "","syncSourceHost" : "","syncSourceId" : -1,"heartbeatIntervalMillis" : NumberLong(2000),"majorityVoteCount" : 2,"writeMajorityCount" : 1,"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1600254574,1),"t" : NumberLong(3)
},"lastCommittedWallTime" : ISODate("2020-09-16T11:09:34.606Z"),"readConcernMajorityOpTime" : {
"ts" : Timestamp(1600254574,"readConcernMajorityWallTime" : ISODate("2020-09-16T11:09:34.606Z"),"appliedOpTime" : {
"ts" : Timestamp(1600254574,"durableOpTime" : {
"ts" : Timestamp(1600254574,"lastAppliedWallTime" : ISODate("2020-09-16T11:09:34.606Z"),"lastDurableWallTime" : ISODate("2020-09-16T11:09:34.606Z")
},"lastStableRecoveryTimestamp" : Timestamp(1600254574,"lastStableCheckpointTimestamp" : Timestamp(1600254574,"electionCandidateMetrics" : {
"lastElectionReason" : "electionTimeout","lastElectionDate" : ISODate("2020-09-16T10:27:44.421Z"),"electionTerm" : NumberLong(3),"lastCommittedOpTimeAtElection" : {
"ts" : Timestamp(1600241404,"t" : NumberLong(2)
},"lastSeenOpTimeAtElection" : {
"ts" : Timestamp(1600241404,"numVotesNeeded" : 2,"priorityAtElection" : 5,"electionTimeoutMillis" : NumberLong(10000),"numCatchUpOps" : NumberLong(0),"newTermStartDate" : ISODate("2020-09-16T10:27:44.439Z"),"wMajorityWriteAvailabilityDate" : ISODate("2020-09-16T10:27:44.502Z")
},"members" : [
{
"_id" : 0,"name" : "a211af6c97d4847519c8a859471a1846-<redacted>.us-east-2.elb.amazonaws.com:27017","health" : 1,"state" : 1,"stateStr" : "PRIMARY","uptime" : 3975838,"optime" : {
"ts" : Timestamp(1600254574,"t" : NumberLong(3)
},"optimeDate" : ISODate("2020-09-16T11:09:34Z"),"infoMessage" : "","electionTime" : Timestamp(1600252064,"electionDate" : ISODate("2020-09-16T10:27:44Z"),"configVersion" : 2,"self" : true,"lastHeartbeatMessage" : ""
},{
"_id" : 1,"name" : "mongo-prod-mongodb-arbiter-0.mongo-prod-mongodb-arbiter-headless.mongodb.svc.cluster.local:27017","state" : 7,"stateStr" : "ARBITER","uptime" : 2507,"lastHeartbeat" : ISODate("2020-09-16T11:09:39.159Z"),"lastHeartbeatRecv" : ISODate("2020-09-16T11:09:38.700Z"),"pingMs" : NumberLong(0),"lastHeartbeatMessage" : "","configVersion" : 2
}
],"ok" : 1,"$clusterTime" : {
"clusterTime" : Timestamp(1600254574,"signature" : {
"hash" : BinData(0,"7pmSYWIweERYYg/QjqIIjQrQmo4="),"keyId" : NumberLong("6855964983600087044")
}
},"operationTime" : Timestamp(1600254574,1)
}
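As you can see from the YAML, this MongoDB cluster is exposed externally through a load balancer created on AWS. I then set up a CNAME that points mongo.acme.com to the load balancer's URL.
Why do I occasionally get these network errors? Is this something I should be concerned about? Can you see anything I have misconfigured?
If you need more information from me, please ask.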
Solution
Posting this as a Community Wiki for better visibility. Note that this is only a possible root cause and has not been confirmed.
Errors such as:
failed: network error while attempting to run command 'XXX' on host 'mongo.acme.com:27017'
usually occur when there is a version mismatch between the mongo server and the mongo client shell.
Many examples can be found online, such as the whatismyuri command or saslStart.
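As a rough way to check for the mismatch described above, the sketch below compares two version strings, for example the server version reported by `db.version()` and the shell version reported by `version()` in the mongo shell. The helper names are hypothetical, and treating "same major.minor" as "compatible" is a simplifying assumption, not an official compatibility rule.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '4.2.8'."""
    parts = version.strip().lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def versions_compatible(server_version: str, shell_version: str) -> bool:
    """Heuristic (assumption): same major.minor release series on both sides."""
    return major_minor(server_version) == major_minor(shell_version)


if __name__ == "__main__":
    # Hypothetical values; substitute the output of db.version() and version().
    print(versions_compatible("4.2.8", "4.2.1"))
    print(versions_compatible("4.4.0", "3.6.3"))
```

If the two differ, installing a shell from the same release series as the server is the simplest thing to try first.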
The second most common root cause is trying to connect without SSL to a MongoDB server that requires it, as described in this troubleshoot guide.
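To test the SSL hypothesis, one option is to retry the same connection with TLS enabled in the connection string. The sketch below adds `tls=true` to a MongoDB URI. The function name `with_tls` is hypothetical, and the `tls` option assumes a 4.2 or newer shell or driver (older ones use `ssl=true` instead).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_tls(uri: str) -> str:
    """Return the URI with tls=true set, keeping any existing options."""
    parts = urlsplit(uri)
    options = dict(parse_qsl(parts.query))
    options["tls"] = "true"
    # A MongoDB URI needs a '/' between the host list and the '?' options.
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(options), parts.fragment)
    )


if __name__ == "__main__":
    # Credentials are omitted here; the host name is the one from the question.
    print(with_tls("mongodb://mongo.acme.com:27017"))
```

If the connection becomes stable with TLS enabled, the server most likely requires it. If nothing changes, this cause can probably be ruled out.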
The fix is to use the same version for the server and the mongo shell, and to check whether MongoDB requires an SSL connection.
However, if the problem is more complex, you will have to go through the logs carefully.