Parallel revert of virtual machines fails when many reverts are in progress (webvirtcloud)

Problem description

I am building a project on top of webvirtcloud (webvirtcloud uses libvirt-python).

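I have several threads, and each of them reverts a different virtual machine (VM) through the revertToSnapshot method. With 3 threads everything works. With more than 3 threads, revertToSnapshot fails in every thread after roughly 7 seconds with the exception libvirt: XML-RPC error: Cannot recv data: Connection reset by peer. Note that although the revertToSnapshot call fails, the revert itself keeps running and the VM does get reverted.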

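I tried opening 10 connections to 10 different VMs and then fetching each VM's state. The goal was to check whether I could hold 10 simultaneous connections. That worked fine.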
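
A check of this kind can be sketched roughly as follows. This is a minimal illustration, not the code that was actually run: check_states and open_connection are hypothetical names, and it assumes plain libvirt-python connection objects (lookupByName, state, close) rather than the webvirtcloud wrappers. Every connection stays open until all states have been read, so N connections exist at the same time.

```python
def check_states(open_connection, vm_names):
    """Hold one open connection per VM and return {vm_name: state}.

    open_connection is a hypothetical zero-argument callable that
    returns a libvirt-python connection object.
    """
    connections = []
    try:
        states = {}
        for name in vm_names:
            conn = open_connection()
            connections.append(conn)  # keep it open until the very end
            # virDomain.state() returns [state, reason]; keep the state.
            states[name] = conn.lookupByName(name).state()[0]
        return states
    finally:
        # Close everything, even if a lookup failed part-way through.
        for conn in connections:
            conn.close()
```

With libvirt-python installed, it could be called as check_states(lambda: libvirt.open(uri), names), where uri stands for whatever connection URI the host uses. A state of 1 corresponds to VIR_DOMAIN_RUNNING.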

I tried creating a new connection in each thread. The implementation looks like this:

import threading


def main():
    ...
    # Main-thread snippet: start a worker thread for one VM.
    t = threading.Thread(target=thread_job, args=(arg1, arg2))
    t.start()
    ...


def thread_job(arg1, arg2):
    ...
    # Each thread opens its own connection to the host.
    connection = wvmInstance(host, login, passwd, conn, vname)
    connection.snapshot_revert("snapshot_name")
    ...

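I also tried creating a single connection and sharing it across all threads (yes, according to the libvirt Python bindings documentation this is allowed). The implementation looks like this: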

import threading

# One connection to the host, shared by every thread.
vm_host_connection = wvmConnect(ip_addr, password, connection_type)


def main():
    ...
    # Main-thread snippet: start a worker thread for one VM.
    t = threading.Thread(target=thread_job)
    t.start()
    ...


def thread_job():
    ...
    # Reading the module-level connection needs no `global` statement.
    vir_domain = vm_host_connection.get_instance(instance_name)
    snapshot = vir_domain.snapshotLookupByName(snapshot_name, 0)
    vir_domain.revertToSnapshot(snapshot, 0)
    ...
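
To pin down where the threshold sits, it may help to run the reverts through a small harness that records how long each call took and which exception it raised. The sketch below is an illustration only: run_parallel, revert and max_parallel are hypothetical names, and revert stands for any callable that performs one revert (for example a wrapper around the snapshotLookupByName / revertToSnapshot calls above).

```python
import threading
import time


def run_parallel(revert, vm_names, max_parallel=None):
    """Call revert(name) for every VM, each in its own thread.

    Returns {name: (elapsed_seconds, exception_or_None)}.
    max_parallel, if given, caps how many reverts run at once.
    """
    gate = threading.Semaphore(max_parallel or len(vm_names))
    results = {}
    results_lock = threading.Lock()

    def job(name):
        with gate:  # wait here while max_parallel reverts are running
            started = time.monotonic()
            error = None
            try:
                revert(name)
            except Exception as exc:  # libvirt.libvirtError is an Exception
                error = exc
            elapsed = time.monotonic() - started
        with results_lock:
            results[name] = (elapsed, error)

    threads = [threading.Thread(target=job, args=(n,)) for n in vm_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Running it with max_parallel left unset reproduces the unbounded case. Passing max_parallel=3 keeps concurrency at the level reported to work, which is only a way to test the observation and not an explanation of the failure.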

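Then I looked through the libvirtd log (libvirtd is the libvirt daemon on the host) for messages related to the keepalive mechanism and found nothing that looked wrong. These are the keepalive log entries captured while I reproduced the problem: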

debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,client=0xxxxxxxxxxxx2,msg=0xxxxxxxxxxxx3
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx4
info : virKeepAliveTimerInternal:136 : RPC_KEEPALIVE_TIMEOUT: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 countToDeath=5 idle=5
debug : virKeepAliveMessage:104 : Sending keepalive request to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,client=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxx13
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:395 : Got keepalive request from client 0xxxxxxxxxxxx5
debug : virKeepAliveMessage:104 : Sending keepalive response to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxxx9
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:400 : Got keepalive response from client 0xxxxxxxxxxxx5
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx7
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx8
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx9
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxx10
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx8
info : virKeepAliveTimerInternal:136 : RPC_KEEPALIVE_TIMEOUT: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 countToDeath=5 idle=5
debug : virKeepAliveMessage:104 : Sending keepalive request to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxx11
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:395 : Got keepalive request from client 0xxxxxxxxxxxx5
debug : virKeepAliveMessage:104 : Sending keepalive response to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxx12
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:400 : Got keepalive response from client 0xxxxxxxxxxxx5
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxx12
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx4
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx7
info : virKeepAliveTimerInternal:136 : RPC_KEEPALIVE_TIMEOUT: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 countToDeath=5 idle=5
debug : virKeepAliveMessage:104 : Sending keepalive request to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxx13
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx3
info : virKeepAliveTimerInternal:136 : RPC_KEEPALIVE_TIMEOUT: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 countToDeath=5 idle=5
debug : virKeepAliveMessage:104 : Sending keepalive request to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxxx8
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:395 : Got keepalive request from client 0xxxxxxxxxxxx5
debug : virKeepAliveMessage:104 : Sending keepalive response to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxx11
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:400 : Got keepalive response from client 0xxxxxxxxxxxx5
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0x5585245c7680
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:395 : Got keepalive request from client 0xxxxxxxxxxxx5
debug : virKeepAliveMessage:104 : Sending keepalive response to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxx10
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:400 : Got keepalive response from client 0xxxxxxxxxxxx5
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0xxxxxxxxxxxx9
info : virKeepAliveTimerInternal:136 : RPC_KEEPALIVE_TIMEOUT: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 countToDeath=5 idle=5
debug : virKeepAliveMessage:104 : Sending keepalive request to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxxx7
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:400 : Got keepalive response from client 0xxxxxxxxxxxx5
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx1,msg=0x55852460af60
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=1
debug : virKeepAliveCheckMessage:395 : Got keepalive request from client 0xxxxxxxxxxxx5
debug : virKeepAliveMessage:104 : Sending keepalive response to client 0xxxxxxxxxxxx5
info : virKeepAliveMessage:107 : RPC_KEEPALIVE_SEND: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:374 : ka=0xxxxxxxxxxxx5,msg=0xxxxxxxxxxxx4
info : virKeepAliveCheckMessage:391 : RPC_KEEPALIVE_RECEIVED: ka=0xxxxxxxxxxxx5 client=0xxxxxxxxxxxx5 prog=xxxxxxxxx5 vers=1 proc=2
debug : virKeepAliveCheckMessage:400 : Got keepalive response from client 0xxxxxxxxxxxx5

Then I searched the libvirtd log for errors and warnings. There were none while the problem was being reproduced.
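
Both log checks (keepalive traffic, and errors or warnings) can be done mechanically instead of by eye. The sketch below is a hypothetical helper, summarize_log, that assumes the line layout shown in the excerpt above: a level, then " : ", then function:line, then the message. In that excerpt proc=1 accompanies keepalive requests and proc=2 accompanies responses, and the helper relies on that mapping.

```python
import re
from collections import Counter

EVENT = re.compile(r"RPC_KEEPALIVE_(SEND|RECEIVED|TIMEOUT):")
PROC = re.compile(r"\bproc=(\d+)")
PROBLEM = re.compile(r"(?:^|\s)(?:error|warning) : ")
PROC_NAMES = {"1": "request", "2": "response"}


def summarize_log(lines):
    """Return (keepalive_counts, problem_lines) for libvirtd log lines.

    keepalive_counts maps (event, kind) to a count, e.g.
    ("SEND", "request"); kind is None for events without proc=.
    """
    keepalive = Counter()
    problems = []
    for line in lines:
        if PROBLEM.search(line):
            problems.append(line.rstrip("\n"))
        event = EVENT.search(line)
        if event:
            proc = PROC.search(line)
            kind = None
            if proc:
                kind = PROC_NAMES.get(proc.group(1), "proc" + proc.group(1))
            keepalive[(event.group(1), kind)] += 1
    return keepalive, problems
```

Comparing the ("SEND", "request") count with the ("RECEIVED", "response") count shows whether requests went unanswered, and an empty problem list confirms that the daemon logged no errors or warnings in the captured window.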

Why does the method fail when there are more than 3 threads?

Notes:

  • Internally, snapshot_revert uses libvirt-python's revertToSnapshot method.
  • As far as I know, webvirtcloud is built on libvirt-python.

P.S. Feel free to ask for any additional information.
