Looking for an overview of datacenter network benchmarks

Problem description

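Every few weeks we run into trouble with some clustered services hosted in a datacenter of one of our hosting providers. The cause: a member of the service cluster fails its readiness check because a 1-second timeout is reached. That member is then marked as not ready and needs several more seconds to resync. Of course we could raise the timeout to 5s, 10s or even more, but that could mean a resync takes longer, and so on.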

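The details are not important, but the failures seem to be related to possible network issues, and the question we are being challenged on is: what should one expect from the network inside an enterprise-grade datacenter? We observe ping spikes at the millisecond level: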

[1599828761.569137] 64 bytes from node-01 (10.0.1.12): icmp_seq=894 ttl=64 time=0.782 ms
[1599828762.569133] 64 bytes from node-01 (10.0.1.12): icmp_seq=895 ttl=64 time=0.764 ms
[1599828763.569087] 64 bytes from node-01 (10.0.1.12): icmp_seq=896 ttl=64 time=0.702 ms
[1599828764.569173] 64 bytes from node-01 (10.0.1.12): icmp_seq=897 ttl=64 time=0.802 ms
[1599828765.569108] 64 bytes from node-01 (10.0.1.12): icmp_seq=898 ttl=64 time=0.766 ms
[1599828766.569147] 64 bytes from node-01 (10.0.1.12): icmp_seq=899 ttl=64 time=0.763 ms
[1599828767.569123] 64 bytes from node-01 (10.0.1.12): icmp_seq=900 ttl=64 time=0.749 ms
[1599828768.569045] 64 bytes from node-01 (10.0.1.12): icmp_seq=901 ttl=64 time=0.678 ms
[1599828769.569069] 64 bytes from node-01 (10.0.1.12): icmp_seq=902 ttl=64 time=0.700 ms
[1599828770.569159] 64 bytes from node-01 (10.0.1.12): icmp_seq=903 ttl=64 time=0.804 ms
[1599828771.569084] 64 bytes from node-01 (10.0.1.12): icmp_seq=904 ttl=64 time=0.742 ms
[1599828772.578682] 64 bytes from node-01 (10.0.1.12): icmp_seq=905 ttl=64 time=10.2 ms
[1599828773.570497] 64 bytes from node-01 (10.0.1.12): icmp_seq=906 ttl=64 time=0.732 ms
[1599828774.569501] 64 bytes from node-01 (10.0.1.12): icmp_seq=907 ttl=64 time=0.736 ms
[1599828775.569106] 64 bytes from node-01 (10.0.1.12): icmp_seq=908 ttl=64 time=0.745 ms
[1599828776.569126] 64 bytes from node-01 (10.0.1.12): icmp_seq=909 ttl=64 time=0.761 ms
[1599828777.569114] 64 bytes from node-01 (10.0.1.12): icmp_seq=910 ttl=64 time=0.751 ms
[1599828778.569102] 64 bytes from node-01 (10.0.1.12): icmp_seq=911 ttl=64 time=0.711 ms
[1599828779.569143] 64 bytes from node-01 (10.0.1.12): icmp_seq=912 ttl=64 time=0.782 ms
[1599828780.569086] 64 bytes from node-01 (10.0.1.12): icmp_seq=913 ttl=64 time=0.730 ms
[1599828781.569197] 64 bytes from node-01 (10.0.1.12): icmp_seq=914 ttl=64 time=0.832 ms
[1599828782.569102] 64 bytes from node-01 (10.0.1.12): icmp_seq=915 ttl=64 time=0.754 ms
[1599828783.569116] 64 bytes from node-01 (10.0.1.12): icmp_seq=916 ttl=64 time=0.741 ms
[1599828784.569087] 64 bytes from node-01 (10.0.1.12): icmp_seq=917 ttl=64 time=0.723 ms
[1599828785.569101] 64 bytes from node-01 (10.0.1.12): icmp_seq=918 ttl=64 time=0.753 ms
[1599828786.569194] 64 bytes from node-01 (10.0.1.12): icmp_seq=919 ttl=64 time=0.813 ms
[1599828787.569132] 64 bytes from node-01 (10.0.1.12): icmp_seq=920 ttl=64 time=0.772 ms
[1599828788.569032] 64 bytes from node-01 (10.0.1.12): icmp_seq=921 ttl=64 time=0.693 ms
[1599828789.569071] 64 bytes from node-01 (10.0.1.12): icmp_seq=922 ttl=64 time=0.722 ms
[1599828790.569136] 64 bytes from node-01 (10.0.1.12): icmp_seq=923 ttl=64 time=0.789 ms
[1599828791.569371] 64 bytes from node-01 (10.0.1.12): icmp_seq=924 ttl=64 time=0.816 ms
[1599828792.569149] 64 bytes from node-01 (10.0.1.12): icmp_seq=925 ttl=64 time=0.778 ms
[1599828793.569110] 64 bytes from node-01 (10.0.1.12): icmp_seq=926 ttl=64 time=0.745 ms
[1599828794.569139] 64 bytes from node-01 (10.0.1.12): icmp_seq=927 ttl=64 time=0.761 ms
[1599828795.569153] 64 bytes from node-01 (10.0.1.12): icmp_seq=928 ttl=64 time=0.810 ms
[1599828796.585378] 64 bytes from node-01 (10.0.1.12): icmp_seq=929 ttl=64 time=16.9 ms
[1599828797.570244] 64 bytes from node-01 (10.0.1.12): icmp_seq=930 ttl=64 time=0.761 ms
[1599828798.569296] 64 bytes from node-01 (10.0.1.12): icmp_seq=931 ttl=64 time=0.813 ms
[1599828799.569069] 64 bytes from node-01 (10.0.1.12): icmp_seq=932 ttl=64 time=0.718 ms
[1599828800.569112] 64 bytes from node-01 (10.0.1.12): icmp_seq=933 ttl=64 time=0.765 ms
[1599828801.569122] 64 bytes from node-01 (10.0.1.12): icmp_seq=934 ttl=64 time=0.749 ms
[1599828802.569153] 64 bytes from node-01 (10.0.1.12): icmp_seq=935 ttl=64 time=0.776 ms
[1599828803.569099] 64 bytes from node-01 (10.0.1.12): icmp_seq=936 ttl=64 time=0.723 ms
[1599828804.569049] 64 bytes from node-01 (10.0.1.12): icmp_seq=937 ttl=64 time=0.691 ms
[1599828805.569073] 64 bytes from node-01 (10.0.1.12): icmp_seq=938 ttl=64 time=0.713 ms
[1599828806.569173] 64 bytes from node-01 (10.0.1.12): icmp_seq=939 ttl=64 time=0.748 ms
[1599828807.569141] 64 bytes from node-01 (10.0.1.12): icmp_seq=940 ttl=64 time=0.786 ms
[1599828808.569172] 64 bytes from node-01 (10.0.1.12): icmp_seq=941 ttl=64 time=0.788 ms
[1599828809.569101] 64 bytes from node-01 (10.0.1.12): icmp_seq=942 ttl=64 time=0.750 ms
[1599828810.569221] 64 bytes from node-01 (10.0.1.12): icmp_seq=943 ttl=64 time=0.790 ms
[1599828811.569141] 64 bytes from node-01 (10.0.1.12): icmp_seq=944 ttl=64 time=0.745 ms

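It looks like, a few times per minute, the response time of our ICMP packets jumps to 10x-20x the usual value (from about 0.75 ms to 10.2 ms and 16.9 ms in the capture above).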
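To put a number on "10x-20x", one option is to compare each round-trip time against the median of the capture. The following is only a minimal sketch: the function names `parse_rtts` and `find_spikes` and the threshold of 5 times the median are assumptions made for illustration, not something established in this post.

```python
import re
import statistics

# Matches the reply lines shown above, e.g. "... icmp_seq=905 ttl=64 time=10.2 ms"
PING_LINE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def parse_rtts(lines):
    """Return (icmp_seq, rtt_ms) pairs for every line that looks like a ping reply."""
    samples = []
    for line in lines:
        match = PING_LINE.search(line)
        if match:
            samples.append((int(match.group(1)), float(match.group(2))))
    return samples


def find_spikes(samples, factor=5.0):
    """Return the median RTT and the samples slower than factor * median.

    The factor of 5 is an arbitrary illustrative threshold (assumption).
    Each spike is reported as (icmp_seq, rtt_ms, ratio_to_median).
    """
    baseline = statistics.median(rtt for _, rtt in samples)
    spikes = [
        (seq, rtt, rtt / baseline)
        for seq, rtt in samples
        if rtt > factor * baseline
    ]
    return baseline, spikes


if __name__ == "__main__":
    # Three lines copied from the capture above, used as demo input.
    demo = [
        "[1599828771.569084] 64 bytes from node-01 (10.0.1.12): icmp_seq=904 ttl=64 time=0.742 ms",
        "[1599828772.578682] 64 bytes from node-01 (10.0.1.12): icmp_seq=905 ttl=64 time=10.2 ms",
        "[1599828773.570497] 64 bytes from node-01 (10.0.1.12): icmp_seq=906 ttl=64 time=0.732 ms",
    ]
    baseline, spikes = find_spikes(parse_rtts(demo))
    print(f"median RTT: {baseline} ms")
    for seq, rtt, ratio in spikes:
        print(f"icmp_seq={seq}: {rtt} ms ({ratio:.1f}x median)")
```

Using the median rather than the mean keeps the baseline from being pulled upward by the spikes themselves. The same functions can be applied to a full saved ping log by passing its lines to `parse_rtts`.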

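Is this bad, good, or normal behavior? What should be expected on average, both inside a single datacenter and between two datacenters of the same provider that are connected by dedicated, redundant links? Are spikes like these acceptable?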
