A graceful way to handle goroutines stuck in IO wait in gRPC

Problem description

Our server (grpc-gateway + grpc) runs on K8S with Go 1.13, and it was terminated with the following stack trace:

    Last State:    Terminated
      Reason:      Error
      Message:     o.(*Reader).fill(0xc002ec78c0)
                   /Users/local/go/src/bufio/bufio.go:100 +0x103
bufio.(*Reader).Peek(0xc002ec78c0,0x4,0x0,0xc002e79ad0)
  /Users/local/go/src/bufio/bufio.go:138 +0x4f
net/http.(*conn).readRequest(0xc00245a000,0x1ef4800,0xc000be7780,0x0)
  /Users/local/go/src/net/http/server.go:962 +0xb3b
net/http.(*conn).serve(0xc00245a000,0xc000be7780)
  /Users/local/go/src/net/http/server.go:1817 +0x6d4
created by net/http.(*Server).Serve
  /Users/local/go/src/net/http/server.go:2928 +0x384

goroutine 8724981 [IO wait]:
internal/poll.runtime_pollWait(0x7f5d3a8f84e8,0x72,0xffffffffffffffff)
  /Users/local/go/src/runtime/netpoll.go:184 +0x55
internal/poll.(*pollDesc).wait(0xc000155d98,0x1000,0xffffffffffffffff)
  /Users/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
  /Users/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000155d80,0xc00138b000,0x0)
  /Users/local/go/src/internal/poll/fd_unix.go:169 +0x1cf
net.(*netFD).Read(0xc000155d80,0xc00142b9e8,0x4ce13d,0xc000155d80)
  /Users/local/go/src/net/fd_unix.go:202 +0x4f
net.(*conn).Read(0xc0000dd690,0x0)
  /Users/local/go/src/net/net.go:184 +0x68
net/http.(*connReader).Read(0xc001e9b050,0x0)
  /Users/local/go/src/net/http/server.go:785 +0xf4
bufio.(*Reader).fill(0xc0021fe060)
  /Users/local/go/src/bufio/bufio.go:100 +0x103
bufio.(*Reader).Peek(0xc0021fe060,0xc00142bad0)
  /Users/local/go/src/bufio/bufio.go:138 +0x4f
net/http.(*conn).readRequest(0xc0008f43c0,0xc000503640,0x0)
  /Users/local/go/src/net/http/server.go:962 +0xb3b
net/http.(*conn).serve(0xc0008f43c0,0xc000503640)
  /Users/local/go/src/net/http/server.go:1817 +0x6d4
created by net/http.(*Server).Serve
  /Users/local/go/src/net/http/server.go:2928 +0x384

      Exit Code:    2

According to this question, one possible solution is:

>  s := new(http.Server)
>  // ...
>  s.ReadTimeout = 5 * time.Second
>  s.WriteTimeout = 5 * time.Second
>  // ...

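However, we could not find anything like ReadTimeout in grpc.NewServer. Are we missing something? Or is there a more graceful way to handle goroutines in the IO wait state in gRPC?

The grpc version is v1.21.1.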

Solution

grpc.NewServer accepts zero or more ServerOption values at creation time:

func NewServer(opt ...ServerOption) *Server

While there does not appear to be an equivalent of http.Server's ReadTimeout or WriteTimeout, you can try keepalive.ServerParameters:

type ServerParameters struct {
    MaxConnectionIdle     time.Duration // The current default value is infinity.
    MaxConnectionAge      time.Duration // The current default value is infinity.
    MaxConnectionAgeGrace time.Duration // The current default value is infinity.
    Time                  time.Duration // The current default value is 2 hours.
    Timeout               time.Duration // The current default value is 20 seconds.
}

(Full documentation: keepalive.ServerParameters)

and lower keepalive.ServerParameters.Time to something below 2 hours:

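srv := grpc.NewServer(
    // keepalive.ServerParameters is not itself a ServerOption; wrap it with grpc.KeepaliveParams.
    grpc.KeepaliveParams(keepalive.ServerParameters{Time: 5 * time.Minute}),
)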

This will reduce connection reuse, but it will also free up client connections that have long since died.


You can use a context with a timeout.

https://golang.org/pkg/context/