Problem description
I have built a gRPC server-side streaming service (in Rust) that streams messages from a Kafka topic on cluster A. Each time an external client opens a websocket connection, a Go client on cluster B connects to the RPC service, and messages are streamed over this connection to the external client.
However, the Go client does not seem to consume reliably from the Rust server (even though the messages have definitely been committed). Even when the websocket is clearly reading from the external client, it sometimes does not consume from the gRPC stream. I have read through the documentation and tried to identify any resource leaks that could cause this, but I have not been able to resolve the problem.
I suspect the issue lies in the context handling or in how the connection is closed (when the stream initially works, closing the websocket and reopening it causes messages to stop being delivered).
Go client code
type KafkaChannel struct {
    Client bridge.KafkaStreamClient
}

func (kc *KafkaChannel) Consume(topic string) (<-chan []byte, func() error) {
    readChan := make(chan []byte)
    // Context with cancellation to close the routine
    ctx, cancel := context.WithCancel(context.Background())
    stream, err := kc.Client.Consume(ctx, &bridge.ConsumeRequest{
        Topic: topic,
    })
    if err != nil {
        log.Fatalf("Error creating consumer stream: %v", err)
    }
    // Launch listener in a new goroutine
    go func(reader *chan []byte, consumeStream *bridge.KafkaStream_ConsumeClient) {
        // Recover from a panic (resources consumed)
        defer func() {
            if recover() != nil {
                log.Println("Consumer routine closed")
            }
        }()
        for {
            response, err := stream.Recv()
            if err != nil {
                log.Printf("Error receiving from consumer stream: %v", err)
                break
            }
            switch data := response.OptionalContent.(type) {
            case *bridge.KafkaResponse_Content:
                *reader <- data.Content
            default:
            }
        }
    }(&readChan, &stream)
    // Create a callback func that frees the resources
    closeCallback := func() error {
        err := stream.CloseSend()
        close(readChan)
        cancel()
        return err
    }
    return readChan, closeCallback
}
Websocket
type ChatHandler struct {
    Database       *db.Queries
    Client         *grpc.ClientConn
    Context        context.Context
    SessionChannel chan []byte
}

func (handler *ChatHandler) GetChatConnection(c *websocket.Conn) {
    // initialisation...
    consumer, closeConsume := kChannel.Consume(topic)
    for msg := range consumer {
        log.Printf("Received message from bridge: %s", string(msg))
        writeMessageStart := time.Now()
        if err = c.WriteMessage(1, msg); err != nil { // 1 = websocket.TextMessage
            log.Printf("Error writing message: %v", err)
            writeMessageElapsed := time.Since(writeMessageStart)
            log.Printf("Write time elapsed error: %s", writeMessageElapsed)
            if errors.Is(err, syscall.EPIPE) {
                log.Printf("Sys error: %v", err)
                // continue
            }
            closeConsume()
            handler.close(c)
            return
        }
        writeMessageElapsed := time.Since(writeMessageStart)
        log.Printf("Write time elapsed no error: %s", writeMessageElapsed)
    }
}
Rust server-side code
For completeness:
async fn consume(
    &self,
    request: Request<ConsumeRequest>,
) -> Result<Response<Self::ConsumeStream>, Status> {
    let (tx, rx) = mpsc::unbounded_channel();
    info!("Initiated read-only stream");
    tokio::spawn(async move {
        let message = match Some(request.get_ref()) {
            Some(x) => x,
            None => return,
        };
        let topic = message.topic.clone();
        info!("Consuming on topic: {}", topic);
        let consumer = create_kafka_consumer(topic);
        loop {
            let result = consumer.stream().next().await;
            match result {
                None => {
                    warn!("Received none-type from consumer stream");
                    continue;
                }
                Some(Err(e)) => {
                    error!("Error consuming from kafka broker: {:?}", e);
                    continue;
                }
                Some(Ok(message)) => {
                    let payload = match message.payload_view::<str>() {
                        None => {
                            warn!("Received none-type when unpacking payload");
                            continue;
                        }
                        Some(Ok(s)) => {
                            info!("Received payload: {:?}", s);
                            s
                        }
                        Some(Err(e)) => {
                            error!("Error viewing payload contents: {}", e);
                            return;
                        }
                    };
                    info!("Received message from broker in read-only stream");
                    if payload.len() > 0 {
                        info!("Sending payload {:?}", payload);
                        match tx.send(Ok(KafkaResponse {
                            success: true,
                            optional_content: Some(
                                bridge::kafka_response::OptionalContent::Content(
                                    payload.as_bytes().to_vec(),
                                ),
                            ),
                        })) {
                            Ok(_) => info!("Successfully sent payload to client"),
                            Err(e) => {
                                error!("GRPC error sending message to client {:?}", e);
                                return;
                            }
                        }
                    } else {
                        warn!("No content detected in payload from broker");
                    }
                    match consumer.commit_message(&message, CommitMode::Async) {
                        Ok(_) => (),
                        Err(e) => {
                            error!("Error committing a consumed message: {:?}", e);
                            return;
                        }
                    }
                }
            }
        }
    });
    Ok(Response::new(Box::pin(
        tokio_stream::wrappers::UnboundedReceiverStream::new(rx),
    )))
}
Solution
The problem was a resource leak in the Go client and, to some extent, in the Rust server as well.
Closing the Go client's stream gracefully:
For a streaming RPC named Consume, it is important to initialise the stream with a cancellable context so that the runtime can free the resources associated with the multiplexed RPC:

ctx, cancel := context.WithCancel(context.Background())
stream, err := handler.Client.Consume(ctx, &bridge.ConsumeRequest{
    Topic: topic,
})
closeCallback := func() {
    stream.CloseSend()
    cancel()
    c.Close() // where c := *websocket.Conn
}
defer closeCallback()

Deferring this callback closes the connection from the client's side (on the server this surfaces as an error the next time it tries to send a message).
Detecting cancel() on the server side in Rust:
Calling client.CloseSend() on the Go client does not tell the server to terminate the connection; it merely signals that the client will send no more messages. In this particular case the stream is served from an asynchronous task inside the streaming RPC, so to shut the connection down gracefully when the client disconnects from the websocket, the server has to detect the cancellation and close its sending channel to end the connection.
Note the motivation for this: without it, the server keeps trying to push packets to a "dead" receiver, believing the connection is still alive.
The fix, adapted from this thread, hinges on the UnboundedReceiver: when the client cancels, the gRPC runtime drops the UnboundedReceiverStream backing the response, so the receiving half of the channel is dropped and the sending half can observe that the connection is gone.