Exception in the SynchronizationContext when calling OperationFuture.get() outside the call stack frame that created the future

Problem description

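I am using Google's DocumentAI SDK, but the error appears to originate in the gRPC SDK. I call an asynchronous operation in DocumentAI, which returns an OperationFuture. When I call OperationFuture.get() inside the call stack frame that created the future, the code blocks correctly until the future completes and then continues normally. However, if the method that created the future returns and I call OperationFuture.get() outside of the frame that created it, I always get an exception with the following stack trace: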

io.grpc.internal.ManagedChannelImpl$2 uncaughtException
SEVERE: [Channel<1>: (us-documentai.googleapis.com:443)] Uncaught exception in the SynchronizationContext. Panic!
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@60d4b478 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@1e3a60f5[Terminated,pool size = 0,active threads = 0,queued tasks = 0,completed tasks = 0]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
    at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
    at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
    at java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:622)
    at io.grpc.internal.ManagedChannelImpl$RealChannel$PendingCall.reprocess(ManagedChannelImpl.java:1089)
    at io.grpc.internal.ManagedChannelImpl$RealChannel.updateConfigSelector(ManagedChannelImpl.java:1022)
    at io.grpc.internal.ManagedChannelImpl$NameResolverListener$1NamesResolved.run(ManagedChannelImpl.java:1729)
    at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:95)
    at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:127)
    at io.grpc.internal.ManagedChannelImpl$NameResolverListener.onResult(ManagedChannelImpl.java:1815)
    at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:333)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Sample code

public class Engine {
    // Read by Main after startAsync() has returned
    OperationFuture operationFuture;

    public void startAsync() {
        ...
        OperationFuture future = googleClient.doAsyncRequest();
        // calling future.get(); here works fine
        this.operationFuture = future;
        ...
    }
}

public class Main {
  public static void main(String[] args) {
    ...
    Engine eng = new Engine();
    eng.startAsync();
    eng.operationFuture.get(); // this doesn't work
    ...
  }
}

Solution

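I found the root cause. The operation works inside the stack frame but not outside it because the googleClient object, which creates and manages the operationFuture object, has its own ExecutorService that handles the operationFuture lifecycle. Once startAsync() returns, the googleClient object goes out of scope and is released, which also releases all the threads associated with it and breaks the operationFuture object. This matches the stack trace, where the task is rejected by a ScheduledThreadPoolExecutor that is already in the Terminated state.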

To fix this, the googleClient object must be kept alive in memory together with the operationFuture. Using DocumentAI as an example:

public void startAsync() {
    ...
    // Keep the client in a field so that it outlives this method
    this.googleClient = DocumentProcessorServiceClient.create(docAISettings);
    this.operationFuture = this.googleClient.doAsyncRequest();
}

As long as neither the future object nor the client object goes out of scope, calling operationFuture.get() from outside now works correctly.
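To illustrate the lifetime issue without the Google SDK, here is a minimal sketch that uses only java.util.concurrent. FakeClient and FakeOperation are hypothetical stand-ins, not real SDK classes, and the model is deliberately simplified: it assumes the future's remaining work is submitted to the client's executor when get() is called, and it closes the client explicitly to mimic the client being released. The real SDK schedules its work differently, so treat this as an analogy rather than a reproduction of the gRPC behavior.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

class ClientLifetimeDemo {

    /** Hypothetical stand-in for an SDK client that owns the executor its futures rely on. */
    static final class FakeClient implements AutoCloseable {
        private final ExecutorService executor = Executors.newSingleThreadExecutor();

        FakeOperation doAsyncRequest() {
            return new FakeOperation(executor);
        }

        @Override
        public void close() {
            // After shutdown() the executor rejects every newly submitted task.
            executor.shutdown();
        }
    }

    /** Hypothetical stand-in for OperationFuture: finishing it needs the client's executor. */
    static final class FakeOperation {
        private final ExecutorService clientExecutor;

        FakeOperation(ExecutorService clientExecutor) {
            this.clientExecutor = clientExecutor;
        }

        String get() {
            try {
                // Simplification: the remaining work is submitted when get() is called.
                return clientExecutor.submit(() -> "done").get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    /** Broken pattern: the client is released before the caller waits on the future. */
    static String getAfterClose() {
        FakeClient client = new FakeClient();
        FakeOperation operation = client.doAsyncRequest();
        client.close(); // mimics the client dying when startAsync() returns
        try {
            return operation.get();
        } catch (RejectedExecutionException e) {
            return "rejected";
        }
    }

    /** Working pattern: the client stays alive until the result has been read. */
    static String getBeforeClose() {
        FakeClient client = new FakeClient();
        try {
            return client.doAsyncRequest().get();
        } finally {
            client.close();
        }
    }

    public static void main(String[] args) {
        System.out.println(getBeforeClose()); // prints "done"
        System.out.println(getAfterClose());  // prints "rejected"
    }
}
```

In the second case the executor has already been shut down, so the submit is refused with a RejectedExecutionException, the same exception type that appears at the top of the stack trace in the question. Keeping the client in a field, and closing it only after get() has returned, avoids that.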

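I also tried giving the googleClient object a custom thread pool and letting the client be garbage-collected (that is, the googleClient object dies but the threads running the operationFuture do not), but that did not seem to work, and I do not know why.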