`impyla` connection to Hive on Dataproc: impala.error.HiveServer2Error: Invalid OperationHandle: OperationHandle

Problem description

I am using impyla to connect to Hive on Dataproc. The connection is created like this:

        import impala.dbapi
        conn = impala.dbapi.connect(
            host=host, port=10000, user=None, password=None,
            auth_mechanism='PLAIN', use_ssl=False)

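The connection needs no username or password. With pyhive/pyodbc I can connect and run queries, but they succeed only intermittently. With impyla, the connection is established and the first few steps go through, but it always fails after that. The log below shows the details.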

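Because this fails every time, it is more predictable than the pyhive/pyodbc case. I am hoping it is a configuration issue that can be fixed, after which impyla might give me less of a headache than pyhive/pyodbc.

Any pointers? Thanks!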
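
To rule out the wrapper library (`vale` in the log below), a stripped-down reproduction could look like the following sketch. The function name `show_databases` and the 60-second value are made up for illustration. It assumes impyla's `connect` accepts a `timeout` argument in seconds, so that a stalled read fails quickly instead of blocking for minutes. The `connect` parameter exists only so the function can be tried with a stand-in when no cluster is available. The sketch does not pass the large `confOverlay` seen in the log.

```python
def show_databases(host, connect=None, timeout=60):
    """Run SHOW DATABASES over a fresh connection and return the rows."""
    if connect is None:
        # Imported lazily so the function can be exercised with a stand-in.
        from impala.dbapi import connect
    conn = connect(
        host=host, port=10000, user=None, password=None,
        auth_mechanism='PLAIN', use_ssl=False,
        timeout=timeout,  # assumption: seconds, applied to the socket
    )
    try:
        cursor = conn.cursor()
        cursor.execute('SHOW DATABASES')
        return cursor.fetchall()
    finally:
        # Always release the connection, even if the fetch raises.
        conn.close()
```

Calling `show_databases('10.22.174.52')` against the cluster would show whether the failure also occurs without the wrapper and its retry logic.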


pyhive_client: impyla

[2021-03-13 01:28:00.451 US/Pacific; INFO; vale.sql._hive,init,420] using Hive client "impyla"
[2021-03-13 01:28:01.243 US/Pacific; INFO; vale.gcp._dataproc,create_cluster,523] creating cluster "test-sql-hive-debug" ...
[2021-03-13 01:33:10.712 US/Pacific; INFO; vale.gcp._dataproc,535] cluster "test-sql-hive-debug" created
[2021-03-13 01:33:11.854 US/Pacific; INFO; vale.sql._hive,_get_host,713] using cluster "test-sql-hive-debug" at 10.22.174.52
[2021-03-13 01:33:11.854 US/Pacific; INFO; vale.sql._hive,connect,560] connecting to Hive server 10.22.174.52 ...
[2021-03-13 01:33:11.854 US/Pacific; DEBUG; impala.hiveserver2,789] Connecting to HiveServer2 10.22.174.52:10000 with PLAIN authentication mechanism
[2021-03-13 01:33:11.854 US/Pacific; DEBUG; impala._thrift_api,get_socket,109] get_socket: host=10.22.174.52 port=10000 use_ssl=False ca_cert=None
[2021-03-13 01:33:11.854 US/Pacific; DEBUG; impala.hiveserver2,822] sock=<thriftpy2.transport.socket.TSocket object at 0x7fb7b4e18f50>
[2021-03-13 01:33:11.855 US/Pacific; DEBUG; impala._thrift_api,get_transport,180] get_transport: socket=<thriftpy2.transport.socket.TSocket object at 0x7fb7b4e18f50> host=10.22.174.52 kerberos_service_name=impala auth_mechanism=PLAIN user=None password=fuggetaboutit
[2021-03-13 01:33:11.856 US/Pacific; DEBUG; impala._thrift_api,189] get_transport: user=root
[2021-03-13 01:33:11.856 US/Pacific; DEBUG; impala._thrift_api,196] get_transport: password=password
[2021-03-13 01:33:12.079 US/Pacific; DEBUG; impala.hiveserver2,835] transport=<thrift_sasl.TSaslClientTransport object at 0x7fb7b4e18c50> protocol=<thriftpy2.protocol.binary.TBinaryProtocol object at 0x7fb7b4e18cd0> service=<thriftpy2.thrift.TClient object at 0x7fb7b4e18650>
[2021-03-13 01:33:12.080 US/Pacific; DEBUG; impala.hiveserver2,58] HiveServer2Connection(service=<impala.hiveserver2.HS2Service object at 0x7fb7b4e18c10>,default_db=None)
[2021-03-13 01:33:12.080 US/Pacific; INFO; vale.sql._hive,571] ... connected to Hive server 10.22.174.52
[2021-03-13 01:33:12.080 US/Pacific; DEBUG; impala.hiveserver2,cursor,117] Getting a cursor (Impala session)
[2021-03-13 01:33:12.080 US/Pacific; DEBUG; impala.hiveserver2,122] .cursor(): getting new session_handle
[2021-03-13 01:33:12.080 US/Pacific; DEBUG; impala.hiveserver2,_log_request,1031] OpenSession: req=TOpenSessionReq(client_protocol=5,username='root',configuration=None)
[2021-03-13 01:33:12.081 US/Pacific; DEBUG; impala.hiveserver2,_execute,1006] Attempting to open transport (tries_left=3)
[2021-03-13 01:33:12.081 US/Pacific; DEBUG; impala.hiveserver2,1008] Transport opened
[2021-03-13 01:33:12.272 US/Pacific; DEBUG; impala.hiveserver2,_log_response,1034] OpenSession: resp=TOpenSessionResp(status=TStatus(statusCode=0,infoMessages=None,sqlState=None,errorCode=None,errorMessage=None),serverProtocolVersion=5,sessionHandle=TSessionHandle(sessionId=THandleIdentifier(guid=b'\xbb4\x1d\xe6\x08\x08Mw\xb4\xafu\xbcO\xab\xe9B',secret=b'\xd9>\r\xdc\x9d\x15Kk\xb9K\xc13\r\x1d\x9a\xd0')),configuration={'hive.server2.thrift.resultset.default.fetch.size': '1000'})
[2021-03-13 01:33:12.272 US/Pacific; DEBUG; impala.hiveserver2,129] HiveServer2Cursor(service=<impala.hiveserver2.HS2Service object at 0x7fb7b4e18c10>,session_handle=TSessionHandle(sessionId=THandleIdentifier(guid=b'\xbb4\x1d\xe6\x08\x08Mw\xb4\xafu\xbcO\xab\xe9B',default_config={'hive.server2.thrift.resultset.default.fetch.size': '1000'},hs2_protocol_version=5)
[2021-03-13 01:33:12.272 US/Pacific; DEBUG; vale.sql._sql,execute,51] executing sql statement:
SHOW DATABASES
args:
()
kwargs:
{}
[2021-03-13 01:33:12.273 US/Pacific; DEBUG; impala.hiveserver2,execute_async,356] Executing query SHOW DATABASES
[2021-03-13 01:33:12.273 US/Pacific; DEBUG; impala.hiveserver2,_debug_log_state,386] _execute_async: self._buffer=Batch() self._description=None self._last_operation_active=False self._last_operation=None
[2021-03-13 01:33:12.273 US/Pacific; DEBUG; impala.hiveserver2,_reset_state,296] _reset_state: Resetting cursor state
[2021-03-13 01:33:12.273 US/Pacific; DEBUG; impala.hiveserver2,386] _execute_async: self._buffer=Batch() self._description=None self._last_operation_active=False self._last_operation=None
[2021-03-13 01:33:12.274 US/Pacific; DEBUG; impala.hiveserver2,1031] ExecuteStatement: req=TExecuteStatementReq(sessionHandle=TSessionHandle(sessionId=THandleIdentifier(guid=b'\xbb4\x1d\xe6\x08\x08Mw\xb4\xafu\xbcO\xab\xe9B',statement='SHOW DATABASES',confOverlay={'hive.auto.convert.join': 'false','hive.auto.convert.join.noconditionaltask': 'false','hive.bfd.odd.reducers': 'true','hive.cli.print.header': 'true','hive.exec.max.created.files': '500000','hive.exec.max.dynamic.partitions.pernode': '1000000','hive.exec.parallel': 'true','hive.groupby.orderby.position.alias': 'true','hive.hadoop.supports.splittable.combineinputformat': 'true','hive.merge.size.per.task': '2048000000','hive.merge.smallfiles.avgsize': '2048000000','hive.resultset.use.unique.column.names': 'false','mapred.min.split.size': '2048000000','mapred.max.split.size': '2048000000','mapred.map.tasks': '64','mapred.reduce.tasks': '64','mapreduce.input.fileinputformat.split.minsize': '268435456','mapreduce.input.fileinputformat.split.maxsize': '536870912','mapreduce.job.reduces': '3000','mapreduce.map.memory.mb': '36864','mapreduce.reduce.memory.mb': '36864','mapreduce.map.child.java.opts': '-Xmx36864M','mapreduce.reduce.child.java.opts': '-Xmx36864M','mapreduce.map.java.opts': '-Xmx36864M','mapreduce.reduce.java.opts': '-Xmx36864M','mapreduce.task.timeout': '12000000','yarn.nodemanager.resource.memory-mb': '20480','hive.exec.dynamic.partition': 'true','hive.exec.dynamic.partition.mode': 'nonstrict'},runAsync=True)
[2021-03-13 01:33:12.274 US/Pacific; DEBUG; impala.hiveserver2,1006] Attempting to open transport (tries_left=3)
[2021-03-13 01:33:12.274 US/Pacific; DEBUG; impala.hiveserver2,1008] Transport opened
[2021-03-13 01:33:13.326 US/Pacific; DEBUG; impala.hiveserver2,1034] ExecuteStatement: resp=TExecuteStatementResp(status=TStatus(statusCode=0,operationHandle=TOperationHandle(operationId=THandleIdentifier(guid=b'\xd0*\xba\xe2lQO+\x9c2\xfe\xb8]\xb0\xd0\x94',secret=b'\xa7D\xb8\xa4\x80F@\xca\x8fx\x10\x87\xfe\xa2\xa1\xd8'),operationType=0,hasResultSet=True,modifiedRowCount=None))
[2021-03-13 01:33:13.327 US/Pacific; DEBUG; impala.hiveserver2,386] _execute_async: self._buffer=Batch() self._description=None self._last_operation_active=True self._last_operation=TOperationHandle(operationId=THandleIdentifier(guid=b'\xd0*\xba\xe2lQO+\x9c2\xfe\xb8]\xb0\xd0\x94',modifiedRowCount=None)
[2021-03-13 01:33:13.327 US/Pacific; DEBUG; impala.hiveserver2,330] Waiting for query to finish
[2021-03-13 01:33:13.328 US/Pacific; DEBUG; impala.hiveserver2,1031] GetOperationStatus: req=TGetOperationStatusReq(operationHandle=TOperationHandle(operationId=THandleIdentifier(guid=b'\xd0*\xba\xe2lQO+\x9c2\xfe\xb8]\xb0\xd0\x94',modifiedRowCount=None))
[2021-03-13 01:33:13.328 US/Pacific; DEBUG; impala.hiveserver2,1006] Attempting to open transport (tries_left=3)
[2021-03-13 01:33:13.328 US/Pacific; DEBUG; impala.hiveserver2,1008] Transport opened
[2021-03-13 01:33:13.464 US/Pacific; DEBUG; impala.hiveserver2,1034] GetOperationStatus: resp=TGetOperationStatusResp(status=TStatus(statusCode=0,operationState=2,errorMessage=None,hasResultSet=True)
[2021-03-13 01:33:13.464 US/Pacific; DEBUG; impala.hiveserver2,_wait_to_finish,410] _wait_to_finish: waited 0.13668107986450195 seconds so far
[2021-03-13 01:33:13.465 US/Pacific; DEBUG; impala.hiveserver2,332] Query finished
[2021-03-13 01:33:13.465 US/Pacific; DEBUG; impala.hiveserver2,modifiedRowCount=None))
[2021-03-13 01:33:13.465 US/Pacific; DEBUG; impala.hiveserver2,1006] Attempting to open transport (tries_left=3)
[2021-03-13 01:33:13.466 US/Pacific; DEBUG; impala.hiveserver2,1008] Transport opened
[2021-03-13 01:33:13.559 US/Pacific; DEBUG; impala.hiveserver2,hasResultSet=True)
[2021-03-13 01:33:13.560 US/Pacific; DEBUG; impala.hiveserver2,410] _wait_to_finish: waited 0.09481930732727051 seconds so far
[2021-03-13 01:33:13.560 US/Pacific; DEBUG; impala.hiveserver2,fetchall,534] Fetching all result rows
[2021-03-13 01:33:13.560 US/Pacific; DEBUG; impala.hiveserver2,next,580] next: buffer empty and op is active => fetching more data
[2021-03-13 01:33:13.561 US/Pacific; DEBUG; impala.hiveserver2,description,185] description=None has_result_set=True => getting schema
[2021-03-13 01:33:13.561 US/Pacific; DEBUG; impala.hiveserver2,1031] GetResultSetMetadata: req=TGetResultSetMetadataReq(operationHandle=TOperationHandle(operationId=THandleIdentifier(guid=b'\xd0*\xba\xe2lQO+\x9c2\xfe\xb8]\xb0\xd0\x94',modifiedRowCount=None))
[2021-03-13 01:33:13.561 US/Pacific; DEBUG; impala.hiveserver2,1006] Attempting to open transport (tries_left=3)
[2021-03-13 01:33:13.561 US/Pacific; DEBUG; impala.hiveserver2,1008] Transport opened
[2021-03-13 01:33:13.669 US/Pacific; DEBUG; impala.hiveserver2,1034] GetResultSetMetadata: resp=TGetResultSetMetadataResp(status=TStatus(statusCode=0,schema=TTableSchema(columns=[TColumnDesc(columnName='database_name',typeDesc=TTypeDesc(types=[TTypeEntry(primitiveEntry=TPrimitiveTypeEntry(type=7,typeQualifiers=None),arrayEntry=None,mapEntry=None,structEntry=None,unionEntry=None,userDefinedTypeEntry=None)]),position=1,comment='from deserializer')]))
[2021-03-13 01:33:13.669 US/Pacific; DEBUG; impala.hiveserver2,get_result_schema,1306] get_result_schema: schema=[('database_name','STRING',None,None)]
[2021-03-13 01:33:13.670 US/Pacific; DEBUG; impala.hiveserver2,1031] FetchResults: req=TFetchResultsReq(operationHandle=TOperationHandle(operationId=THandleIdentifier(guid=b'\xd0*\xba\xe2lQO+\x9c2\xfe\xb8]\xb0\xd0\x94',modifiedRowCount=None),orientation=0,maxRows=1024)
[2021-03-13 01:33:13.670 US/Pacific; DEBUG; impala.hiveserver2,1006] Attempting to open transport (tries_left=3)
[2021-03-13 01:33:13.670 US/Pacific; DEBUG; impala.hiveserver2,1008] Transport opened
[2021-03-13 01:38:29.662 US/Pacific; ERROR; impala.hiveserver2,1016] Failed to open transport (tries_left=3)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/impala/hiveserver2.py",line 1010,in _execute
return func(request)
File "/usr/local/lib/python3.7/site-packages/thriftpy2/thrift.py",line 219,in _req
return self._recv(_api)
File "/usr/local/lib/python3.7/site-packages/thriftpy2/thrift.py",line 238,in _recv
result.read(self._iprot)
File "/usr/local/lib/python3.7/site-packages/thriftpy2/thrift.py",line 160,in read
iprot.read_struct(self)
File "/usr/local/lib/python3.7/site-packages/thriftpy2/protocol/binary.py",line 387,in read_struct
return read_struct(self.trans,obj,self.decode_response)
File "/usr/local/lib/python3.7/site-packages/thriftpy2/protocol/binary.py",line 316,in read_struct
read_val(inbuf,f_type,f_container_spec,decode_response))
File "/usr/local/lib/python3.7/site-packages/thriftpy2/protocol/binary.py",line 289,in read_val
read_struct(inbuf,decode_response)
File "/usr/local/lib/python3.7/site-packages/thriftpy2/protocol/binary.py",line 256,in read_val
result.append(read_val(inbuf,v_type,v_spec,line 230,in read_val
byte_payload = inbuf.read(sz)
File "/usr/local/lib/python3.7/site-packages/thrift_sasl/__init__.py",line 173,in read
self._read_frame()
File "/usr/local/lib/python3.7/site-packages/thrift_sasl/__init__.py",line 177,in _read_frame
header = self._trans_read_all(4)
File "/usr/local/lib/python3.7/site-packages/thrift_sasl/__init__.py",line 198,in _trans_read_all
return read_all(sz)
File "/usr/local/lib/python3.7/site-packages/thriftpy2/transport/socket.py",line 132,in read
message='TSocket read 0 bytes')
thriftpy2.transport.base.TTransportException: TTransportException(type=4,message='TSocket read 0 bytes')
[2021-03-13 01:38:29.664 US/Pacific; DEBUG; impala.hiveserver2,1019] Closing transport (tries_left=3)
[2021-03-13 01:38:29.664 US/Pacific; DEBUG; impala.hiveserver2,1006] Attempting to open transport (tries_left=2)
[2021-03-13 01:38:29.897 US/Pacific; DEBUG; impala.hiveserver2,1008] Transport opened
[2021-03-13 01:38:30.077 US/Pacific; DEBUG; impala.hiveserver2,1034] FetchResults: resp=TFetchResultsResp(status=TStatus(statusCode=3,infoMessages=['org.apache.hive.service.cli.HiveSQLException:Invalid OperationHandle: OperationHandle [opType=EXECUTE_STATEMENT,getHandleIdentifier()=d02abae2-6c51-4f2b-9c32-feb85db0d094]:12:11','org.apache.hive.service.cli.operation.OperationManager:getOperation:OperationManager.java:193','org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:558','org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:751','org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1717','org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1702','org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39','org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39','org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56','org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286','java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149','java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624','java.lang.Thread:run:Thread.java:748'],errorCode=0,errorMessage='Invalid OperationHandle: OperationHandle [opType=EXECUTE_STATEMENT,getHandleIdentifier()=d02abae2-6c51-4f2b-9c32-feb85db0d094]'),hasMoreRows=None,results=None)
[2021-03-13 01:38:30.077 US/Pacific; ERROR; vale.sql._hive,retry_execute,196] unknown error with impyla --- "impala.error.HiveServer2Error: Invalid OperationHandle: OperationHandle [opType=EXECUTE_STATEMENT,getHandleIdentifier()=d02abae2-6c51-4f2b-9c32-feb85db0d094]"
[2021-03-13 01:38:30.077 US/Pacific; INFO; impala.hiveserver2,close_operation,292] Closing active operation
[2021-03-13 01:38:30.077 US/Pacific; DEBUG; impala.hiveserver2,296] _reset_state: Resetting cursor state
[2021-03-13 01:38:30.077 US/Pacific; DEBUG; impala.hiveserver2,1031] CloseOperation: req=TCloseOperationReq(operationHandle=TOperationHandle(operationId=THandleIdentifier(guid=b'\xd0*\xba\xe2lQO+\x9c2\xfe\xb8]\xb0\xd0\x94',modifiedRowCount=None))
[2021-03-13 01:38:30.078 US/Pacific; DEBUG; impala.hiveserver2,1006] Attempting to open transport (tries_left=3)
[2021-03-13 01:38:30.078 US/Pacific; DEBUG; impala.hiveserver2,1008] Transport opened
[2021-03-13 01:41:00.712 US/Pacific; ERROR; impala.hiveserver2,1016] Failed to open transport (tries_left=3)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/vale/sql/_sql.py",line 254,in get_connection
yield conn
File "test_sql_hive_debug.py",line 27,in test_write_table
dbs = hive.get_databases()
File "/usr/local/lib/python3.7/site-packages/vale/sql/_hive.py",line 242,in get_databases
z = self.read('SHOW DATABASES').fetchall()
File "/usr/local/lib/python3.7/site-packages/vale/sql/_hive.py",line 239,in fetchall
return retry_execute(self.pyhive_client,super().fetchall)
File "/usr/local/lib/python3.7/site-packages/vale/sql/_hive.py",line 182,in retry_execute
return func(*args,**kwargs)
File "/usr/local/lib/python3.7/site-packages/vale/sql/_sql.py",line 141,in fetchall
return self._cursor.fetchall() # type: ignore
File "/usr/local/lib/python3.7/site-packages/impala/hiveserver2.py",line 536,in fetchall
return list(self)
File "/usr/local/lib/python3.7/site-packages/impala/hiveserver2.py",line 584,in next
convert_types=self.convert_types)
File "/usr/local/lib/python3.7/site-packages/impala/hiveserver2.py",line 1266,in fetch
resp = self._rpc('FetchResults',req)
File "/usr/local/lib/python3.7/site-packages/impala/hiveserver2.py",line 995,in _rpc
err_if_rpc_not_ok(response)
File "/usr/local/lib/python3.7/site-packages/impala/hiveserver2.py",line 749,in err_if_rpc_not_ok
raise HiveServer2Error(resp.status.errorMessage)
impala.error.HiveServer2Error: Invalid OperationHandle: OperationHandle [opType=EXECUTE_STATEMENT,getHandleIdentifier()=d02abae2-6c51-4f2b-9c32-feb85db0d094]

During handling of the above exception,another exception occurred:
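
Two details in the log are easy to miss and can be checked with a few lines of standard-library Python. This is a rough sketch for reading the log, not a diagnosis; the helper names `log_time` and `gap_seconds` are made up for illustration. It measures how long the `FetchResults` call waited before the socket read returned 0 bytes, and it decodes the `guid` bytes from the `ExecuteStatement` response to compare them with the handle named in the error message.

```python
import re
import uuid
from datetime import datetime

# Timestamp prefix of the log lines above, e.g. "[2021-03-13 01:33:13.670 US/Pacific; ..."
_TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")

def log_time(line):
    """Return the timestamp at the start of a log line as a datetime."""
    match = _TS.match(line)
    if match is None:
        raise ValueError("no timestamp prefix: %r" % line[:40])
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")

def gap_seconds(earlier, later):
    """Seconds elapsed between two log lines."""
    return (log_time(later) - log_time(earlier)).total_seconds()

# The last line logged before the FetchResults call blocked, and the first after.
fetch_sent = ("[2021-03-13 01:33:13.670 US/Pacific; DEBUG; "
              "impala.hiveserver2,1008] Transport opened")
fetch_failed = ("[2021-03-13 01:38:29.662 US/Pacific; ERROR; "
                "impala.hiveserver2,1016] Failed to open transport (tries_left=3)")
stall = gap_seconds(fetch_sent, fetch_failed)

# guid bytes copied from the ExecuteStatement response in the log.
guid = b'\xd0*\xba\xe2lQO+\x9c2\xfe\xb8]\xb0\xd0\x94'
handle = str(uuid.UUID(bytes=guid))

print("FetchResults waited %.3f s" % stall)
print("operation handle:", handle)
```

The gap comes out at a little over five minutes, and the decoded handle equals the `d02abae2-6c51-4f2b-9c32-feb85db0d094` in the error. So the client sent back the same handle it had been given, and the server rejected it only after the first connection had dropped and impyla had reopened the transport. Whether the server discarded the operation because of that disconnect is an inference that the log alone does not confirm.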
