Connection problems when reading Hive tables in HDInsight with Python

Problem description

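Hi all. I want to connect to a Hive database in HDInsight using Python. I have followed several blogs and a few Stack Overflow posts, but with no luck. Below are my attempts with the pyhive and JayDeBeApi libraries.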

Using JayDeBeApi

I have added the hive-jdbc-1.2.1, httpclient-4.4 and httpcore-4.4.4 jars to the current working directory, and I have installed Thrift with pip install thrift. The code snippet is:

import jaydebeapi

conn = jaydebeapi.connect("org.apache.hive.jdbc.HiveDriver","jdbc:hive2://shaktiman.database.windows.net:443/;ssl=true;transportMode=http;httpPath=/hive2",['admin','Abcdeertyoiu@1234'],"hive-jdbc-1.2.1.jar")

cursor = conn.cursor()
cursor.execute("select * from default.hivesampletable limit 50")
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()

But I get the following error:

Traceback (most recent call last):
  File "ClassLoader.java",line 357,in java.lang.ClassLoader.loadClass
  File "Launcher.java",line 349,in sun.misc.Launcher$AppClassLoader.loadClass
  File "ClassLoader.java",line 424,in java.lang.ClassLoader.loadClass
  File "URLClassLoader.java",line 382,in java.net.URLClassLoader.findClass
java.lang.ClassNotFoundException: java.lang.ClassNotFoundException: org.apache.hive.service.cli.thrift.TCLIService$Iface

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "org.jpype.JPypeContext.java",line 330,in org.jpype.JPypeContext.callMethod
  File "Method.java",line 498,in java.lang.reflect.Method.invoke
  File "DelegatingMethodAccessorImpl.java",line 43,in sun.reflect.DelegatingMethodAccessorImpl.invoke
  File "NativeMethodAccessorImpl.java",line 62,in sun.reflect.NativeMethodAccessorImpl.invoke
  File "NativeMethodAccessorImpl.java",line -2,in sun.reflect.NativeMethodAccessorImpl.invoke0
  File "DriverManager.java",line 247,in java.sql.DriverManager.getConnection
  File "DriverManager.java",line 664,in java.sql.DriverManager.getConnection
  File "HiveDriver.java",line 105,in org.apache.hive.jdbc.HiveDriver.connect
Exception: Java Exception

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test.py",line 39,in <module>
    "hive-jdbc-1.2.1.jar")
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\jaydebeapi\__init__.py",line 412,in connect
    jconn = _jdbc_connect(jclassname,url,driver_args,jars,libs)
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\jaydebeapi\__init__.py",line 230,in _jdbc_connect_jpype
    return jpype.java.sql.DriverManager.getConnection(url,*dargs)
java.lang.NoClassDefFoundError: java.lang.NoClassDefFoundError: org/apache/hive/service/cli/thrift/TCLIService$Iface

I am not sure what the problem is.
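
One possible cause of the NoClassDefFoundError above is that only hive-jdbc-1.2.1.jar was passed to jaydebeapi.connect, so classes that live in other Hive jars are not on the JVM classpath. The sketch below is an assumption, not a confirmed fix: it collects every .jar in a directory and passes the whole list as the jars argument. The helper names collect_jars and connect_with_all_jars are hypothetical, and which jars are actually required (for example a hive-service jar or the standalone Hive JDBC jar) is not stated in the original post.

```python
import glob
import os


def collect_jars(jar_dir="."):
    """Return a sorted list of absolute paths to every .jar file in jar_dir."""
    pattern = os.path.join(jar_dir, "*.jar")
    return sorted(os.path.abspath(p) for p in glob.glob(pattern))


def connect_with_all_jars(url, user, password, jar_dir="."):
    """Open a JayDeBeApi connection with every jar in jar_dir on the classpath.

    Hypothetical helper: it only widens the classpath, it does not guarantee
    that the directory contains all jars the Hive driver needs.
    """
    import jaydebeapi  # imported lazily so collect_jars works without it

    jars = collect_jars(jar_dir)
    if not jars:
        raise FileNotFoundError("no .jar files found in " + os.path.abspath(jar_dir))
    return jaydebeapi.connect(
        "org.apache.hive.jdbc.HiveDriver", url, [user, password], jars
    )
```

Printing collect_jars(".") before connecting is a quick way to see which jars the JVM will actually receive.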

I have also tried PyHive, as follows:

from pyhive import hive
conn = hive.connect('hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net',port=10000,auth='NOSASL')
cursor = conn.cursor()
cursor.execute('SHOW DATABASES')
cursor.fetchall()

But I still have a problem:

"D:\Learning Dir\PycharmProjects\Python\venv\Scripts\python.exe" "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test2.py"
Failed to resolve sockaddr for hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net:10000
Traceback (most recent call last):
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py",line 99,in open
    addrs = self._resolveAddr()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py",line 42,in _resolveAddr
    socket.AI_PASSIVE | socket.AI_ADDRCONFIG)
  File "D:\Installation\Python\python38-32\lib\socket.py",line 752,in getaddrinfo
    for res in _socket.getaddrinfo(host,port,family,type,proto,flags):
socket.gaierror: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test2.py",line 2,in <module>
    conn = hive.connect('hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net',auth='NOSASL')
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\pyhive\hive.py",line 94,in connect
    return Connection(*args,**kwargs)
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\pyhive\hive.py",line 192,in __init__
    self._transport.open()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TTransport.py",line 155,in open
    return self.__trans.open()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py",line 103,in open
    raise TTransportException(type=TTransportException.NOT_OPEN,message=msg,inner=gai)
thrift.transport.TTransport.TTransportException: Failed to resolve sockaddr for hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net:10000

In addition, a few blogs suggested changing the HiveServer2 transport mode from "http" to "binary". I tried that, but it did not help either.

I would appreciate it if someone could suggest a working code solution. Thanks in advance.

Solution

This looks like a configuration/network problem to me.

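  1. Can you verify connectivity from your host (the one submitting the application) to the HDI cluster? (Ignore this if you are submitting from the HDI head node.) Try using the IP address here instead of hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net. You can get the IP address by running curl ifconfig.me in the HDI cluster.
  2. Also try using telnet to check that the port is not in use anywhere else. Try 10001.
  3. Try changing the value of hive.server2.transport.mode to http in Ambari.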
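
The first two checks can also be run from Python on the client machine. This is a minimal sketch using only the standard library; the function name check_endpoint and the 5-second timeout are assumptions made for illustration, not part of the original answer. It reports whether the host name resolves and whether a TCP connection to the port can be opened.

```python
import socket


def check_endpoint(host, port, timeout=5.0):
    """Return (resolved_ip, reachable) for host:port.

    resolved_ip is None when the host name cannot be resolved, which is
    the condition behind "getaddrinfo failed" in the traceback above.
    """
    try:
        ip = socket.gethostbyname(host)
    except socket.gaierror:
        return None, False
    try:
        # Opening and immediately closing a TCP connection is enough to
        # show that something is listening on the port.
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        return ip, False
```

For example, calling check_endpoint with the head node host name and port 10000, and then again with port 10001, shows whether the failure is in name resolution or in reaching the port. A result of (None, False) suggests the internal host name is not resolvable from the client, in which case the IP address from step 1 can be passed instead.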
