Execute a SQL file and return the results as a Pandas DataFrame

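Problem description

I have a complex SQL Server query that I want to execute from Python, returning the results as a Pandas DataFrame.

My database is read-only, so I don't have many options, such as the ones other answers suggest for making the query less complex.

This answer was helpful, but I keep getting TypeError: 'NoneType' object is not iterable

SQL example

This isn't the real query; it's only meant to show that I use temp tables. I'm using global temp tables because my query previously failed when I used local temp tables: See this question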

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

IF OBJECT_ID('tempdb..##temptable') IS NOT NULL DROP TABLE ##temptable
IF OBJECT_ID('tempdb..##results') IS NOT NULL DROP TABLE ##results

DECLARE @closing_period int = 0,@starting_period int = 0

Select col1,col2,col3 into ##temptable from readonlytables

Select * into ##results from ##temptable

Select * from ##results

Executing the query with pyodbc and pandas

import pandas as pd
import pyodbc

conn = pyodbc.connect('db connection details')

# Read the whole query file and hand its text to pandas
with open('myquery.sql','r') as sql:
    df = pd.read_sql_query(sql.read(),conn)
conn.close()

Result: full stack trace

TypeError                                 Traceback (most recent call last)
<ipython-input-38-4fcfe4123667> in <module>
      5 
      6 sql = open('sql/month_end_close_hp.sql','r')
----> 7 df = pd.read_sql_query(sql.read(),conn)
      8 #sql.close()
      9 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py in read_sql_query(sql,con,index_col,coerce_float,params,parse_dates,chunksize)
    330         coerce_float=coerce_float,
    331         parse_dates=parse_dates,
--> 332         chunksize=chunksize,
    333     )
    334 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py in read_query(self,sql,chunksize)
   1632         args = _convert_params(sql,params)
   1633         cursor = self.execute(*args)
-> 1634         columns = [col_desc[0] for col_desc in cursor.description]
   1635 
   1636         if chunksize is not None:

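TypeError: 'NoneType' object is not iterable

When I run the query directly in the database, I get the expected results. If I pass the query as a string, I also get the expected results:

Query as a string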

conn = pyodbc.connect('db connection details')

sql = '''
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

IF OBJECT_ID('tempdb..##temptable') IS NOT NULL DROP TABLE ##temptable
IF OBJECT_ID('tempdb..##results') IS NOT NULL DROP TABLE ##results

DECLARE @closing_period int = 0,@starting_period int = 0

Select col1,col2,col3 into ##temptable from readonlytables

Select * into ##results from ##temptable

Select * from ##results
'''

df = pd.read_sql(sql,conn)

conn.close()

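I think this might be related to the single quotes in the query?

Solution

I got it working.

I had to use global variables, replacing @ with @@, for the query to work as expected.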
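One way to test that hypothesis is to compare the text read from the file with the inline string, since any difference in quote characters (or a stray byte-order mark) would show up as a changed line. The following is a minimal sketch using only the standard library; the helper name diff_queries is hypothetical and not part of pandas or pyodbc.

```python
import difflib


def diff_queries(file_text, inline_text):
    """Return unified-diff lines between two query texts.

    An empty list means the two texts are identical line by line.
    """
    return list(
        difflib.unified_diff(
            file_text.splitlines(),
            inline_text.splitlines(),
            fromfile="myquery.sql",
            tofile="inline string",
            lineterm="",
        )
    )
```

Calling diff_queries(open('myquery.sql').read(), sql) and printing each returned line would show exactly where the two versions of the query differ, if they differ at all.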

DECLARE @@closing_period int = 0,@@starting_period int = 0

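Update: my ODBC driver was out of date. After updating to the latest version, I no longer needed global temp tables or variables, and the query ran noticeably faster.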
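For anyone who cannot update the driver: the traceback fails where pandas iterates over cursor.description, which is None when the statement the cursor is positioned on returned no rows. A commonly used workaround (not the fix I applied) is to execute the batch with a plain DB-API cursor and call nextset() until a result set with a description appears. The sketch below assumes a pyodbc-style cursor; first_result_set and read_sql_file are hypothetical helper names.

```python
def first_result_set(cursor):
    """Advance a DB-API cursor to the first statement that returned rows.

    Returns (columns, rows). Raises ValueError if the batch has no result set.
    """
    while cursor.description is None:
        # nextset() returns a falsy value once no further result sets exist
        if not cursor.nextset():
            raise ValueError("the batch produced no result set")
    columns = [col[0] for col in cursor.description]
    rows = [tuple(row) for row in cursor.fetchall()]
    return columns, rows


def read_sql_file(path, conn):
    """Hypothetical helper: run the SQL in `path` and return a DataFrame."""
    # Imported lazily so first_result_set can be used without pandas
    import pandas as pd

    with open(path, "r") as f:
        query = f.read()
    cursor = conn.cursor()
    cursor.execute(query)
    columns, rows = first_result_set(cursor)
    return pd.DataFrame.from_records(rows, columns=columns)
```

With an open pyodbc connection, df = read_sql_file('myquery.sql', conn) would then replace the pd.read_sql_query call. Whether this helps depends on why the driver reports an empty first result, so treat it as something to try rather than a guaranteed fix.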

Tags: pandas, pyodbc, python, sql-server