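Problem description
Authentication against my SharePoint site succeeds, but pandas cannot read the xlsx file (which is held as a bytes object).
I get the error: "ValueError: File is not a recognized excel file"
Code: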
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File
import io
import pandas as pd
# target url taken from sharepoint and credentials
url = 'https://**[company-name]**-my.sharepoint.com/:x:/p/**[email-prefix]**/EYSZCv_Su0tBkarOa5ggMfsB-5DAB-FY8a0-IKukCIaPOw?e=iW2K6r'  # this is just the link you get when clicking "copy link" on sharepoint
username = '...'
password = '...'

ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
    ctx = ClientContext(url, ctx_auth)
    web = ctx.web
    ctx.load(web)
    ctx.execute_query()
    print("Authentication successful")

response = File.open_binary(ctx, url)

# save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0)  # set file object to start

# read excel file and each sheet into pandas dataframe
df = pd.read_excel(bytes_file_obj)
df
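Any ideas about what might be going wrong here?
Solution
I ran into the same error (which is how I ended up on this page).
I was able to fix it by changing the URL.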
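One way to see why the URL matters is to inspect the bytes that come back before handing them to pandas. An .xlsx file is a ZIP-based Office Open XML package, so it should open as a ZIP archive and contain a [Content_Types].xml entry. My guess (not confirmed anywhere in this thread) is that the "copy link" sharing URL returns an HTML page or an error body instead of the workbook. A minimal sketch using only the standard library; describe_payload is a hypothetical helper name:

```python
import io
import zipfile


def describe_payload(content: bytes) -> str:
    """Roughly classify downloaded bytes as 'ooxml', 'html', 'json' or 'unknown'."""
    buffer = io.BytesIO(content)
    # .xlsx files are ZIP archives that carry a [Content_Types].xml entry.
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            if "[Content_Types].xml" in archive.namelist():
                return "ooxml"
    # Otherwise peek at the first bytes to guess what was returned instead.
    head = content[:200].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    if head.startswith(b"{"):
        return "json"
    return "unknown"
```

In the question's code this could be called as describe_payload(response.content) right before pd.read_excel. Anything other than "ooxml" suggests the URL is not returning the workbook itself.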
Use the file path instead (obtained via "Copy path" from the opened Excel file); that may work for you as well.
Example:
url = 'https://**[company-name]**-my.sharepoint.com/personal/**[email-prefix]**/Documents/filename.xlsx?web=1'
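If the full URL still gives trouble, it may help to pass only the path part. As far as I can tell, the second parameter of File.open_binary is named server_relative_url in the library, but that is worth checking against your installed version. The sketch below splits a direct file URL into a site URL and a server-relative path. It assumes the /personal/<user>/... layout shown in the example above; split_onedrive_url is a hypothetical helper, and the contoso host and user in the usage line are placeholders:

```python
from urllib.parse import unquote, urlsplit


def split_onedrive_url(file_url: str) -> tuple:
    """Split a direct file URL into (site_url, server_relative_path)."""
    parts = urlsplit(file_url)
    # Drop the query string (such as ?web=1) and decode %20-style escapes.
    path = unquote(parts.path)
    segments = path.strip("/").split("/")
    # Assumption: the site root is the first two segments, /personal/<user>.
    site_url = f"{parts.scheme}://{parts.netloc}/" + "/".join(segments[:2])
    return site_url, path


site, relative_path = split_onedrive_url(
    "https://contoso-my.sharepoint.com/personal/jane_contoso_com/Documents/filename.xlsx?web=1"
)
```

With these two values, site would go to AuthenticationContext and ClientContext, and relative_path to File.open_binary, in place of the single url used in the question.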