Problem description
I cannot change this code:
%matplotlib inline
from collections import Counter,defaultdict,OrderedDict
from bs4 import BeautifulSoup
import os
from tqdm import tqdm_notebook
import glob
import nltk
import zipfile
import math
import pandas as pd
import sys
import itertools
def loadShakespeare():
    if 'shaks200.zip' in os.listdir():
        return 'shaks200.zip'
    elif os.path.exists('../../data/Week1/'):
        return '../../data/Week1/shaks200.zip'
    elif os.path.exists('../../../data/Week1/'):
        return '../../../data/Week1/shaks200.zip'
I can change the following code:
def index_collection(shaks200):
    # With zipfile we can read the file without opening the zip file
    archive = zipfile.ZipFile('shaks200.zip','r')
    namelist = [x for x in archive.namelist() if '.xml' in x]
    MyIndex = defaultdict(lambda: defaultdict(int)) # initialize MyIndex
    for infile in tqdm_notebook(namelist): # loop over each file
        f = archive.open(infile)
    return MyIndex
%time Shakespeare = index_collection(loadShakespeare())
Shakespeare['the'],Shakespeare['witch']
This code gives me FileNotFoundError: [Errno 2] No such file or directory: 'shaks200.zip'
The file is located at C:\Users\joris\Desktop\Zoekmachines\IR0_2020_Student_Repo\IR0_2020_Student_Repo\Data\Week1
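Solution
I think you need to pass the absolute path of the file. When you pass a bare file name or any other relative path, Python resolves it against the current working directory, and I suspect that is the cause here: 'shaks200.zip' is not in the directory the notebook is running from.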
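One more thing that may matter: index_collection in the question opens the literal name 'shaks200.zip' and never uses its shaks200 argument, so whatever path loadShakespeare() returns is ignored. Below is a minimal sketch of the editable function that opens the path it is given and resolves it to an absolute path first. The parameter name zip_path and the explicit existence check are my own additions, the progress bar is left out to keep the sketch self-contained, and the per-file indexing is left as a placeholder because the question does not show it.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def index_collection(zip_path):
    """Open the archive at zip_path instead of a hard-coded file name."""
    # Resolve a relative path against the current working directory, so an
    # error message shows the full location Python actually looked at.
    zip_path = Path(zip_path).resolve()
    if not zip_path.is_file():
        raise FileNotFoundError(f"Archive not found: {zip_path}")

    my_index = defaultdict(lambda: defaultdict(int))  # initialize the index
    with zipfile.ZipFile(zip_path, "r") as archive:
        # Keep only the XML members of the archive.
        namelist = [name for name in archive.namelist() if name.endswith(".xml")]
        for infile in namelist:  # loop over each file
            with archive.open(infile) as f:
                pass  # placeholder: the indexing step from the assignment goes here
    return my_index


# Usage, assuming the archive is named shaks200.zip inside the folder from the question:
# Shakespeare = index_collection(
#     r"C:\Users\joris\Desktop\Zoekmachines\IR0_2020_Student_Repo"
#     r"\IR0_2020_Student_Repo\Data\Week1\shaks200.zip"
# )
```

With this change the original call index_collection(loadShakespeare()) should also work whenever loadShakespeare() finds one of its candidate locations. If it finds none, it returns None, and printing os.getcwd() in the notebook shows which directory the relative paths are being resolved against.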