How do I stop an Excel sheet from being overwritten? I want all the data in a single sheet

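Problem description

When data is written (in a for loop) to a single Excel sheet, each write overwrites what is already in the sheet. To stop it from being overwritten, I have to split the newly collected data off into separate sheets. (pandas)

So what should I do?

Code below: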

ih = input('pages: ')
def test():    
    for page in range(1,int(ih)):
        req = requests.get(url + str(page))
        soup = BeautifulSoup(req.content,'html.parser')
        g_data = soup.find_all('span',{"class": "b-card b-card-mod-h vehicle"})
        g_price = soup.find_all('div',{"class": "b-card--el-vehicle-price"})
        g_mile = soup.find_all('p',{"class": "b-card--el-brief-details"})
        g_name = soup.find_all('p',{"class": "b-card--el-description"})
        g_user = soup.find_all('a',{"class": "b-card--el-agency-title"})
        g_link = soup.find_all('div',{"class": "b-card--el-inner-wrapper"})
        m_price = [item.text for item in g_price]
        m_mile = [item.text for item in g_mile]
        m_user = [item.text for item in g_user]
        m_name = [item.text for item in g_name]
        m_link = [item.a["href"] for item in g_link]
        m_extensions = [('') for item in g_link]
        l1 = m_name
        l2 = m_mile
        l3 = m_price
        l4 = m_user
        l5 = m_link
        l6 = m_extensions
        s1 = pd.Series(l1,name='Vehicle Name')
        s2 = pd.Series(l2,name='Mileage')
        s3 = pd.Series(l3,name='Price')
        s4 = pd.Series(l4,name='User')
        s5 = pd.Series(l5,name='Link')
        s6 = pd.Series(l6,name='Site')
        df = pd.concat([s1,s2,s3,s4,s6+s5],axis=1)
        if(os.path.isfile('hello_world.xlsx')):
            sheet.write(df)
            workbook.close()
        else:
            sheet.write('hello_world.xlsx',index= False)
            workbook.close()
        print(f'[+]Writing Data from page ' + str(page))
        ctypes.windll.kernel32.SetConsoleTitleW('[+]Writing Data from page ' + str(page))
    print('[=]Written Data')
# Write the data.
test()

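Thanks to anyone who can help!

Solution

You can use openpyxl to find the last row of the sheet, then use the DataFrame to_excel method with startrow to write the new data starting at that row. Note that writer.sheets must be set to the workbook's existing sheets; otherwise the workbook is cleared before it is saved.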
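To make the row arithmetic concrete, here is a minimal sketch of how the last row is found. The workbook contents are made up for illustration; only `max_row` and the sheet name "Sheet1" come from the answer. openpyxl counts rows from 1, while the `startrow` argument of to_excel counts from 0, so passing `max_row` as `startrow` should land on the first empty row.

```python
import openpyxl

# Build a tiny workbook in memory: one header row plus two data rows.
# (Illustrative data only; a real workbook would be loaded from disk.)
wb = openpyxl.Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["Vehicle Name", "Price"])
ws.append(["Car A", "1000"])
ws.append(["Car B", "2000"])

# max_row is the 1-based index of the last row that holds data.
last_row = ws.max_row

# to_excel's startrow is 0-based, so startrow=last_row corresponds to
# the 1-based row last_row + 1, which is the first empty row.
next_row_1_based = last_row + 1
```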

Add this method to your code:

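def AppendExcel(df,filename):
    # Note: assigning writer.book / writer.sheets and calling writer.save()
    # works on older pandas releases; newer ones (2.0+, as far as I know)
    # no longer allow it.
    import os
    import openpyxl
    import pandas as pd
    sheetname = "Sheet1"
    if not os.path.isfile(filename):  # create new file, including the header row
        df.to_excel(filename,startrow=0,index=False,sheet_name=sheetname)
    else:  # append below the existing rows
        wb = openpyxl.load_workbook(filename)
        writer = pd.ExcelWriter(filename,engine='openpyxl')
        writer.book = wb
        writer.sheets = dict((ws.title,ws) for ws in wb.worksheets) # need this to prevent overwrite
        lastrow = wb[sheetname].max_row  # 1-based last used row == 0-based first empty row
        # index=False keeps the columns aligned with the rows written on file creation
        df.to_excel(writer,startrow=lastrow,header=False,index=False,sheet_name=sheetname)
        writer.save()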

Then replace the if/else block that writes the file with this:

AppendExcel(df,'hello_world.xlsx')

This code is untested, so you may need to tweak it a bit.
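If you are on a recent pandas release, the same idea can probably be written without touching writer.book or writer.sheets by hand. The following is a sketch under assumptions, not the method from the answer above: it assumes pandas 1.4 or later (for `mode="a"` combined with `if_sheet_exists="overlay"`), that openpyxl is installed, and that the target sheet already exists in the file. The function name `append_excel` and its `sheet_name` parameter are made up for this example.

```python
import os

import pandas as pd


def append_excel(df, filename, sheet_name="Sheet1"):
    """Append df below the existing rows of sheet_name (hypothetical helper)."""
    if not os.path.isfile(filename):
        # First call: create the file and write the header row.
        df.to_excel(filename, sheet_name=sheet_name, index=False)
        return

    # Append mode loads the existing workbook; "overlay" writes into the
    # existing sheet instead of replacing it or creating a new one.
    with pd.ExcelWriter(
        filename, engine="openpyxl", mode="a", if_sheet_exists="overlay"
    ) as writer:
        # Assumes sheet_name is already present in the workbook.
        start = writer.sheets[sheet_name].max_row
        df.to_excel(
            writer,
            sheet_name=sheet_name,
            startrow=start,
            header=False,
            index=False,
        )
```

It would be called the same way inside the page loop, for example `append_excel(df, 'hello_world.xlsx')`. The context manager saves and closes the file on exit, so no separate save call is needed.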