Problem description
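I'm using Python to build a Discord bot for my gaming community, and I'm currently working on a command that returns a game's status (using this website). The full code for the command is here.
I would also like to include the graph that you can find on the website. I use BeautifulSoup to get the other values, and getting an image would be easy too. However, the graph is not an image; it is a canvas object, as used in JavaScript/HTML. I don't know how the math behind it works, but I can very easily convert it to an image "locally" by right-clicking it and copying the image.
My question is: how can I retrieve that canvas object as an image in my Python code?
When I Google this, I mostly get Tkinter results, which don't really help.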
Solution
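I'm not sure whether this is how you want to do it. Since the chart data is injected into the HTML in advance on the server side, the only way to get the data is to parse the script and convert the values into Python data types (preferably a pandas DataFrame, for easy plotting).
I wrote the following messy code, which may help you. I used PyJSParser to parse the script and extract the variable's values from it.
I've left some comments in the code. Please read them.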
from bs4 import BeautifulSoup as bs
import json
from pyjsparser import parse
import pandas as pd
import matplotlib.pyplot as plt

def parseScript(scriptContent):
    res = parse(scriptContent)
    df = pd.DataFrame(columns=['timestamp', 'report'])
    # This part is very tricky.
    # The parse tree is multiple layers deep, and there is no guarantee that
    # the server won't change the order, so we have to check every level
    # to make sure it is in fact what we want.
    # Uncomment the print statements to see what I mean by multiple levels deep.
    # It's a rabbit hole.
    for obj in res['body']:
        if obj['type'] == 'VariableDeclaration':
            for declaration in obj['declarations']:
                if declaration['type'] == 'VariableDeclarator':
                    if declaration['id']['name'] == 'data':
                        # print(declaration.keys())
                        # print(declaration['type'])
                        # print(declaration['id'])
                        # print(declaration['init'].keys())
                        # print(declaration['init']['type'])
                        # print(type(declaration['init']['properties']))
                        for subVar in declaration['init']['properties']:
                            # print(subVar.keys())
                            # print(subVar['type'])
                            # print(subVar['key'])
                            if subVar['key']['name'] == 'series':
                                # print(len(subVar['value']))
                                # print(type(subVar['value']))
                                # print(subVar['value'].keys())
                                # print(len(subVar['value']['elements']))
                                for element in subVar['value']['elements']:
                                    # print(type(element))
                                    # print(element.keys())
                                    # print(element['properties'][0].keys())
                                    timestamp = element['properties'][0]['value']['value']
                                    report = element['properties'][1]['value']['value']
                                    df.loc[len(df)] = [timestamp, report]
    return df

def scraper(soup):
    # first we must find the div in which the chart's script resides,
    # so we don't mistakenly take any other script from the page
    chartDiv = soup.find_all('div', attrs={'id': 'chart-row'})
    print(len(chartDiv))
    scriptContent = chartDiv[0].find_all('script')[0].string
    reportData = parseScript(scriptContent)
    return reportData

def plotData(df, duration=24):
    '''
    @param df dataframe built from the scraped web page's script
    @param duration number of HOURS of data to plot
    '''
    import datetime as dt
    import pytz
    # pre-process a bit:
    # convert the timestamp strings into datetime objects
    df['timestamp'] = pd.to_datetime(df['timestamp'])
    # the timezone is fixed by the source (UTC-4); the comparison below
    # assumes the scraped timestamps carry a UTC offset
    timeZone = pytz.FixedOffset(-240)
    df = df[df['timestamp'] >= (dt.datetime.now(timeZone) - dt.timedelta(hours=duration))]
    # total number of reports per hour of the day
    hourly = df.groupby(df['timestamp'].dt.hour)['report'].sum()
    hourly.plot()
    plt.show()

if __name__ == '__main__':
    with open('lala.html', 'rb') as file:
        soup = bs(file, 'html5lib')
        data = scraper(soup)
        plotData(data)
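
The nested loops in parseScript could also be written as a generic recursive search, which is less sensitive to the server reordering things. The following is only a sketch: `find_nodes`, `is_series_property`, `extract_series`, `point` and `sample_tree` are hypothetical names, and `sample_tree` is a hand-written, simplified stand-in for the dictionary the parser returns, not real output from the site. It also assumes the `series` key is written as a plain identifier rather than a quoted string.

```python
def find_nodes(node, predicate):
    """Yield every dict inside nested dicts/lists for which predicate(dict) is true."""
    if isinstance(node, dict):
        if predicate(node):
            yield node
        for value in node.values():
            yield from find_nodes(value, predicate)
    elif isinstance(node, list):
        for item in node:
            yield from find_nodes(item, predicate)


def is_series_property(node):
    # A "Property" node whose key is the identifier `series`.
    key = node.get("key")
    return (
        node.get("type") == "Property"
        and isinstance(key, dict)
        and key.get("name") == "series"
    )


def extract_series(tree):
    """Collect (timestamp, report) pairs from every `series` array in the tree."""
    rows = []
    for prop in find_nodes(tree, is_series_property):
        for element in prop["value"]["elements"]:
            props = element["properties"]
            rows.append((props[0]["value"]["value"], props[1]["value"]["value"]))
    return rows


def point(x, y):
    # Helper that builds one {x: ..., y: ...} object node for the sample tree.
    return {
        "type": "ObjectExpression",
        "properties": [
            {"type": "Property",
             "key": {"type": "Identifier", "name": "x"},
             "value": {"type": "Literal", "value": x}},
            {"type": "Property",
             "key": {"type": "Identifier", "name": "y"},
             "value": {"type": "Literal", "value": y}},
        ],
    }


# Made-up sample shaped like: var data = {series: [{x: ..., y: ...}, ...]};
sample_tree = {
    "type": "Program",
    "body": [{
        "type": "VariableDeclaration",
        "declarations": [{
            "type": "VariableDeclarator",
            "id": {"type": "Identifier", "name": "data"},
            "init": {
                "type": "ObjectExpression",
                "properties": [{
                    "type": "Property",
                    "key": {"type": "Identifier", "name": "series"},
                    "value": {
                        "type": "ArrayExpression",
                        "elements": [
                            point("2019-01-01T00:00:00", 3.0),
                            point("2019-01-01T00:05:00", 5.0),
                        ],
                    },
                }],
            },
        }],
    }],
}

rows = extract_series(sample_tree)
```

The resulting list of pairs can be passed to `pd.DataFrame(rows, columns=['timestamp', 'report'])` to get the same shape of DataFrame that parseScript builds.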
Install the following libraries:
- html5lib (for better HTML parsing)
- pandas
- matplotlib
- pyjsparser
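As for the graph itself, I think the styling is up to you. The scraper() function gives you a DataFrame that you can use to draw the graph.
I plotted the chart very simply, which may not be to your liking. Give it a try.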
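
Since the question asks for an image rather than a window on screen, one option is to render the plot into an in-memory PNG instead of calling plt.show(). This is a minimal sketch, not part of the code above: `render_png` is a hypothetical helper, and the numbers passed to it are made up for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def render_png(hours, reports):
    """Plot reports per hour and return the chart as a BytesIO holding PNG data."""
    fig, ax = plt.subplots()
    ax.plot(hours, reports)
    ax.set_xlabel("hour")
    ax.set_ylabel("reports")
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    plt.close(fig)  # release the figure so repeated calls don't pile up
    buffer.seek(0)  # rewind so the next reader starts at the first byte
    return buffer


# Made-up values; in practice these would come from the scraped DataFrame.
png = render_png([0, 1, 2, 3], [4, 1, 0, 7])
```

With discord.py, a buffer like this can probably be wrapped as `discord.File(png, filename="chart.png")` and sent with the message, but check that against the version of the library your bot uses.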