问题描述
我正在使用 github 项目将网页下载为 html 并另存为 pdf。
我正在调用使用 phantom js 来执行此操作的 javascript 文件。其他页面工作正常,但我在 atlassian 上的目标网址不起作用。我得到一个没有数据的空白 pdf,当我另存为 html 时,它是一个空白网页
有人知道我为什么或如何解决这个问题吗?
目标网址是 https://blackdogs.atlassian.net/wiki/spaces/DOR/overview/
回购链接是 https://github.com/morninj/web2pdf
screenhot.js
var page = require('webpage').create()
var system = require('system')
var address = system.args[1];
var output = system.args[2];
var fs = require('fs');
page.viewportSize = { width: 1280,height: 702 }; // Default on my 13" MacBook
page.customHeaders = {
"Connection": "keep-alive"
};
page.open(address,function (status) {
if (status !== 'success') {
console.log('Unable to load the address!');
fs.write('1.html',page.content,'w');
phantom.exit();
} else {
window.setTimeout(function ()
page.render(output);
fs.write('1.html','w');
phantom.exit();
},10000);
}
});
web2pdf.py
#!/usr/bin/env python
from subprocess import call
import argparse
import sys
import os
import time
parser = argparse.ArgumentParser()
parser.add_argument('-u','--url',help='The URL of a single page ' \
+ 'to download.')
parser.add_argument('-f','--filename',help='The name of a file containing ' \
+ 'multiple URLs to download. Put each URL on a new line.')
parser.add_argument('-o','--output',help='If you\'re archiving just one ' \
+ 'page,this is the name of the output file. The default is ' \
+ 'archive.pdf.')
parser.add_argument('-d','--directory',help='The name of the directory ' \
+ 'to store multiple archives. The default is "archives."')
args = parser.parse_args()
def web2pdf():
if args.url and not args.filename:
# Save a single URL
output_filename = 'archive.pdf'
if args.output: output_filename = args.output
time.sleep(4)
make_screenshot(args.url,output_filename)
elif args.filename and not args.url:
# Save multiple URLs
# Create the archives directory
archives_directory = 'archives'
if args.directory: archives_directory = args.directory
call(['mkdir',archives_directory])
# Process each line in the input file
with open(args.filename) as f:
counter = 0
for line in f:
print ('Archiving %s...' % line.strip())
# Generate filenames: 01.pdf,02.pdf,...,99.pdf
counter = counter + 1
output_filename = str(counter)
if counter < 10: output_filename = '0' + output_filename
output_filename = archives_directory + '/' \
+ output_filename + '.pdf'
make_screenshot(line.strip(),output_filename)
else:
# No URL or list of URLs provided
print ('Please give either a URL or a filename containing a list of URLs.')
sys.exit()
print ('Done.')
def make_screenshot(url,filename):
# Call PhantomJS and suppress its output
with open(os.devnull,"w") as fnull:
call(
['phantomjs',os.path.dirname(os.path.abspath(__file__)) + '/screenshot.js',url,filename],stdout=fnull,stde
解决方法
暂无找到可以解决该程序问题的有效方法,小编努力寻找整理中!
如果你已经找到好的解决方法,欢迎将解决方案带上本链接一起发送给小编。
小编邮箱:dio#foxmail.com (将#修改为@)