问题描述
我正在尝试使用PyMuPDF将多页PDF文件转换为图像:
<?PHP
$str = '["userdomain.ltd"],["test.com"]';
$arr = explode(',',$str);
foreach ($arr as $el) {
echo json_decode($el)[0];
echo "\n";
}
但是我需要在多页tiff中将PDF文件的所有页面转换为单个图像,当我给page参数设置页面范围时,它只占用一页,有人知道我能做到吗?
解决方法
import fitz
from PIL import Image
input_pdf = "input.pdf"
output_name = "output.tif"
compression = 'zip' # "zip","lzw","group4" - need binarized image...
zoom = 2 # to increase the resolution
mat = fitz.Matrix(zoom,zoom)
doc = fitz.open(input_pdf)
image_list = []
for page in doc:
pix = page.getPixmap(matrix = mat)
img = Image.frombytes("RGB",[pix.width,pix.height],pix.samples)
image_list.append(img)
if image_list:
image_list[0].save(
output_name,save_all=True,append_images=image_list[1:],compression=compression,dpi=(300,300),)
,
要转换PDF的所有页面时,需要一个for循环。另外,在调用.getPixmap()
时,您需要matrix = mat
之类的属性来基本上提高分辨率。这是代码段(不确定这是否是您想要的,但这会将所有PDF转换为图像):
doc = fitz.open(pdf_file)
zoom = 2 # to increase the resolution
mat = fitz.Matrix(zoom,zoom)
noOfPages = doc.pageCount
image_folder = '/path/to/where/to/save/your/images'
for pageNo in range(noOfPages):
page = doc.loadPage(pageNo) #number of page
pix = page.getPixmap(matrix = mat)
output = image_folder + str(pageNo) + '.jpg' # you could change image format accordingly
pix.writePNG(output)
print('Converting PDFs to Image ... ' + output)
# do your things afterwards
为了解决问题,请从Github上here is a good example进行演示,以了解其含义以及在需要时如何将其用于您的案件。