Splitting a PDF with pdftk

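Problem description

I am searching for a keyword across multiple PDFs in a folder. If the keyword is found, I want to split the PDF at the page where it appears and save the result as a new PDF. Code: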

Add-Type -Path '...\itextsharp.5.5.13.1 (1)\lib\itextsharp.dll'
$pdfs = gci "C:\Users\..\Plan\" *.pdf
$keywords = "TEST"
$pdftk = "C:\Program Files (x86)\PDFtk\bin\pdftk.exe"
$output = "C:\Users\...\new"
$newpdf = New-Object -TypeName psobject 
foreach($pdf in $pdfs) {

    Write-Host "processing -" $pdf.FullName

    # prepare the pdf
    $reader = New-Object iTextSharp.text.pdf.pdfreader -ArgumentList $pdf.FullName

    # for each page
    for($page = 1; $page -le $reader.NumberOfPages; $page++) {

        # set the page text
        $pageText = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader,$page).Split([char]0x000A)

        # if the page text contains keyword
        if($pageText -match $keywords) {
            break
        }
    }

    #$reader.Close()
    $FirstPage = $page
    $LastPage = $reader.NumberOfPages

    Write-Host "Starting page is: " $FirstPage
    Write-Host "Last page is: " $LastPage

    & $pdftk $pdf.FullName cat $FirstPage-end output "$output\test.pdf"
}

Solution

You are missing the output keyword. Use

& $pdftk $pdf.FullName cat $FirstPage-end output "$output\test.pdf"

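The reason for this strange error is that, without the output keyword, pdftk reads the output path as one more page range for cat, so the leading C of the drive letter is interpreted as a handle, i.e. a pointer to a previously named input file.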
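
The fix above can be sketched as a small helper that always builds the pdftk arguments in the right order, so the output keyword cannot be left out. This is only an illustration: the function name Get-PdftkSplitArgs and its parameters are hypothetical and are not part of pdftk or of the original script, and the file names in the demo call are placeholders.

```powershell
# Hypothetical helper (not part of pdftk): returns the argument list for
#   pdftk <input> cat <first>-end output <file>
function Get-PdftkSplitArgs {
    param(
        [Parameter(Mandatory)][string]$InputPath,
        [Parameter(Mandatory)][int]$FirstPage,
        [Parameter(Mandatory)][string]$OutputPath
    )
    # "${FirstPage}-end" is a pdftk page range: from $FirstPage to the last page
    @($InputPath, 'cat', "${FirstPage}-end", 'output', $OutputPath)
}

# Demo with placeholder values; no PDF or pdftk installation is needed here
$splitArgs = Get-PdftkSplitArgs -InputPath 'in.pdf' -FirstPage 3 -OutputPath 'out.pdf'
$splitArgs -join ' '   # in.pdf cat 3-end output out.pdf
```

Inside the foreach loop from the question, the list could then be passed with array splatting, as in & $pdftk @splitArgs, which hands each element to pdftk as a separate argument. Two further points may be worth handling there, both suggestions rather than part of the original answer: when the keyword is never found, the for loop finishes with $page equal to NumberOfPages + 1, so the call should probably be skipped for that file; and because every file is written to the same test.pdf, a per-file output name (for example one derived from $pdf.BaseName) would avoid overwriting earlier results.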
