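java - How to split a PDF at given intervals

Tags: java pdfbox

My main intention is that my code should take a PDF and intervals as input. With the sample input 2,6, my program should divide the PDF into 3 parts: pages 1 and 2 as one PDF, pages 3, 4, 5, 6 as another PDF, and the remaining pages as one more PDF (if there are any extra pages). I am not getting the desired output! Here is the code I wrote: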

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PDFSplitter {

    public List<String> split(String fileName, String password, int[] splitIndices) throws IOException {

        //Loading an existing PDF document
        File file = new File(fileName);
        PDDocument document = null;
        if (password != null && !password.trim().equals("")) {
            document = PDDocument.load(file, password);
        } else {
            document = PDDocument.load(file);
        }

        //Instantiating Splitter class

        //splitting the pages of a PDF document

        List<PDDocument> splitDocs = new ArrayList<PDDocument>();
        int startPage = 0;
        for (int splitIdex : splitIndices) {
            Splitter splitter = new Splitter();
            splitter.setStartPage(startPage);
            splitter.setSplitAtPage(splitIdex +startPage);
            splitter.setEndPage(splitIdex+1);
            List<PDDocument> documents = splitter.split(document);
            splitDocs.addAll(documents);
            startPage = splitIdex + 1;
        }
        if(startPage <= document.getNumberOfPages())
        {
            Splitter splitter = new Splitter();
            splitter.setStartPage(startPage);
            splitter.setSplitAtPage(document.getNumberOfPages() - startPage);
            splitter.setEndPage(document.getNumberOfPages());
            List<PDDocument> documents = splitter.split(document);
            splitDocs.addAll(documents);
        }

        List<String> splitFileNames = new ArrayList<String>();
        for (PDDocument splitDoc : splitDocs) {
            String fileName1 = fileName.substring(0, fileName.indexOf(".PDF")) + splitDocs.indexOf(splitDoc) + ".pdf";
            splitDoc.save(fileName1);
            splitFileNames.add(fileName1);
        }
        document.close();
        return splitFileNames;
    }

    public static void main(String[] args) throws IOException {
        PDFSplitter splitter = new PDFSplitter();
        int[] pages = {3,5};
        List<String> splitFileNames = splitter.split("C:\\Users\\RSk\\Desktop\\rsk.pdf","", pages);

        System.out.println("splitFileNames = " + splitFileNames);
    }
}

Best answer

My main intention is my code should take pdf and intervals as input, I'm taking sample inputs as 2,6, where my program should divide pdf into 3 parts, i.e. 1,2 pages as one pdf. 3,4,5,6 as other pdf and remaining pages into one pdf (if there is any extra page).

This is easiest to do by customizing Splitter and overriding its splitAtPage method:

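import java.util.Arrays;

import org.apache.pdfbox.multipdf.Splitter;

public class CustomSplitter extends Splitter {
    // Split points, e.g. {2, 6} to split after pages 2 and 6. The array must be
    // sorted ascending because Arrays.binarySearch requires sorted input.
    final int[] splitIndices;

    public CustomSplitter(int[] splitIndices) {
        this.splitIndices = splitIndices;
    }

    @Override
    protected boolean splitAtPage(int pageNumber) {
        return Arrays.binarySearch(splitIndices, pageNumber) >= 0;
    }
}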

(CustomSplitter class)

Now you can split the document at the given pages like this:
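One detail worth illustrating: the CustomSplitter above hands splitIndices directly to Arrays.binarySearch, which only gives reliable results on a sorted array. The following is a minimal sketch of one way to guard against unsorted input. The class SplitIndices and its method contains are hypothetical names introduced for this illustration; they are not part of PDFBox or of the original answer.

```java
import java.util.Arrays;

// Hypothetical helper (not part of PDFBox): keeps a sorted copy of the split
// indices so that Arrays.binarySearch can be used safely.
final class SplitIndices {
    private final int[] sorted;

    SplitIndices(int[] splitIndices) {
        // Copy first so the caller's array is left untouched, then sort.
        this.sorted = splitIndices.clone();
        Arrays.sort(this.sorted);
    }

    // Returns true if pageNumber is one of the configured split points.
    boolean contains(int pageNumber) {
        return Arrays.binarySearch(sorted, pageNumber) >= 0;
    }

    public static void main(String[] args) {
        SplitIndices indices = new SplitIndices(new int[] {6, 2}); // deliberately unsorted
        System.out.println(indices.contains(2)); // true
        System.out.println(indices.contains(6)); // true
        System.out.println(indices.contains(4)); // false
    }
}
```

A custom splitter could hold such an object instead of the raw array and delegate to contains from splitAtPage, assuming callers cannot be trusted to pass sorted indices.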

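// SOURCE stands for the input PDF (for example a java.io.File); it is not defined in this snippet.
PDDocument document = PDDocument.load(SOURCE);
Splitter splitter = new CustomSplitter(new int[] {2,6});

List<PDDocument> documents = splitter.split(document);

for (int i=0; i < documents.size(); i++) {
    documents.get(i).save(String.format("result-%d.pdf", i));
    documents.get(i).close(); // release each part once it has been saved
}
document.close(); // close the source only after all parts are saved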

(TestCustomSplitter test testSplitForSaiKrishna)

Regarding "java - How to split a PDF at given intervals", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58345483/
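To check which pages end up in which output file before touching a real PDF, the split points can be simulated without PDFBox. The sketch below assumes that a new output document starts once the number of pages already copied equals one of the split indices, which is what the expected result in the question (pages 1-2, pages 3-6, then the rest) implies. The class SplitPreview and its method ranges are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, PDFBox-free simulation of the split points; for illustration only.
final class SplitPreview {
    // Returns 1-based inclusive page ranges such as "3-6", one per output document.
    static List<String> ranges(int[] splitIndices, int totalPages) {
        int[] sorted = splitIndices.clone();
        Arrays.sort(sorted); // binarySearch needs sorted input
        List<String> result = new ArrayList<>();
        int first = 1; // first page of the output document currently being filled
        for (int pagesDone = 1; pagesDone <= totalPages; pagesDone++) {
            boolean lastPage = pagesDone == totalPages;
            boolean splitHere = Arrays.binarySearch(sorted, pagesDone) >= 0;
            if (splitHere || lastPage) {
                // Close the current range and start the next one on the following page.
                result.add(first + "-" + pagesDone);
                first = pagesDone + 1;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ranges(new int[] {2, 6}, 8)); // [1-2, 3-6, 7-8]
        System.out.println(ranges(new int[] {2, 6}, 6)); // [1-2, 3-6]
    }
}
```

With the indices 2 and 6 and an eight-page document this yields three ranges; when the document has exactly six pages there are no extra pages, so only two ranges are produced.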
