java - Extracting multiple embedded images from a single PDF page using PDFBox

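Friends, I am using PDFBox 2.0.6. I have managed to get images out of a PDF file, but my code currently produces one image per PDF page. The problem is that a page can contain any number of images, and I want each embedded image to be extracted as its own separate image file.

Here is the code: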

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

public class DemoPdf {

    public static void main(String args[]) throws Exception {
        //Loading an existing PDF document
        File file = new File("C:/Users/ADMIN/Downloads/Vehicle_Photographs.pdf");
        PDDocument document = PDDocument.load(file);
        //Instantiating the PDFRenderer class
        PDFRenderer renderer = new PDFRenderer(document);
        File imageFolder = new File("C:/Users/ADMIN/Desktop/image");

        for (int page = 0; page < document.getNumberOfPages(); ++page) {
            //Rendering an image from the PDF document
            BufferedImage image = renderer.renderImage(page);
            //Writing the image to a file
            ImageIO.write(image, "JPEG", new File(imageFolder+"/" + page +".jpg"));
            System.out.println("Image created"+ page);
        }
        //Closing the document
        document.close();
    }

}

Can I extract all embedded images as separate images with PDFBox? Thanks.

Best answer

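Yes. It is possible to extract all images from all pages of a PDF.

You can refer to this link: extract images from pdf using PDFBox.

The basic idea is to write a class that extends PDFStreamEngine and overrides its processOperator method, then call PDFStreamEngine.processPage for every page. Whenever the object handed to processOperator turns out to be an image object, get the BufferedImage from it and save it.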
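The approach described above could look roughly like the following. This is a minimal sketch, not the code behind the linked answer, and it assumes PDFBox 2.0.x (the version used in the question). The class name EmbeddedImageExtractor, the fileName helper, the output naming scheme, and the command-line arguments are all hypothetical and chosen only for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.pdfbox.contentstream.PDFStreamEngine;
import org.apache.pdfbox.contentstream.operator.Operator;
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

// Hypothetical class name; assumes PDFBox 2.0.x on the classpath.
public class EmbeddedImageExtractor extends PDFStreamEngine {

    private final File outputDir; // folder the extracted images are written to
    private int pageNumber;       // 1-based number of the page being processed
    private int imageIndex;       // 1-based counter of images found on that page

    public EmbeddedImageExtractor(File outputDir) {
        this.outputDir = outputDir;
    }

    // Hypothetical naming scheme: page-<page>-image-<index>.png
    static String fileName(int pageNumber, int imageIndex) {
        return "page-" + pageNumber + "-image-" + imageIndex + ".png";
    }

    @Override
    protected void processOperator(Operator operator, List<COSBase> operands) throws IOException {
        // "Do" is the content stream operator that paints an external object (XObject).
        if ("Do".equals(operator.getName())
                && !operands.isEmpty() && operands.get(0) instanceof COSName) {
            COSName name = (COSName) operands.get(0);
            PDXObject xobject = getResources().getXObject(name);
            if (xobject instanceof PDImageXObject) {
                // An embedded image: decode it and save it as its own file.
                BufferedImage image = ((PDImageXObject) xobject).getImage();
                imageIndex++;
                ImageIO.write(image, "png", new File(outputDir, fileName(pageNumber, imageIndex)));
            } else if (xobject instanceof PDFormXObject) {
                // A form XObject can contain images itself, so process its stream too.
                showForm((PDFormXObject) xobject);
            }
        } else {
            super.processOperator(operator, operands);
        }
    }

    // Runs the engine over every page of the document.
    public void extract(PDDocument document) throws IOException {
        pageNumber = 0;
        for (PDPage page : document.getPages()) {
            pageNumber++;
            imageIndex = 0;
            processPage(page);
        }
    }

    // Usage (hypothetical arguments): java EmbeddedImageExtractor <input.pdf> <outputDir>
    public static void main(String[] args) throws IOException {
        File outputDir = new File(args[1]);
        outputDir.mkdirs();
        try (PDDocument document = PDDocument.load(new File(args[0]))) {
            new EmbeddedImageExtractor(outputDir).extract(document);
        }
    }
}
```

A few caveats about this sketch: an image that is drawn several times on a page is saved once per draw, inline images (which are not XObjects) are not picked up, and the images are written at their stored resolution rather than at the size they appear on the page. PNG is used here to avoid recompressing the pixel data; that is a design choice of the sketch, not something the answer prescribes.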

Regarding "java - Extracting multiple embedded images from a single PDF page using PDFBox", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45567173/
