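I'm trying to learn how to use PDFBox and found some sample code here that I'm working from.
I've attached the code below.
When I compile the code in DrJava, I get the following error: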
File: C:\Users\Dick Hurtz from Hold\Desktop\Java Programs\JavaStuff\PDFManager.java [line: 30]
Error: The constructor org.apache.pdfbox.pdfparser.PDFParser(org.apache.pdfbox.io.RandomAccessFile) is undefined
I don't know what to do about it, so any help would be greatly appreciated. Thanks, everyone!
Here are the classes:
Main:
import java.io.IOException;

public class JavaPDFTest {
    public static void main(String[] args) throws IOException {
        PDFManager pdfManager = new PDFManager();
        // The backslash must be escaped; "\t" in a Java string literal is a tab character
        pdfManager.setFilePath("E:\\test.pdf");
        System.out.println(pdfManager.ToText());
    }
}
PDFManager:
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.cos.COSDocument;
import org.apache.pdfbox.io.RandomAccessFile;
import org.apache.pdfbox.pdfparser.PDFParser;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class PDFManager {
    private PDFParser parser;
    private PDFTextStripper pdfStripper;
    private PDDocument pdDoc;
    private COSDocument cosDoc;
    private String Text;
    private String filePath;
    private File file;

    public PDFManager() {
    }

    public String ToText() throws IOException {
        this.pdfStripper = null;
        this.pdDoc = null;
        this.cosDoc = null;
        file = new File(filePath);
        parser = new PDFParser(new RandomAccessFile(file, "r")); // update for PDFBox V 2.0
        parser.parse();
        cosDoc = parser.getDocument();
        pdfStripper = new PDFTextStripper();
        pdDoc = new PDDocument(cosDoc);
        pdDoc.getNumberOfPages();
        pdfStripper.setStartPage(1);
        pdfStripper.setEndPage(10);
        // reading text from page 1 to 10
        // if you want to get text from full pdf file use this code
        // pdfStripper.setEndPage(pdDoc.getNumberOfPages());
        Text = pdfStripper.getText(pdDoc);
        return Text;
    }

    public void setFilePath(String filePath) {
        this.filePath = filePath;
    }
}
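Best answer
Obtain the PDDocument directly. Using
PDDocument pdDoc = PDDocument.load(file);
is the recommended way to load a PDF document from a file. It replaces the manual PDFParser and COSDocument steps, so the constructor that fails to compile is no longer needed.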
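To show how the answer's one-liner fits into the rest of the extraction code, here is a minimal sketch. It assumes PDFBox 2.x is on the classpath (the question's own comment mentions V 2.0). The class name PdfTextExtractor and the method extractText are hypothetical names chosen for illustration, and the default path is just the placeholder from the question. As far as I know, PDFBox 3.x moved the static load methods to org.apache.pdfbox.Loader, so this sketch may need adjusting there.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class, not part of the original post or of PDFBox itself.
public class PdfTextExtractor {

    // Returns the text of every page in the given PDF file.
    public static String extractText(File file) throws IOException {
        // PDDocument.load parses the file (PDFBox 2.x API);
        // try-with-resources closes the document when we are done.
        try (PDDocument document = PDDocument.load(file)) {
            PDFTextStripper stripper = new PDFTextStripper();
            stripper.setStartPage(1);
            stripper.setEndPage(document.getNumberOfPages());
            return stripper.getText(document);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path; note the doubled backslash in the string literal.
        String path = args.length > 0 ? args[0] : "E:\\test.pdf";
        System.out.println(extractText(new File(path)));
    }
}
```

Compared with the PDFManager class in the question, this drops the PDFParser, COSDocument, and RandomAccessFile fields entirely and also closes the document, which the original code never did.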
Regarding "java - Learning PDFBox; problem with sample code", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37649831/