java - Get the current page number in PDFBox reader

I am trying to get the current page number while reading a PDF with PDFBox.

Here is the code I wrote.

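import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.ArrayList;

import org.apache.pdfbox.pdmodel.PDDocument;

public class PDFTextExtractor {

    ArrayList<CharInfo> extractText(String fileName) throws Exception {

        PDDocument document = null;
        try {
            document = PDDocument.load( new File(fileName) );
            PDFTextAnalyzer stripper = new PDFTextAnalyzer();
            stripper.setSortByPosition( true );
            stripper.setStartPage( 1 ); // page numbers are 1-based
            stripper.setEndPage( document.getNumberOfPages() );
            // The written text is discarded; only the collected characters are used.
            Writer dummy = new OutputStreamWriter(new ByteArrayOutputStream());
            stripper.writeText(document, dummy);
            return stripper.getCharactersList();
        }
        finally {
            if( document != null ) {
                document.close();
            }
        }
    }
}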

To collect the details of each character, I wrote the following class.

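import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

public class PDFTextAnalyzer extends PDFTextStripper {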

    public PDFTextAnalyzer() throws IOException {
        super();
        // TODO Auto-generated constructor stub
    }

    private ArrayList<CharInfo> charactersList = new ArrayList<CharInfo>();

    public ArrayList<CharInfo> getCharactersList() {
        return charactersList;
    }

    public void setCharactersList(ArrayList<CharInfo> charactersList) {
        this.charactersList = charactersList;
    }

    @Override
    protected void writeString(String string, List<TextPosition> textPositions)
            throws IOException {

        System.out.println("----->"+document.getPages().getCount());

/*      for(int i = 0 ; i < document.getPages().getCount();i++)
        {
        */
        // Note: PDDocument.getPage() takes a 0-based index, so getPage(1) is the second page.
        float docHeight = document.getPage(1).getMediaBox().getHeight();
        for (TextPosition text : textPositions) {
            /*
             * System.out.println((int)text.getUnicode().charAt(0)+" "+text.
             * getUnicode()+ " [(X=" + text.getXDirAdj()+" "+text.getX() + ",Y="
             * + text.getYDirAdj() + ") height=" + text.getHeightDir() +
             * " width=" + text.getWidthDirAdj() + "]");
             */

            System.out.println("<-->"+text.toString());
            charactersList.add(new CharInfo(
                    text.getUnicode(), 
                    text.getXDirAdj(),
                    docHeight - text.getYDirAdj(),
                    text.getWidthDirAdj(),
                    text.getHeightDir(),
                    text.getFontSizeInPt(),
                    1,     // Page number of current text
                    text.getFont().getFontDescriptor().getFontName(), 
                    text.getFont().getFontDescriptor().getFontFamily()
                )
            );

        }
    }
}

But I can't get the page number. See the line commented "Page number of current text". Is there a way to get the page number?

Best answer

How about this.getCurrentPageNo()?
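A minimal sketch of how that suggestion might be applied inside writeString, assuming PDFBox 2.x (where PDFTextStripper lives in org.apache.pdfbox.text). The names PageAwareStripper, PageChar and extract are hypothetical; PageChar is a reduced stand-in for the question's CharInfo, whose constructor is not shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

public class PageAwareStripper extends PDFTextStripper {

    /** Hypothetical stand-in for CharInfo, reduced to the text and its page. */
    public static final class PageChar {
        public final String unicode;
        public final int pageNumber; // 1-based, as reported by the stripper

        PageChar(String unicode, int pageNumber) {
            this.unicode = unicode;
            this.pageNumber = pageNumber;
        }
    }

    private final List<PageChar> chars = new ArrayList<>();

    public PageAwareStripper() throws IOException {
        super();
    }

    public List<PageChar> getChars() {
        return chars;
    }

    @Override
    protected void writeString(String string, List<TextPosition> textPositions)
            throws IOException {
        // The stripper tracks the page it is currently processing (1-based).
        int pageNo = getCurrentPageNo();
        for (TextPosition text : textPositions) {
            chars.add(new PageChar(text.getUnicode(), pageNo));
        }
    }

    /** Runs the stripper over the whole document and returns what it collected. */
    public static List<PageChar> extract(PDDocument document) throws IOException {
        PageAwareStripper stripper = new PageAwareStripper();
        stripper.setSortByPosition(true);
        // The written text is discarded; only the collected characters matter here.
        Writer sink = new OutputStreamWriter(new ByteArrayOutputStream());
        stripper.writeText(document, sink);
        return stripper.getChars();
    }
}
```

In the question's PDFTextAnalyzer, the same idea would mean passing getCurrentPageNo() to the CharInfo constructor in place of the hard-coded 1.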

Regarding "java - Get the current page number in PDFBox reader", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54863761/
