java - How can I read a .doc file line by line in Java, with all the necessary jar files?

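I want to display the differences between two .doc files line by line. I have already done this for .txt files, and it works perfectly. For that, I used the following code: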

        FileReader File1Reader = new FileReader(File1.getPath());
        FileReader File2Reader = new FileReader(File2.getPath());

        // Create Buffered Object.
        BufferedReader File1BufRdr = new BufferedReader(File1Reader);
        BufferedReader File2BufRdr = new BufferedReader(File2Reader);

        // Get the file contents into String Variables.
        String File1Content = File1BufRdr.readLine();
        String File2Content = File2BufRdr.readLine();

        //New String Builder
        StringBuilder buffer = new StringBuilder();
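
The fragment above calls readLine() once per file, so it only picks up the first line of each. As a minimal sketch of the full read loop, assuming the goal is to collect every line before comparing: the names `LineReaderSketch` and `readAllLines` are hypothetical, and a `StringReader` stands in for the `FileReader` so the example runs without any files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LineReaderSketch {

    // Reads every line from the given reader.
    // readLine() returns null once the end of the input is reached.
    static List<String> readAllLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // For a real file, pass new FileReader(path) instead of the StringReader.
        List<String> lines = readAllLines(new StringReader("first\nsecond\nthird"));
        System.out.println(lines);
    }
}
```

The try-with-resources block closes the reader even if reading fails, which the original fragment does not do.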

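Is there a way to read a .doc file line by line? I am using the following code to read from a Word document, but it does not read line by line. Here is the code: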

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class read_From_Doc_Docx {
    public static void main(String[] args) {

        // Switch between the two paths to test each format.
        //String filePath = "D:\\Users\\username\\Desktop\\Doc1.docx";
        String filePath = "/Users/esna786/Removal of Redundancy.docx";

        if (filePath.toLowerCase().endsWith(".docx")) { // .docx (OOXML) file
            // try-with-resources closes the stream even if parsing fails.
            try (FileInputStream fis = new FileInputStream(new File(filePath))) {
                XWPFDocument doc = new XWPFDocument(fis);
                XWPFWordExtractor extract = new XWPFWordExtractor(doc);
                // getText() returns the whole document as a single String.
                System.out.println(extract.getText());
            } catch (IOException e) {
                e.printStackTrace();
            }
        } else { // legacy binary .doc file
            try (FileInputStream fis = new FileInputStream(new File(filePath))) {
                HWPFDocument doc = new HWPFDocument(fis);
                WordExtractor extractor = new WordExtractor(doc);
                // getText() returns the whole document as a single String.
                System.out.println(extractor.getText());
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

Best answer

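Just use the getParagraphText() method instead of getText(). On the WordExtractor used for .doc files, it returns a String[] with one element per paragraph, which you can then loop over.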
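
Once each document is available as a String[] of paragraphs, the comparison can proceed much as it did for text files. What follows is a minimal sketch, assuming a simple position-by-position comparison is enough. It is not a real diff algorithm and does not realign after an inserted paragraph. The names `ParagraphDiffSketch` and `diffByPosition` are hypothetical, and the sample arrays stand in for extractor output so the example runs without Apache POI.

```java
import java.util.ArrayList;
import java.util.List;

public class ParagraphDiffSketch {

    // Compares two paragraph arrays index by index and reports each mismatch.
    // A paragraph absent from the shorter array is shown as "<missing>".
    static List<String> diffByPosition(String[] first, String[] second) {
        List<String> differences = new ArrayList<>();
        int longest = Math.max(first.length, second.length);
        for (int i = 0; i < longest; i++) {
            // Extracted paragraph text may end with a line break, so trim before comparing.
            String a = i < first.length ? first[i].trim() : "<missing>";
            String b = i < second.length ? second[i].trim() : "<missing>";
            if (!a.equals(b)) {
                differences.add("Paragraph " + (i + 1) + ": \"" + a + "\" vs \"" + b + "\"");
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        // With Apache POI on the classpath, the arrays would come from something like:
        //   String[] first = new WordExtractor(new HWPFDocument(inputStream)).getParagraphText();
        String[] first = {"Title\r\n", "Same text\r\n", "Old ending\r\n"};
        String[] second = {"Title\r\n", "Same text\r\n", "New ending\r\n", "Extra\r\n"};
        diffByPosition(first, second).forEach(System.out::println);
    }
}
```

For .docx files, XWPFDocument.getParagraphs() returns a list of XWPFParagraph objects, and calling getText() on each one gives per-paragraph strings that can be fed into the same comparison.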

A similar question about reading a .doc file line by line in Java can be found on Stack Overflow: https://stackoverflow.com/questions/27715172/