java - Converting documents to PDF with "Microsoft Print to PDF" and Java

Tags: java pdf file-conversion

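I am currently testing the conversion of RTF/DOC documents to PDF on a Microsoft Windows host. I have working code that uses the Microsoft Word API, but I want to move away from it because of the licensing costs.

My idea is to convert the RTF to PDF simply by "sending" it to the Microsoft Print To PDF printer.

The problem is that I can reach the printer and I do get output, but the resulting file is corrupt.

If I rename the generated file from .pdf to .rtf and open it in Microsoft Word, the content looks like this (only an excerpt of the whole file):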

\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff1\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1031\deflangfe1031\themelang1031\themelangfe0\themelangcs0{\fonttbl{\f0\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\f1\fbidi \fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial{\*\falt Arial};}{\f2\fbidi \fmodern\fcharset0\fprq1{\*\panose 02070309020205020404}Courier New{\*\falt ?l?r ???fc};}
{\f3\fbidi \froman\fcharset2\fprq2{\*\panose 05050102010706020507}Symbol{\*\falt Symbol};}{\f10\fbidi \fnil\fcharset2\fprq2{\*\panose 05000000000000000000}Wingdings;}{\f34\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria Math;}
{\f38\fbidi \fswiss\fcharset0\fprq2{\*\panose 020b0604030504040204}Tahoma;}{\f39\fbidi \fswiss\fcharset0\fprq2{\*\panose 00000000000000000000}Arial Black;}
{\flomajor\f31500\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}{\fdbmajor\f31501\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\fhimajor\f31502\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria;}{\fbimajor\f31503\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\flominor\f31504\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}{\fdbminor\f31505\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\fhiminor\f31506\fbidi \fswiss\fcharset0\fprq2{\*\panose 020f0502020204030204}Calibri;}{\fbiminor\f31507\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\f40\fbidi \froman\fcharset238\fprq2 Times New Roman CE{\*\falt Courier New};}{\f41\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr{\*\falt Courier New};}{\f43\fbidi \froman\fcharset161\fprq2 Times New Roman Greek{\*\falt Courier New};}
{\f44\fbidi \froman\fcharset162\fprq2 Times New Roman Tur{\*\falt Courier New};}{\f45\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew){\*\falt Courier New};}{\f46\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic){\*\falt Courier New};}
{\f47\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic{\*\falt Courier New};}{\f48\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese){\*\falt Courier New};}{\f50\fbidi \fswiss\fcharset238\fprq2 Arial CE{\*\falt Arial};}
{\f51\fbidi \fswiss\fcharset204\fprq2 Arial Cyr{\*\falt Arial};}{\f53\fbidi \fswiss\fcharset161\fprq2 Arial Greek{\*\falt Arial};}{\f54\fbidi \fswiss\fcharset162\fprq2 Arial Tur{\*\falt Arial};}
{\f55\fbidi \fswiss\fcharset177\fprq2 Arial (Hebrew){\*\falt Arial};}{\f56\fbidi \fswiss\fcharset178\fprq2 Arial (Arabic){\*\falt Arial};}{\f57\fbidi \fswiss\fcharset186\fprq2 Arial Baltic{\*\falt Arial};}
{\f58\fbidi \fswiss\fcharset163\fprq2 Arial (Vietnamese){\*\falt Arial};}{\f60\fbidi \fmodern\fcharset238\fprq1 Courier New CE{\*\falt ?l?r ???fc};}{\f61\fbidi \fmodern\fcharset204\fprq1 Courier New Cyr{\*\falt ?l?r ???fc};}
{\f63\fbidi \fmodern\fcharset161\fprq1 Courier New Greek{\*\falt ?l?r ???fc};}{\f64\fbidi \fmodern\fcharset162\fprq1 Courier New Tur{\*\falt ?l?r ???fc};}{\f65\fbidi \fmodern\fcharset177\fprq1 Courier New (Hebrew){\*\falt ?l?r ???fc};}
{\f66\fbidi \fmodern\fcharset178\fprq1 Courier New (Arabic){\*\falt ?l?r ???fc};}{\f67\fbidi \fmodern\fcharset186\fprq1 Courier New Baltic{\*\falt ?l?r ???fc};}{\f68\fbidi \fmodern\fcharset163\fprq1 Courier New (Vietnamese){\*\falt ?l?r ???fc};}
{\f380\fbidi \froman\fcharset238\fprq2 Cambria Math CE;}{\f381\fbidi \froman\fcharset204\fprq2 Cambria Math Cyr;}{\f383\fbidi \froman\fcharset161\fprq2 Cambria Math Greek;}{\f384\fbidi \froman\fcharset162\fprq2 Cambria Math Tur;}
{\f387\fbidi \froman\fcharset186\fprq2 Cambria Math Baltic;}{\f388\fbidi \froman\fcharset163\fprq2 Cambria Math (Vietnamese);}{\f420\fbidi \fswiss\fcharset238\fprq2 Tahoma CE;}{\f421\fbidi \fswiss\fcharset204\fprq2 Tahoma Cyr;}
{\f423\fbidi \fswiss\fcharset161\fprq2 Tahoma Greek;}{\f424\fbidi \fswiss\fcharset162\fprq2 Tahoma Tur;}{\f425\fbidi \fswiss\fcharset177\fprq2 Tahoma (Hebrew);}{\f426\fbidi \fswiss\fcharset178\fprq2 Tahoma (Arabic);}
{\f427\fbidi \fswiss\fcharset186\fprq2 Tahoma Baltic;}{\f428\fbidi \fswiss\fcharset163\fprq2 Tahoma (Vietnamese);}{\f429\fbidi \fswiss\fcharset222\fprq2 Tahoma (Thai);}{\f430\fbidi \fswiss\fcharset238\fprq2 Arial Black CE;}
{\f431\fbidi \fswiss\fcharset204\fprq2 Arial Black Cyr;}{\f433\fbidi \fswiss\fcharset161\fprq2 Arial Black Greek;}{\f434\fbidi \fswiss\fcharset162\fprq2 Arial Black Tur;}{\f437\fbidi \fswiss\fcharset186\fprq2 Arial Black Baltic;}
{\flomajor\f31508\fbidi \froman\fcharset238\fprq2 Times New Roman CE{\*\falt Courier New};}{\flomajor\f31509\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr{\*\falt Courier New};}
{\flomajor\f31511\fbidi \froman\fcharset161\fprq2 Times New Roman Greek{\*\falt Courier New};}{\flomajor\f31512\fbidi \froman\fcharset162\fprq2 Times New Roman Tur{\*\falt Courier New};}
{\flomajor\f31513\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew){\*\falt Courier New};}{\flomajor\f31514\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic){\*\falt Courier New};}
{\flomajor\f31515\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic{\*\falt Courier New};}{\flomajor\f31516\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese){\*\falt Courier New};}
{\fdbmajor\f31518\fbidi \froman\fcharset238\fprq2 Times New Roman CE{\*\falt Courier New};}{\fdbmajor\f31519\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr{\*\falt Courier New};}
{\fdbmajor\f31521\fbidi \froman\fcharset161\fprq2 Times New Roman Greek{\*\falt Courier New};}{\fdbmajor\f31522\fbidi \froman\fcharset162\fprq2 Times New Roman Tur{\*\falt Courier New};}
{\fdbmajor\f31523\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew){\*\falt Courier New};}{\fdbmajor\f31524\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic){\*\falt Courier New};}
{\fdbmajor\f31525\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic{\*\falt Courier New};}{\fdbmajor\f31526\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese){\*\falt Courier New};}
{\fhimajor\f31528\fbidi \froman\fcharset238\fprq2 Cambria CE;}{\fhimajor\f31529\fbidi \froman\fcharset204\fprq2 Cambria Cyr;}{\fhimajor\f31531\fbidi \froman\fcharset161\fprq2 Cambria Greek;}{\fhimajor\f31532\fbidi \froman\fcharset162\fprq2 Cambria Tur;}
{\fhimajor\f31535\fbidi \froman\fcharset186\fprq2 Cambria Baltic;}{\fhimajor\f31536\fbidi \froman\fcharset163\fprq2 Cambria (Vietnamese);}{\fbimajor\f31538\fbidi \froman\fcharset238\fprq2 Times New Roman CE{\*\falt Courier New};}

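I assume I am not reading the file correctly. Or perhaps it is being written incorrectly? I am not sure. Maybe an attribute is missing; I suspect only one small thing is wrong.

I have the following code: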

import javax.print.Doc;
import javax.print.DocFlavor;
import javax.print.DocPrintJob;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;
import javax.print.SimpleDoc;
import javax.print.attribute.HashPrintRequestAttributeSet;
import javax.print.attribute.PrintRequestAttributeSet;
import javax.print.attribute.standard.Copies;
import javax.print.attribute.standard.Destination;
import javax.print.event.PrintJobAdapter;
import javax.print.event.PrintJobEvent;
import java.io.File;
import java.io.FileInputStream;
import java.net.URISyntaxException;
import java.util.Arrays;


public class Program {


    public static final String myPath = "C:/Project Files/Template";
    public static final String myFile =  "CreditNoteEnglish.rtf";
    public static final String myFile2 =  "CreditNoteEnglish.pdf";
    public static void  main (String[] args)
    {
        try {
            convertToPDF_PerPrint(myPath, myFile);
        } catch (URISyntaxException e) {
            e.printStackTrace();
        }

    }

    private static void convertToPDF_PerPrint( String Verzeichnis,  String pFileName) throws URISyntaxException {
        final String defaultPrinterName = "Microsoft Print To PDF";
        DocFlavor docType = DocFlavor.INPUT_STREAM.AUTOSENSE;
        PrintRequestAttributeSet printerSettings = new HashPrintRequestAttributeSet();
        PrintService PDFPrinter = null;
        File myFile = new File(Verzeichnis + "/" + pFileName);
        File outFile = new File (Verzeichnis + "/" + myFile2);
       // printerSettings.add(MediaSizeName.ISO_A4);
        printerSettings.add(new Destination(outFile.toURI()));
        printerSettings.add(new Copies(1));
        PrintService[] printServices = PrintServiceLookup.lookupPrintServices(docType,printerSettings);
        try
        {
            if(printServices.length == 0)
            {
                throw new Exception("No printers found for given attributes");
            }
            System.out.println ( "Available printers: " + Arrays.asList ( printServices ) );

            for(PrintService availableService : printServices)
            {
                if(availableService.getName().contains("PDF"))
                {
                    PDFPrinter = availableService;
                    break;
                }
            }
            if (PDFPrinter == null)
            {
                throw new IllegalStateException("Can not find PDF printer.");
            }

            FileInputStream fileAsStream = new FileInputStream(myFile);

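            System.out.println ( Verzeichnis + "\\" + pFileName );
            // Note: read() consumes the first byte of the stream, so the print job
            // receives the file without its leading '{' (compare the excerpt above).
            System.out.println ( fileAsStream.read() );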
            DocPrintJob myPrintJob =  PDFPrinter.createPrintJob();
            Doc myConvertableFile = new SimpleDoc(fileAsStream, DocFlavor.INPUT_STREAM.AUTOSENSE,null);
            PrintJobWatcher watcher = new PrintJobWatcher(myPrintJob);
            myPrintJob.print(myConvertableFile, printerSettings);
            watcher.waitForDone();
            fileAsStream.close();
        }
        catch(Exception e)
        {
            System.out.println(e);
        }
    }
}

class PrintJobWatcher {

    boolean done = false;

    PrintJobWatcher(DocPrintJob job) {
        job.addPrintJobListener(new PrintJobAdapter() {
            public void printJobCanceled(PrintJobEvent pje) {
                allDone();
            }

            public void printJobCompleted(PrintJobEvent pje) {
                allDone();
            }

            public void printJobFailed(PrintJobEvent pje) {
                allDone();
            }

            public void printJobNoMoreEvents(PrintJobEvent pje) {
                allDone();
            }

            void allDone() {
                synchronized (PrintJobWatcher.this) {
                    done = true;
                    System.out.println("Printing done ...");
                    PrintJobWatcher.this.notify();
                }
            }
        });
    }

    public synchronized void waitForDone() {
        try {
            while (!done) {
                wait();
            }
        } catch (InterruptedException e) {
        }
    }
}

Does anyone know why the code above fails to make the Microsoft Print to PDF printer produce a valid PDF?

Any hint would be highly appreciated.

Many thanks.

Best answer

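The plain Windows print() path is text based, so printing to PDF through Notepad or any other text mode simply produces a PDF containing the file as it would look when viewed as text. You do not need all that code: opening the RTF in Notepad and printing to PDF does the same thing.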
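To see this from the Java side, it can help to check what a print service says it accepts. The sketch below is only an illustration: `FlavorCheckSketch`, `isRawFlavor` and `listFlavors` are hypothetical names, not part of the question's code. It relies on the fact that `DocFlavor.INPUT_STREAM.AUTOSENSE` carries the MIME type `application/octet-stream`, meaning the bytes are handed over untyped and Java does not convert RTF into anything else on the way.

```java
import javax.print.DocFlavor;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;

// Hypothetical helper class for illustration; not part of the question's code.
final class FlavorCheckSketch {

    /** True when the flavor is untyped bytes, which Java passes on without converting. */
    static boolean isRawFlavor(DocFlavor flavor) {
        return "application/octet-stream".equals(flavor.getMimeType());
    }

    /** Prints every installed print service and the document flavors it claims to accept. */
    static void listFlavors() {
        PrintService[] services = PrintServiceLookup.lookupPrintServices(null, null);
        for (PrintService service : services) {
            System.out.println(service.getName());
            for (DocFlavor flavor : service.getSupportedDocFlavors()) {
                String note = isRawFlavor(flavor) ? "  (raw bytes, no conversion)" : "";
                System.out.println("    " + flavor + note);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("AUTOSENSE is raw: "
                + isRawFlavor(DocFlavor.INPUT_STREAM.AUTOSENSE));
        listFlavors();
    }
}
```

If no flavor in the list describes RTF or Word content, the print service has no way of knowing that the stream is a document to be rendered, which fits the corrupt output described in the question.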

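For printing HTML or XML (text) we can do better by using Edge. For images, however, we need to route the job through Paint or another image application, and for MS documents we need an MS document handler.

By far the simplest way to convert MS RTF/Doc/DocX/Odt to MS PDF without libraries is through the already licensed shell application "Write" or "WordPad", but there are some limitations. The RTF must be WordPad compatible and must not trigger the "some features are not supported" message. Note that tables need a border width in order to be printed, and that visible lines and transparency in images can give odd results, so it is best to keep things simple. Background images are rarely supported. Keep the RTF extremely simple, as if it had been written with a plain-text line editor or a batch file. The page will use the current MS PDF defaults (here the previously used A4 landscape) unless you adjust the orientation or format beforehand with PrintUI.

Avoid trying to hand-write an image as rich text :-) It is possible, but life is not long enough. Here are the first few lines of this sample RTF: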

{\rtf1\ansi\ansicpg1252\deff0\nouicompat

{\fonttbl
{\f0\fnil\fcharset0 Calibri;}
{\f1\froman\fcharset0 Times New Roman;}
{\f2\fnil\fcharset0 Segoe UI;}
}

{\colortbl ;\red255\green0\blue0;}
{\*\generator Riched20 10.0.19041}

\viewkind4\uc1
\pard\sa200\sl240\slmult1\f0\fs22\lang9

{\pict{\*\picprop}\wmetafile8\picw24818\pich5001\picwgoal3495\pichgoal705 
010009000003a6ae000000007dae000000000400000003010800050000000b0200000000050000
000c02bd00aa03030000001e00040000000701040004000000070104007dae0000410b2000cc00
bd00aa0300000000bd00aa030000000028000000aa030000bd0000000100040000000000000000
000000000000000000000000000000000000000000ffffff00ccffff003399ff00c0c0c0008080
800033333300000000000000000000000000000000000000000000000000000000000000000000
000000111111111111111111111111111111111111111111111111111111111111111111111111
111111111111111111111111111111111111111111111111111111111111111111111111111111
111111111111111111111111111111111111111111111111111111111111111111111111111111

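The image needs another 2290 lines of text before the much simpler table begins.

So use Word or WordPad to insert pictures, or inject the decoded output of img2rtf for the image.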
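To show why the picture takes so many lines, here is a minimal sketch of how image bytes end up as hex text inside a `{\pict ...}` group. It is not what img2rtf does internally, and it swaps the `\wmetafile8` format of the sample for `\pngblip`, on the assumption that the target viewer accepts PNG pictures. The class and method names (`RtfPictSketch`, `toHex`, `pngPictGroup`) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; names are not from any library.
final class RtfPictSketch {

    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Hex-encodes the bytes, starting a new line after every bytesPerLine bytes. */
    static String toHex(byte[] data, int bytesPerLine) {
        StringBuilder sb = new StringBuilder(data.length * 2 + 16);
        for (int i = 0; i < data.length; i++) {
            if (i > 0 && i % bytesPerLine == 0) {
                sb.append('\n');
            }
            sb.append(HEX[(data[i] >> 4) & 0x0F]).append(HEX[data[i] & 0x0F]);
        }
        return sb.toString();
    }

    /** Wraps PNG bytes in an RTF picture group; the goal size is given in twips. */
    static String pngPictGroup(byte[] png, int widthTwips, int heightTwips) {
        return "{\\pict\\pngblip\\picwgoal" + widthTwips
                + "\\pichgoal" + heightTwips + "\n"
                + toHex(png, 39) + "}";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: RtfPictSketch <image.png>");
            return;
        }
        byte[] png = Files.readAllBytes(Paths.get(args[0]));
        // 3495 x 705 twips mirrors the goal size used in the sample above.
        System.out.println(pngPictGroup(png, 3495, 705));
    }
}
```

Every image byte becomes two characters of text, so even a small banner image runs to thousands of lines, which is why inserting pictures with Word or WordPad is the practical route.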


For command-line printing, simply shell out to a command like this:

write /pt file.doc "printer name" "printer driver name" "port address"
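  • You can use write (usually one of 2 stubs) or wordpad (the main engine)
  • /pt means "print to"
  • For "Microsoft Print to PDF", the printer and the driver have the same name
  • The port address can be a remote PDF printer, the default PORTPROMPT:, or a port file name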
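Since the question is about Java, the command above could be launched with `ProcessBuilder`. This is a rough sketch under stated assumptions, not a tested implementation: it assumes `write` is on the PATH, that passing the output file as the "port" makes the printer write there (as the last bullet suggests), and that the stub's exit status may say little about whether printing succeeded. `WordPadPrintSketch`, `buildCommand` and `run` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around the "write /pt" command shown above.
final class WordPadPrintSketch {

    static final String PRINTER = "Microsoft Print to PDF";

    /** Builds: write /pt <input> <printer name> <driver name> <port> */
    static List<String> buildCommand(Path input, Path output) {
        return List.of(
                "write",
                "/pt",
                input.toAbsolutePath().toString(),
                PRINTER,                               // printer name
                PRINTER,                               // driver name (same as printer)
                output.toAbsolutePath().toString());   // port: assumed to be the output file
    }

    /** Starts the command and waits; returns false if it did not exit in time. */
    static boolean run(Path input, Path output, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, output))
                .inheritIO()
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: WordPadPrintSketch <input.rtf> <output.pdf>");
            return;
        }
        Path output = Paths.get(args[1]);
        boolean exited = run(Paths.get(args[0]), output, 60);
        // The stub may return before printing finishes, so check the file as well.
        System.out.println("exited=" + exited + ", output exists=" + Files.exists(output));
    }
}
```

`ProcessBuilder` takes each argument as a separate list element, so the spaces in the printer name and in paths such as "C:/Project Files/Template" need no manual quoting.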

Source: the original question "java - Converting documents to PDF with "Microsoft Print to PDF" and Java" on Stack Overflow: https://stackoverflow.com/questions/70220010/
