java - Removing a chart from a PowerPoint slide with Apache POI

Tags: java apache-poi powerpoint

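We are trying to remove a chart from a PowerPoint slide using Apache POI 3.16, but we are running into difficulties.

Our code performs the following steps:

  1. Open an existing PowerPoint document (a template document)
  2. Add and remove slides
  3. Update the charts in existing slides

This works fine.

Sometimes we need to remove a chart from a given slide. This is our attempt: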

OPCPackage pkg = ppt.getPackage();

String chartRelationId = slide.getRelationId(chart);
pkg.removeRelationship(chartRelationId);

pkg.removePart(chart.getPackagePart());

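The pkg.removePart() call seems to work, but writing the final PowerPoint document to disk then fails with an exception saying that the part file cannot be removed (probably because we have already removed it).

The pkg.removeRelationship() call also triggers an exception while the document is being written to disk, stating that core.xml already exists.

Is it possible to remove a chart from a PowerPoint slide using Apache POI? If so, how?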

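Best answer

Because XSLFChart is in @Beta state, there is so far no dedicated Shape for charts. So with Apache POI we can only get the XSLFGraphicFrame that contains the chart. But removing the XSLFGraphicFrame from the slide does not also remove the related chart parts. Removing the related chart parts top-down, that is, from the POIXMLDocumentPart level down to the PackagePart level, is not implemented yet. And because all the relevant methods in POIXMLDocumentPart are protected and XSLFChart itself is final, this is not easy to work around.

The following code shows the problem. It is commented accordingly.

The code removes all charts from the first slide and removes all relationships and related parts: /ppt/embeddings/Microsoft_Excel_WorksheetN.xlsx, /ppt/charts/colorsN.xml and /ppt/charts/styleN.xml. Only /ppt/charts/chartN.xml cannot be removed, for the reasons given in the code comments.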

import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.poi.xslf.usermodel.*;
import org.apache.poi.sl.usermodel.*;

import org.apache.poi.POIXMLDocumentPart;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.openxml4j.opc.PackageRelationshipCollection;
import org.apache.poi.openxml4j.opc.PackageRelationship;

import org.apache.xmlbeans.XmlObject;

import java.util.Map;
import java.util.HashMap;

import java.util.regex.Pattern;

public class ReadPPTRemoveChart {

 public static void main(String[] args) throws Exception {

  XMLSlideShow slideShow = new XMLSlideShow(new FileInputStream("PPTWithCharts.pptx"));

  XSLFSlide slide = slideShow.getSlides().get(0);

  Map<String, XSLFGraphicFrame> chartFramesToRemove = new HashMap<>();

  for (XSLFShape shape : slide.getShapes()) {
   if (shape instanceof XSLFGraphicFrame) {
    XSLFGraphicFrame graphicframe = (XSLFGraphicFrame)shape;
    XmlObject xmlobject = graphicframe.getXmlObject();
    XmlObject[] graphics = xmlobject.selectPath(
                            "declare namespace a='http://schemas.openxmlformats.org/drawingml/2006/main' " +
                            ".//a:graphic");
    if (graphics.length > 0) { //we have an XSLFGraphicFrame containing a:graphic
     XmlObject graphic = graphics[0];
     XmlObject[] charts = graphic.selectPath(
                           "declare namespace c='http://schemas.openxmlformats.org/drawingml/2006/chart' " +
                           ".//c:chart");
     if (charts.length > 0) { //we have an XSLFGraphicFrame containing c:chart
      XmlObject chart = charts[0];
      String rid = chart.selectAttribute(
                          "http://schemas.openxmlformats.org/officeDocument/2006/relationships", "id")
                          .newCursor().getTextValue();
      chartFramesToRemove.put(rid, graphicframe);
     }
    }
   }
  }

  PackagePart slidepart = slide.getPackagePart();
  OPCPackage opcpackage = slideShow.getPackage();

  for (String rid : chartFramesToRemove.keySet()) {
   //first, remove the XSLFGraphicFrame
   XSLFGraphicFrame chartFrame = chartFramesToRemove.get(rid);
   slide.removeShape(chartFrame);
   //This is the problem, in my opinion: removing the frame **should** remove all related parts too.
   //But since XSLFChart is @Beta, it does not.

   //So we try to remove the related parts manually.
   //First, we get the PackagePart of the chart.
   PackageRelationship relship = slidepart.getRelationships().getRelationshipByID(rid);
   PackagePart chartpart = slidepart.getRelatedPart(relship);

   //Now we get and remove all relationships and related PackageParts of this chart part.
   //These are /ppt/embeddings/Microsoft_Excel_WorksheetN.xlsx, /ppt/charts/colorsN.xml
   //and /ppt/charts/styleN.xml.
   for (PackageRelationship chartrelship : chartpart.getRelationships()) {
    String partname = chartrelship.getTargetURI().toString();
    PackagePart part = opcpackage.getPartsByName(Pattern.compile(partname)).get(0);
    opcpackage.removePart(part);
    chartpart.removeRelationship(chartrelship.getId());
   }
   //this works

   //Now we **should** be able to remove the relationship to the chart part from the slide too,
   //but this does not seem to be possible.
   //Doing this on the PackagePart level works:
   slidepart.removeRelationship(rid);
   for (PackageRelationship sliderelship : slidepart.getRelationships()) {
    System.out.println("rel PP level: " + sliderelship.getTargetURI().toString());
   }
   //all relationships to /ppt/charts/chartN.xml are removed

   //but on POIXMLDocumentPart level this has no effect
   for (POIXMLDocumentPart sliderelpart : slide.getRelations()) {
    System.out.println("rel POIXML level: " + sliderelpart.getPackagePart().getPartName());
   }
   //relationships to /ppt/charts/chartN.xml are **not** removed

   //So we cannot remove the chart part itself.
   //If we did, XSLFChart.commit (org.apache.poi.xslf.usermodel.XSLFChart) would fail
   //during slideShow.write, because the PackagePart would be gone while the relation still exists.
   //opcpackage.removePart(chartpart);

  }


  slideShow.write(new FileOutputStream("PPTWithChartsNew.pptx"));
  slideShow.close();
 }
}

After opening PPTWithChartsNew.pptx in PowerPoint and saving it, the unneeded /ppt/charts/chartN.xml parts are removed as well, because no relationships point to them any more.
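A side note on the part lookup in the listing above: it passes the relationship's target URI directly to Pattern.compile, so characters such as "." are read as regular-expression syntax rather than literally. For typical part names this still finds the intended part. The JDK-only sketch below (the class and method names are made up for illustration, and it does not touch Apache POI) shows the difference that Pattern.quote makes, in case a stricter, literal match is wanted.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class PartNamePatternSketch {

    // Treats the part name as a regular expression, as the listing does.
    static boolean matchesAsRegex(String partName, String candidate) {
        return Pattern.compile(partName).matcher(candidate).matches();
    }

    // Treats the part name as a literal string by quoting it first.
    static boolean matchesLiterally(String partName, String candidate) {
        return Pattern.compile(Pattern.quote(partName)).matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String partName = "/ppt/charts/style1.xml";
        String lookalike = "/ppt/charts/style1-xml"; // differs only where the "." is

        System.out.println(matchesAsRegex(partName, lookalike));   // true: "." matches any character
        System.out.println(matchesLiterally(partName, lookalike)); // false: "." must be a dot
        System.out.println(matchesLiterally(partName, partName));  // true
    }
}
```

In the listing this would amount to wrapping partname in Pattern.quote(...) before compiling it.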

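Edit, September 24, 2017:

Found a solution using reflection. As said, removing the related chart parts needs to be done top-down, that is, from the POIXMLDocumentPart level down to the PackagePart level. Since POIXMLDocumentPart.removeRelation is protected, we need to use reflection to call it.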
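For readers unfamiliar with the mechanism, here is a minimal JDK-only sketch of calling a protected method through reflection. BasePart, ProtectedCallSketch and callProtected are hypothetical names invented for this illustration, not Apache POI classes. Because everything sits in one package, the protected method would also be reachable directly here; the sketch only shows the getDeclaredMethod / setAccessible / invoke sequence.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a library base class that hides its removal method.
class BasePart {
    private final List<String> relations = new ArrayList<>();

    void addRelation(String id) {
        relations.add(id);
    }

    // Protected: code in another package that is not a subclass cannot call this directly.
    protected boolean removeRelation(String id) {
        return relations.remove(id);
    }

    int relationCount() {
        return relations.size();
    }
}

class ProtectedCallSketch {

    // Looks up a one-argument method declared on declaringClass, makes it accessible, and invokes it.
    static Object callProtected(Object target, Class<?> declaringClass,
                                String name, Class<?> paramType, Object arg) throws Exception {
        Method method = declaringClass.getDeclaredMethod(name, paramType);
        method.setAccessible(true);
        return method.invoke(target, arg);
    }

    public static void main(String[] args) throws Exception {
        BasePart part = new BasePart();
        part.addRelation("rId2");
        callProtected(part, BasePart.class, "removeRelation", String.class, "rId2");
        System.out.println("relations left: " + part.relationCount());
    }
}
```

Note that getDeclaredMethod has to be called on the class that declares the method, not on a subclass, which is why the POI code below uses POIXMLDocumentPart.class rather than the slide's own class.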

import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.poi.xslf.usermodel.*;
import org.apache.poi.sl.usermodel.*;

import org.apache.poi.POIXMLDocumentPart;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.openxml4j.opc.PackageRelationshipCollection;
import org.apache.poi.openxml4j.opc.PackageRelationship;

import org.apache.xmlbeans.XmlObject;

import java.util.Map;
import java.util.HashMap;

import java.util.regex.Pattern;

import java.lang.reflect.Method;

public class ReadPPTRemoveChart {

 public static void main(String[] args) throws Exception {

  XMLSlideShow slideShow = new XMLSlideShow(new FileInputStream("PPTWithCharts.pptx"));

  XSLFSlide slide = slideShow.getSlides().get(0);

  Map<String, XSLFGraphicFrame> chartFramesToRemove = new HashMap<>();

  for (XSLFShape shape : slide.getShapes()) {
   if (shape instanceof XSLFGraphicFrame) {
    XSLFGraphicFrame graphicframe = (XSLFGraphicFrame)shape;
    XmlObject xmlobject = graphicframe.getXmlObject();
    XmlObject[] graphics = xmlobject.selectPath(
                            "declare namespace a='http://schemas.openxmlformats.org/drawingml/2006/main' " +
                            ".//a:graphic");
    if (graphics.length > 0) { //we have an XSLFGraphicFrame containing a:graphic
     XmlObject graphic = graphics[0];
     XmlObject[] charts = graphic.selectPath(
                           "declare namespace c='http://schemas.openxmlformats.org/drawingml/2006/chart' " +
                           ".//c:chart");
     if (charts.length > 0) { //we have an XSLFGraphicFrame containing c:chart
      XmlObject chart = charts[0];
      String rid = chart.selectAttribute(
                          "http://schemas.openxmlformats.org/officeDocument/2006/relationships", "id")
                          .newCursor().getTextValue();
      chartFramesToRemove.put(rid, graphicframe);
     }
    }
   }
  }

  PackagePart slidepart = slide.getPackagePart();
  OPCPackage opcpackage = slideShow.getPackage();

  for (String rid : chartFramesToRemove.keySet()) {
   //first, remove the XSLFGraphicFrame
   XSLFGraphicFrame chartFrame = chartFramesToRemove.get(rid);
   slide.removeShape(chartFrame);
   //This is the problem, in my opinion: removing the frame **should** remove all related parts too.
   //But since XSLFChart is @Beta, it does not.

   //So we try to remove the related parts manually.

   //First, we get the PackagePart of the chart.
   PackageRelationship relship = slidepart.getRelationships().getRelationshipByID(rid);
   PackagePart chartpart = slidepart.getRelatedPart(relship);

   //Now we get and remove all relationships and related PackageParts of this chart part.
   //These are /ppt/embeddings/Microsoft_Excel_WorksheetN.xlsx, /ppt/charts/colorsN.xml
   //and /ppt/charts/styleN.xml.
   for (PackageRelationship chartrelship : chartpart.getRelationships()) {
    String partname = chartrelship.getTargetURI().toString();
    PackagePart part = opcpackage.getPartsByName(Pattern.compile(partname)).get(0);
    opcpackage.removePart(part);
    chartpart.removeRelationship(chartrelship.getId());
   }

   //Now we remove the chart part from the slide part.
   //This has to be done on the POIXMLDocumentPart level.
   //Since POIXMLDocumentPart.removeRelation is protected, we call it via reflection.
   XSLFChart chart = (XSLFChart)slide.getRelationById(rid);
   Method removeRelation = POIXMLDocumentPart.class.getDeclaredMethod("removeRelation", POIXMLDocumentPart.class); 
   removeRelation.setAccessible(true); 
   removeRelation.invoke(slide, chart);

  }

  slideShow.write(new FileOutputStream("PPTWithChartsNew.pptx"));
  slideShow.close();
 }
}
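To check the result without opening PowerPoint, one option is to list the chart-related entries left in the output file. A .pptx file is a zip archive, and its entry names carry no leading slash (ppt/charts/chart1.xml rather than /ppt/charts/chart1.xml). The sketch below uses only the JDK; ListChartParts and chartRelatedEntries are hypothetical names, and the default file name is simply the one used in the listings above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper class, for illustration only.
class ListChartParts {

    // Returns the names of all zip entries under ppt/charts/ or ppt/embeddings/, in archive order.
    static List<String> chartRelatedEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (name.startsWith("ppt/charts/") || name.startsWith("ppt/embeddings/")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: the output file written by the listing above.
        String file = args.length > 0 ? args[0] : "PPTWithChartsNew.pptx";
        try (InputStream in = Files.newInputStream(Paths.get(file))) {
            for (String name : chartRelatedEntries(in)) {
                System.out.println(name);
            }
        }
    }
}
```

Running it against the input and the output file and comparing the two lists shows which parts the removal actually dropped.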

Regarding java - removing a chart from a PowerPoint slide with Apache POI, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46347444/
