java - 使用 JAVA 将 CSV 转换为 XML

标签 java xml csv

我有一组 CSV 数据要转换为 XML。代码看起来不错,但输出不够完美。它省略了一些列,因为它们没有值(value),并生成一长串 XML 数据而不是破坏它。

这是我的 CSV 数据示例:

Name  Age Sex
chi   23   
kay   19  male
John      male

还有我的代码:

public class XMLCreators {
  // Protected Properties
  protected DocumentBuilderFactory domFactory = null;
  protected DocumentBuilder domBuilder = null;

  public XMLCreators() {
    try {
      domFactory = DocumentBuilderFactory.newInstance();
      domBuilder = domFactory.newDocumentBuilder();
    } catch (FactoryConfigurationError exp) {
      System.err.println(exp.toString());
    } catch (ParserConfigurationException exp) {
      System.err.println(exp.toString());
    } catch (Exception exp) {
      System.err.println(exp.toString());
    }

  }

  public int convertFile(String csvFileName, String xmlFileName,
      String delimiter) {

    int rowsCount = -1;
    try {
      Document newDoc = domBuilder.newDocument();
      // Root element
      Element rootElement = newDoc.createElement("XMLCreators");
      newDoc.appendChild(rootElement);
      // Read csv file
      BufferedReader csvReader;
      csvReader = new BufferedReader(new FileReader(csvFileName));
      int fieldCount = 0;
      String[] csvFields = null;
      StringTokenizer stringTokenizer = null;

      // Assumes the first line in CSV file is column/field names
      // The column names are used to name the elements in the XML file,
      // avoid the use of Space or other characters not suitable for XML element
      // naming

      String curLine = csvReader.readLine();
      if (curLine != null) {
        // how about other form of csv files?
        stringTokenizer = new StringTokenizer(curLine, delimiter);
        fieldCount = stringTokenizer.countTokens();
        if (fieldCount > 0) {
          csvFields = new String[fieldCount];
          int i = 0;
          while (stringTokenizer.hasMoreElements())
            csvFields[i++] = String.valueOf(stringTokenizer.nextElement());
        }
      }

      // At this point the coulmns are known, now read data by lines
      while ((curLine = csvReader.readLine()) != null) {
        stringTokenizer = new StringTokenizer(curLine, delimiter);
        fieldCount = stringTokenizer.countTokens();
        if (fieldCount > 0) {
          Element rowElement = newDoc.createElement("row");
          int i = 0;
          while (stringTokenizer.hasMoreElements()) {
            try {
              String curValue = String.valueOf(stringTokenizer.nextElement());
              Element curElement = newDoc.createElement(csvFields[i++]);
              curElement.appendChild(newDoc.createTextNode(curValue));
              rowElement.appendChild(curElement);
            } catch (Exception exp) {
            }
          }
          rootElement.appendChild(rowElement);
          rowsCount++;
        }
      }
      csvReader.close();

      // Save the document to the disk file
      TransformerFactory tranFactory = TransformerFactory.newInstance();
      Transformer aTransformer = tranFactory.newTransformer();
      Source src = new DOMSource(newDoc);
      Result result = new StreamResult(new File(xmlFileName));
      aTransformer.transform(src, result);
      rowsCount++;

      // Output to console for testing
      // Resultt result = new StreamResult(System.out);

    } catch (IOException exp) {
      System.err.println(exp.toString());
    } catch (Exception exp) {
      System.err.println(exp.toString());
    }
    return rowsCount;
    // "XLM Document has been created" + rowsCount;
  }
}

当对上述数据执行这段代码时,它会产生:

<?xml version="1.0" encoding="UTF-8"?>
<XMLCreators>
<row>
<Name>chi</Name>
<Age>23</Age>
</row>
<row>
<Name>kay</Name>
<Age>19</Age>
<sex>male</sex>
</row>
<row>
<Name>john</Name>
<Age>male</Age>
</row>
</XMLCreators>

我自己将其排列成这种形式,但输出产生了很长的一行。要产生的输出应该是:

<?xml version="1.0" encoding="UTF-8"?>
<XMLCreators>
<row>
<Name>chi</Name>
<Age>23</Age>
<sex></sex>
</row>
<row>
<Name>kay</Name>
<Age>19</Age>
<sex>male</sex>
</row>
<row>
<Name>john</Name>
<Age></Age>
<sex>male</sex>
 </row>
 </XMLCreators>

最佳答案

我同意 Kennet 的观点。

我只是加了

aTransformer .setOutputProperty(OutputKeys.INDENT, "yes");
aTransformer .setOutputProperty(OutputKeys.METHOD, "xml");
aTransformer .setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

这在元素之间添加了一个新行并允许缩进。

已更新

让我们从您呈现的文件不是 CSV(逗号分隔值)文件这一事实开始,我会让您担心这个问题...

List<String> headers = new ArrayList<String>(5);

File file = new File("Names2.csv");
BufferedReader reader = null;

try {

    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder domBuilder = domFactory.newDocumentBuilder();

    Document newDoc = domBuilder.newDocument();
    // Root element
    Element rootElement = newDoc.createElement("XMLCreators");
    newDoc.appendChild(rootElement);

    reader = new BufferedReader(new FileReader(file));
    int line = 0;

    String text = null;
    while ((text = reader.readLine()) != null) {

        StringTokenizer st = new StringTokenizer(text, " ", false);    
        String[] rowValues = new String[st.countTokens()];
        int index = 0;
        while (st.hasMoreTokens()) {
            
            String next = st.nextToken();
            rowValues[index++] = next;
            
        }
    
        //String[] rowValues = text.split(",");

        if (line == 0) { // Header row
            for (String col : rowValues) {
                headers.add(col);
            }
        } else { // Data row
            Element rowElement = newDoc.createElement("row");
            rootElement.appendChild(rowElement);
            for (int col = 0; col < headers.size(); col++) {
                String header = headers.get(col);
                String value = null;

                if (col < rowValues.length) {
                    value = rowValues[col];
                } else {
                    // ?? Default value
                    value = "";
                }

                Element curElement = newDoc.createElement(header);
                curElement.appendChild(newDoc.createTextNode(value));
                rowElement.appendChild(curElement);
            }
        }
        line++;
    }

    ByteArrayOutputStream baos = null;
    OutputStreamWriter osw = null;

    try {
        baos = new ByteArrayOutputStream();
        osw = new OutputStreamWriter(baos);

        TransformerFactory tranFactory = TransformerFactory.newInstance();
        Transformer aTransformer = tranFactory.newTransformer();
        aTransformer.setOutputProperty(OutputKeys.INDENT, "yes");
        aTransformer.setOutputProperty(OutputKeys.METHOD, "xml");
        aTransformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        Source src = new DOMSource(newDoc);
        Result result = new StreamResult(osw);
        aTransformer.transform(src, result);

        osw.flush();
        System.out.println(new String(baos.toByteArray()));
    } catch (Exception exp) {
        exp.printStackTrace();
    } finally {
        try {
            osw.close();
        } catch (Exception e) {
        }
        try {
            baos.close();
        } catch (Exception e) {
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}

现在我在这里使用了 List 而不是 Map。您需要决定如何最好地解决缺失值问题。如果事先不知道文件的结构,这将不是一个简单的解决方案。

无论如何,我最终得到了

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<XMLCreators>
    <row>
        <Name>chi</Name>
        <Age>23</Age>
        <Sex/>
    </row>
    <row>
        <Name>kay</Name>
        <Age>19</Age>
        <Sex>male</Sex>
    </row>
    <row>
        <Name>John</Name>
        <Age>male</Age>
        <Sex/>
    </row>
</XMLCreators>

合并更新

public class XMLCreators {
    // Protected Properties

    protected DocumentBuilderFactory domFactory = null;
    protected DocumentBuilder domBuilder = null;

    public XMLCreators() {
        try {
            domFactory = DocumentBuilderFactory.newInstance();
            domBuilder = domFactory.newDocumentBuilder();
        } catch (FactoryConfigurationError exp) {
            System.err.println(exp.toString());
        } catch (ParserConfigurationException exp) {
            System.err.println(exp.toString());
        } catch (Exception exp) {
            System.err.println(exp.toString());
        }

    }

    public int convertFile(String csvFileName, String xmlFileName,
                    String delimiter) {

        int rowsCount = -1;
        try {
            Document newDoc = domBuilder.newDocument();
            // Root element
            Element rootElement = newDoc.createElement("XMLCreators");
            newDoc.appendChild(rootElement);
            // Read csv file
            BufferedReader csvReader;
            csvReader = new BufferedReader(new FileReader(csvFileName));

//                int fieldCount = 0;
//                String[] csvFields = null;
//                StringTokenizer stringTokenizer = null;
//
//                // Assumes the first line in CSV file is column/field names
//                // The column names are used to name the elements in the XML file,
//                // avoid the use of Space or other characters not suitable for XML element
//                // naming
//
//                String curLine = csvReader.readLine();
//                if (curLine != null) {
//                    // how about other form of csv files?
//                    stringTokenizer = new StringTokenizer(curLine, delimiter);
//                    fieldCount = stringTokenizer.countTokens();
//                    if (fieldCount > 0) {
//                        csvFields = new String[fieldCount];
//                        int i = 0;
//                        while (stringTokenizer.hasMoreElements()) {
//                            csvFields[i++] = String.valueOf(stringTokenizer.nextElement());
//                        }
//                    }
//                }
//
//                // At this point the coulmns are known, now read data by lines
//                while ((curLine = csvReader.readLine()) != null) {
//                    stringTokenizer = new StringTokenizer(curLine, delimiter);
//                    fieldCount = stringTokenizer.countTokens();
//                    if (fieldCount > 0) {
//                        Element rowElement = newDoc.createElement("row");
//                        int i = 0;
//                        while (stringTokenizer.hasMoreElements()) {
//                            try {
//                                String curValue = String.valueOf(stringTokenizer.nextElement());
//                                Element curElement = newDoc.createElement(csvFields[i++]);
//                                curElement.appendChild(newDoc.createTextNode(curValue));
//                                rowElement.appendChild(curElement);
//                            } catch (Exception exp) {
//                            }
//                        }
//                        rootElement.appendChild(rowElement);
//                        rowsCount++;
//                    }
//                }
//                csvReader.close();
//
//                // Save the document to the disk file
//                TransformerFactory tranFactory = TransformerFactory.newInstance();
//                Transformer aTransformer = tranFactory.newTransformer();
//                Source src = new DOMSource(newDoc);
//                Result result = new StreamResult(new File(xmlFileName));
//                aTransformer.transform(src, result);
//                rowsCount++;
            int line = 0;
            List<String> headers = new ArrayList<String>(5);

            String text = null;
            while ((text = csvReader.readLine()) != null) {

                StringTokenizer st = new StringTokenizer(text, delimiter, false);
                String[] rowValues = new String[st.countTokens()];
                int index = 0;
                while (st.hasMoreTokens()) {

                    String next = st.nextToken();
                    rowValues[index++] = next;

                }

                if (line == 0) { // Header row

                    for (String col : rowValues) {
                        headers.add(col);
                    }

                } else { // Data row
                    
                    rowsCount++;

                    Element rowElement = newDoc.createElement("row");
                    rootElement.appendChild(rowElement);
                    for (int col = 0; col < headers.size(); col++) {

                        String header = headers.get(col);
                        String value = null;

                        if (col < rowValues.length) {

                            value = rowValues[col];

                        } else {
                            // ?? Default value
                            value = "";
                        }

                        Element curElement = newDoc.createElement(header);
                        curElement.appendChild(newDoc.createTextNode(value));
                        rowElement.appendChild(curElement);

                    }

                }
                line++;

            }

            ByteArrayOutputStream baos = null;
            OutputStreamWriter osw = null;

            try {

                baos = new ByteArrayOutputStream();
                osw = new OutputStreamWriter(baos);

                TransformerFactory tranFactory = TransformerFactory.newInstance();
                Transformer aTransformer = tranFactory.newTransformer();
                aTransformer.setOutputProperty(OutputKeys.INDENT, "yes");
                aTransformer.setOutputProperty(OutputKeys.METHOD, "xml");
                aTransformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

                Source src = new DOMSource(newDoc);
                Result result = new StreamResult(osw);
                aTransformer.transform(src, result);

                osw.flush();
                System.out.println(new String(baos.toByteArray()));

            } catch (Exception exp) {
                exp.printStackTrace();
            } finally {
                try {
                    osw.close();
                } catch (Exception e) {
                }
                try {
                    baos.close();
                } catch (Exception e) {
                }
            }

            // Output to console for testing
            // Resultt result = new StreamResult(System.out);

        } catch (IOException exp) {
            System.err.println(exp.toString());
        } catch (Exception exp) {
            System.err.println(exp.toString());
        }
        return rowsCount;
        // "XLM Document has been created" + rowsCount;
    }
}

使用 OpenCSV 更新

public class XMLCreators {
    // Protected Properties

    protected DocumentBuilderFactory domFactory = null;
    protected DocumentBuilder domBuilder = null;

    public XMLCreators() {
        try {
            domFactory = DocumentBuilderFactory.newInstance();
            domBuilder = domFactory.newDocumentBuilder();
        } catch (FactoryConfigurationError exp) {
            System.err.println(exp.toString());
        } catch (ParserConfigurationException exp) {
            System.err.println(exp.toString());
        } catch (Exception exp) {
            System.err.println(exp.toString());
        }

    }

    public int convertFile(String csvFileName, String xmlFileName,
                    String delimiter) {

        int rowsCount = -1;
        BufferedReader csvReader;
        try {
            Document newDoc = domBuilder.newDocument();
            // Root element
            Element rootElement = newDoc.createElement("XMLCreators");
            newDoc.appendChild(rootElement);
            // Read csv file
            csvReader = new BufferedReader(new FileReader(csvFileName));

            //** Now using the OpenCSV **//
            CSVReader reader = new CSVReader(new FileReader("names.csv"), delimiter.charAt(0));
            //CSVReader reader = new CSVReader(csvReader);
            String[] nextLine;
            int line = 0;
            List<String> headers = new ArrayList<String>(5);
            while ((nextLine = reader.readNext()) != null) {

                if (line == 0) { // Header row
                    for (String col : nextLine) {
                        headers.add(col);
                    }
                } else { // Data row
                    Element rowElement = newDoc.createElement("row");
                    rootElement.appendChild(rowElement);

                    int col = 0;
                    for (String value : nextLine) {
                        String header = headers.get(col);

                        Element curElement = newDoc.createElement(header);
                        curElement.appendChild(newDoc.createTextNode(value.trim()));
                        rowElement.appendChild(curElement);

                        col++;
                    }
                }
                line++;
            }
            //** End of CSV parsing**//

            FileWriter writer = null;
            
            try {
            
                writer = new FileWriter(new File(xmlFileName));

                TransformerFactory tranFactory = TransformerFactory.newInstance();
                Transformer aTransformer = tranFactory.newTransformer();
                aTransformer.setOutputProperty(OutputKeys.INDENT, "yes");
                aTransformer.setOutputProperty(OutputKeys.METHOD, "xml");
                aTransformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

                Source src = new DOMSource(newDoc);
                Result result = new StreamResult(writer);
                aTransformer.transform(src, result);
                
                writer.flush();

            } catch (Exception exp) {
                exp.printStackTrace();
            } finally {
                try {
                    writer.close();
                } catch (Exception e) {
                }
            }

            // Output to console for testing
            // Resultt result = new StreamResult(System.out);

        } catch (IOException exp) {
            System.err.println(exp.toString());
        } catch (Exception exp) {
            System.err.println(exp.toString());
        }
        return rowsCount;
        // "XLM Document has been created" + rowsCount;
    }
}

下次更新 (2022)

  • 添加了对 OpenCSV 5 的支持。*
  • 替换标题中的空格(至少我能想到的能符合正则表达式的空格)

因此,例如,使用类似...

Reports Device Name IP Address  Interface Name  Interface   Description Time    Traffic Utilization Out Traffic bps 
Device1 1.1.1.1 5-Apr   Mon May 09 23:00:00 UTC 2022    0   0   0
Device2 1.1.1.1 5-Apr   Mon May 09 23:00:00 UTC 2022    0   0   0

它将生成类似于...的输出

<XMLCreators>
    <row>
        <Reports>1</Reports>
        <Device_Name>2</Device_Name>
        <IP_Address>3</IP_Address>
        <Interface_Name>4</Interface_Name>
        <Interface>5</Interface>
        <Description>6</Description>
        <Time>7</Time>
        <Traffic>8</Traffic>
        <Utilization>9</Utilization>
        <Out_Traffic>10</Out_Traffic>
        <bps_>11</bps_>
    </row>
    <row>
        <Reports>Device1</Reports>
        <Device_Name>1.1.1.1</Device_Name>
        <IP_Address>5-Apr</IP_Address>
        <Interface_Name>Mon May 09 23:00:00 UTC 2022</Interface_Name>
        <Interface>0</Interface>
        <Description>0</Description>
        <Time>0</Time>
    </row>
    <row>
        <Reports>Device2</Reports>
        <Device_Name>1.1.1.1</Device_Name>
        <IP_Address>5-Apr</IP_Address>
        <Interface_Name>Mon May 09 23:00:00 UTC 2022</Interface_Name>
        <Interface>0</Interface>
        <Description>0</Description>
        <Time>0</Time>
    </row>
</XMLCreators>

可运行的例子

import com.opencsv.CSVParser;
import com.opencsv.CSVParserBuilder;
import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class Main {
    public static void main(String[] args) {
        new Main();
    }

    public Main() {
        new XMLCreators().convertFile("Test.csv", "Test.xml", '\t');
    }

    public class XMLCreators {
        // Protected Properties

        protected DocumentBuilderFactory domFactory = null;
        protected DocumentBuilder domBuilder = null;

        public XMLCreators() {
            try {
                domFactory = DocumentBuilderFactory.newInstance();
                domBuilder = domFactory.newDocumentBuilder();
            } catch (FactoryConfigurationError exp) {
                System.err.println(exp.toString());
            } catch (ParserConfigurationException exp) {
                System.err.println(exp.toString());
            } catch (Exception exp) {
                System.err.println(exp.toString());
            }
        }

        public int convertFile(String csvFileName, String xmlFileName, char delimiter) {

            int rowsCount = -1;
            BufferedReader csvReader;
            try {
                Document newDoc = domBuilder.newDocument();
                // Root element
                Element rootElement = newDoc.createElement("XMLCreators");
                newDoc.appendChild(rootElement);
                // Read csv file
                csvReader = new BufferedReader(new FileReader(csvFileName));

                //** Now using the OpenCSV **//
                CSVParser parser = new CSVParserBuilder()
                        .withSeparator(delimiter)
                        .build();

                CSVReader reader = new CSVReaderBuilder(new FileReader(csvFileName))
                        .withCSVParser(parser)
                        .build();
                //CSVReader reader = new CSVReader(csvReader);
                String[] nextLine;
                int line = 0;
                List<String> headers = new ArrayList<String>(5);
                while ((nextLine = reader.readNext()) != null) {
                    if (line == 0) { // Header row
                        for (String col : nextLine) {
                            headers.add(col);
                        }
                    } else { // Data row
                        Element rowElement = newDoc.createElement("row");
                        rootElement.appendChild(rowElement);

                        int col = 0;
                        for (String value : nextLine) {
                            String header = headers.get(col).replaceAll("[\\t\\p{Zs}\\u0020]", "_");

                            Element curElement = newDoc.createElement(header);
                            curElement.appendChild(newDoc.createTextNode(value.trim()));
                            rowElement.appendChild(curElement);

                            col++;
                        }
                    }
                    line++;
                }
                //** End of CSV parsing**//

                FileWriter writer = null;

                try {

                    writer = new FileWriter(new File(xmlFileName));

                    TransformerFactory tranFactory = TransformerFactory.newInstance();
                    Transformer aTransformer = tranFactory.newTransformer();
                    aTransformer.setOutputProperty(OutputKeys.INDENT, "yes");
                    aTransformer.setOutputProperty(OutputKeys.METHOD, "xml");
                    aTransformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

                    Source src = new DOMSource(newDoc);
                    Result result = new StreamResult(writer);
                    aTransformer.transform(src, result);

                    writer.flush();

                } catch (Exception exp) {
                    exp.printStackTrace();
                } finally {
                    try {
                        writer.close();
                    } catch (Exception e) {
                    }
                }

                // Output to console for testing
                // Resultt result = new StreamResult(System.out);
            } catch (IOException exp) {
                exp.printStackTrace();
            } catch (Exception exp) {
                exp.printStackTrace();
            }
            return rowsCount;
            // "XLM Document has been created" + rowsCount;
        }
    }
}

关于java - 使用 JAVA 将 CSV 转换为 XML,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/12120055/

相关文章:

ruby - 将 CSV 导入 Postgresql,其中包含不重复行的重复值

python - 使用 petl 将字符串转换为元组

java - 建立 AWS 连接时 Android 中出现 ClassDefNotFound 错误

基于文本和基于控制台的表单的 Java 框架?

java - 在非 Activity 课上使用 Volley 和 Klaxon

javascript - 如何定位嵌套 XML 数据以更新 firebase

Python CSV 问题

java - 如何模拟在 G-Suite 域中运行的 AppEngine java 应用程序的用户?

html - 在 <span> 标签中嵌入 XSL 代码

java - Spring 3 : Inject Default Bean Unless Another Bean Present