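java - Alternate rows are skipped when reading a .tsv file

Tags: java opencsv csv supercsv

I have a .tsv file with 39 columns. The last column holds string data that is more than 100,000 characters long. Line 1 of the file contains the header, and the data follows.

After reading line 1, the reader jumps to line 3, then line 5, then line 7, even though every line contains the same data. This is the log I get: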

lineNo=3, rowNo=2, customer=503837-100 , last but one cell length=111275
lineNo=5, rowNo=3, customer=503837-100 , last but one cell length=111275
lineNo=7, rowNo=4, customer=503837-100 , last but one cell length=111275
lineNo=9, rowNo=5, customer=503837-100 , last but one cell length=111275
lineNo=11, rowNo=6, customer=503837-100 , last but one cell length=111275
lineNo=13, rowNo=7, customer=503837-100 , last but one cell length=111275
lineNo=15, rowNo=8, customer=503837-100 , last but one cell length=111275
lineNo=17, rowNo=9, customer=503837-100 , last but one cell length=111275
lineNo=19, rowNo=10, customer=503837-100 , last but one cell length=111275

Here is my code:

import java.io.FileReader;
import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.io.ICsvBeanReader;
import org.supercsv.prefs.CsvPreference;

public class readWithCsvBeanReader {
    public static void main(String[] args) throws Exception{
        readWithCsvBeanReader();
    }


private static void readWithCsvBeanReader() throws Exception {

    ICsvBeanReader beanReader = null;

    try {

        // backslashes in a Java string literal must be escaped
        beanReader = new CsvBeanReader(new FileReader("C:\\MAP TSV\\abc.tsv"), CsvPreference.TAB_PREFERENCE);
        // the header elements are used to map the values to the bean (names must match)
        final String[] header = beanReader.getHeader(true);
        final CellProcessor[] processors = getProcessors();
        TSVReaderBrandDTO tsvReaderBrandDTO = new TSVReaderBrandDTO();

        int i = 0;
        int last = 0;

        while( (tsvReaderBrandDTO = beanReader.read(TSVReaderBrandDTO.class, header, processors)) != null ) {
            if(null == tsvReaderBrandDTO.getPage_cache()){
                last = 0;
            }
            else{
                last = tsvReaderBrandDTO.getPage_cache().length();
            }
            System.out.println(String.format("lineNo=%s, rowNo=%s, customer=%s , last but one cell length=%s", beanReader.getLineNumber(),
                beanReader.getRowNumber(), tsvReaderBrandDTO.getUnique_ID(), last));
            i++;
        }

        System.out.println("Number of rows : "+i);

    }
    finally {
        if( beanReader != null ) {
            beanReader.close();
        }
    }
}

private static CellProcessor[] getProcessors() {

    final CellProcessor[] processors = new CellProcessor[] { 
         new Optional(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
         new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
         new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
         new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
         new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
         new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
         new NotNull(), new NotNull(), new NotNull(), new NotNull(), new NotNull(),
         new NotNull(), new NotNull(), new NotNull(), new Optional()};

        return processors;
    }
}

Please tell me where I went wrong.

Best answer

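If you use a CSV parser to parse TSV input, you will run into trouble. Use a proper TSV parser. uniVocity-parsers comes with a TSV parser/writer. You can also use annotated Java beans to parse the file directly into instances of a class.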
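One plausible way a CSV-style reader ends up reporting lineNo=3, 5, 7 is quote handling: CsvPreference.TAB_PREFERENCE still treats '"' as a quote character, so a cell with an unbalanced double quote keeps the record open across the next line break. The sketch below is a simplified, hypothetical model of that behaviour using only the JDK. It is not Super CSV's actual implementation, and it assumes that the long last column contains an odd number of double quotes, which the question does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class QuoteMergeDemo {

    // Hypothetical, simplified model of a quote-aware reader: a record stays
    // open across physical lines for as long as a quoted section is unclosed.
    static List<String> readQuoteAware(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (String line : lines) {
            for (char c : line.toCharArray()) {
                if (c == '"') {
                    inQuotes = !inQuotes; // each quote opens or closes a section
                }
            }
            current.append(line);
            if (inQuotes) {
                current.append('\n');     // record continues on the next line
            } else {
                records.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            records.add(current.toString()); // unterminated trailing record
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample data: every data line holds one unbalanced quote.
        List<String> lines = Arrays.asList(
                "Unique_ID\tPage_cache",
                "503837-100\t<a title=\"x>",
                "503837-100\t<a title=\"y>",
                "503837-100\t<a title=\"z>",
                "503837-100\t<a title=\"w>");

        System.out.println("physical lines: " + lines.size());
        System.out.println("quote-aware records: " + readQuoteAware(lines).size());
    }
}
```

With this sample, every pair of data lines collapses into one record, which matches the pattern of a line number that advances by two for each row.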

Example:

This code parses the TSV into rows.

TsvParserSettings settings = new TsvParserSettings();

// creates a TSV parser
TsvParser parser = new TsvParser(settings);

// parses all rows in one go.
List<String[]> allRows = parser.parseAll(new FileReader(yourFile));
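For comparison, here is a dependency-free sketch of the same idea: treat every physical line as one record and split on tabs, giving quotes no special meaning. It assumes that fields never contain literal tab or newline characters and it does no unescaping, so it is only a rough stand-in for a real TSV parser. The class name PlainTsvReader and the sample data are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class PlainTsvReader {

    // Reads each physical line as exactly one record; '"' is ordinary data.
    static List<String[]> readAll(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // a limit of -1 keeps trailing empty columns
                rows.add(line.split("\t", -1));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: each data line holds one unbalanced quote.
        String sample = "Unique_ID\tPage_cache\n"
                + "503837-100\t<a title=\"x>\n"
                + "503837-100\t<a title=\"y>\n";

        List<String[]> rows = readAll(new StringReader(sample));
        for (int i = 1; i < rows.size(); i++) { // index 0 is the header
            String[] row = rows.get(i);
            System.out.println("lineNo=" + (i + 1)
                    + ", customer=" + row[0]
                    + ", last cell length=" + row[row.length - 1].length());
        }
    }
}
```

To read a file instead of a string, pass a java.io.FileReader to readAll.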

Use a BeanListProcessor to parse into Java beans:

BeanListProcessor<TestBean> rowProcessor = new BeanListProcessor<TestBean>(TestBean.class);

TsvParserSettings parserSettings = new TsvParserSettings();
parserSettings.setRowProcessor(rowProcessor);

TsvParser parser = new TsvParser(parserSettings);
parser.parse(new FileReader(yourFile));

// The BeanListProcessor provides a list of objects extracted from the input.
List<TestBean> beans = rowProcessor.getBeans();

The TestBean class looks like this:

import java.math.BigDecimal;
import com.univocity.parsers.annotations.*;

class TestBean {

    // if the value parsed in the quantity column is "?" or "-", it will be replaced by null.
    @NullString(nulls = { "?", "-" })
    // if a value resolves to null, it will be converted to the String "0".
    @Parsed(defaultNullRead = "0")
    private Integer quantity;

    // the value at column index 4 is trimmed and lower-cased before being assigned.
    @Trim
    @LowerCase
    @Parsed(index = 4)
    private String comments;

    // you can also explicitly give the name of a column in the file.
    @Parsed(field = "amount")
    private BigDecimal amount;

    @Trim
    @LowerCase
    // values "no", "n" and "null" will be converted to false; values "yes" and "y" will be converted to true
    @BooleanString(falseStrings = { "no", "n", "null" }, trueStrings = { "yes", "y" })
    @Parsed
    private Boolean pending;
}

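Disclosure: I am the author of this library. It is open source and free (Apache V2.0 license).

Regarding "java - Alternate rows are skipped when reading a .tsv file", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/21180552/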
