java - How do I mark a position while reading a file in Java?

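I wrote a parser in Java that extracts several features from a text file. The idea is to collect the block of lines that belongs to each header.

For example, if I have this: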

CC   -!- FUNCTION: Adapter protein implicated in the regulation of a large
CC       spectrum of both general and specialized signaling pathways. Binds ...

I need to get this:

Function :  Adapter protein implicated in the regulation of a large spectrum of both general and specialized signaling pathways. Binds ....

I can do this without any problem for every feature in this type of text file.

The problem shows up when I run into this:

CC   -!- FUNCTION: Adapter protein implicated in the regulation of a large
CC       spectrum of both general and specialized signaling pathway ...
CC   -!- SUBUNIT: Homodimer. Interacts with SAMSN1 and PRKCE (By
CC       similarity). Interacts with SSH1 and TORC2/CRTC2. Interacts ..

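When I finish the "FUNCTION" block, my parser always skips one line at the end and leaves the loop, so I can no longer get the line with "SUBUNIT" :(

Here is a sample of the file I need to parse: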

    CC   -!- FUNCTION: Adapter protein implicated in the regulation of a large
    CC       spectrum of both general and specialized signaling pathways. Binds...
    CC   -!- SUBUNIT: Homodimer. Interacts with SAMSN1 and PRKCE (By
    CC       similarity). Interacts with SSH1 and TORC2/CRTC2. Interacts with ...
    CC   -!- SUBUNIT: Homodimer. Interacts with SAMSN1 and PRKCE salut(By
    CC       similarity). Interacts with SSH1 and TORC2/CRTC2. salutInteracts with
    CC   -!- INTERACTION:
    CC       Q76353:- (xeno); NbExp=3; IntAct=EBI-359815, EBI-6248077;
    CC       Q9P0K1-3:ADAM22; NbExp=2; IntAct=EBI-359815, EBI-1567267; ...
    CC   -!- SUBCELLULAR LOCATION: Cytoplasm. Melanosome. Note=Identified by
    CC       mass spectrometry in melanosome fractions from stage I to stage
    CC       IV. ....

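Here is part of what I wrote. I tried to mark the current position in the file while reading it, but the parsing does not work properly when I do that. What am I missing here?

Cheers, any help would be much appreciated :)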

            // Function
        if (line.startsWith("CC   -!- FUNCTION")) {
            String data[] = line.split("CC   -!- FUNCTION:");
            function = function + data[1];
            while ((line = bReader.readLine()) != null && ( (line.startsWith("CC       ")) || (line.startsWith("CC   -!- FUNCTION")) ) ) {
                if (line.startsWith("CC       ")) {
                    String dataOther[] = line.split("CC      ");
                    function = function + dataOther[1];
                    prot.setFunction(function);
                    bReader.mark(size);

                }

                else if (line.startsWith("CC   -!- FUNCTION")) {
                    String dataOther[] = line.split("CC   -!- FUNCTION:");
                    function = function + "-!-"+ dataOther[1];
                    prot.setFunction(function);
                    bReader.mark(size);

                }
            }



            bReader.reset();
        }   


        // Subunit
        if (line.startsWith("CC   -!- SUBUNIT")) {
            String data[] = line.split("CC   -!- SUBUNIT:");
            subunit = subunit  + "-|-"+ data[1];
            while ((line = bReader.readLine()) != null && ( (line.startsWith("CC       "))  ) ) {
                if (line.startsWith("CC       ")) {
                    String dataOther[] = line.split("CC      ");
                    subunit  = subunit  + dataOther[1];
                    prot.setSubunit(subunit);


                }


            }

            //bReader.reset();
        }

Best answer

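.mark() and .reset() are more advanced techniques for re-reading data from a buffered stream. I don't think you need them in your case; you just need to restructure how you read the file. Your code calls bReader.readLine() in several places, and every call reads one line from the buffer and consumes it, so a line read in one place is no longer available to a later readLine(). Usually you want to call .readLine() once per loop iteration and then process that line.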

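    // Needs java.io.BufferedReader and java.io.FileReader; `file` is the input path.
    // try-with-resources closes the reader even if an exception is thrown.
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        String line;
        // readLine() is called in exactly one place, so no line is ever skipped.
        while ((line = br.readLine()) != null) {
            if (line.startsWith("CC   -!- FUNCTION")) {
                // start a new FUNCTION block from this header line
            } else if (line.startsWith("CC   -!- SUBUNIT")) {
                // start a new SUBUNIT block from this header line
            } else if (line.startsWith("CC       ")) {
                // continuation line: append it to whichever block is currently open
            }
        }
    }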
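The read-once loop above can be fleshed out into a small state machine. What follows is only a sketch under assumptions: the class name `CommentBlockParser`, the `parse` and `flush` methods, and the choice to return a map from topic name to block texts are all made up for illustration and do not come from the question. It assumes every header line looks like `CC   -!- TOPIC: text` and every continuation line starts with `CC` followed by seven spaces, as in the sample file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; not part of the original question or answer.
class CommentBlockParser {
    private static final String HEADER = "CC   -!- ";
    private static final String CONTINUATION = "CC       ";

    /** Returns topic name -> texts of all blocks with that topic, in file order. */
    static Map<String, List<String>> parse(BufferedReader reader) throws IOException {
        Map<String, List<String>> blocks = new LinkedHashMap<>();
        String topic = null;        // topic of the block currently being built
        StringBuilder text = null;  // text of the block currently being built
        String line;
        while ((line = reader.readLine()) != null) { // the only readLine() call
            if (line.startsWith(HEADER)) {
                flush(blocks, topic, text);          // close the previous block
                int colon = line.indexOf(':', HEADER.length());
                if (colon < 0) {                     // header without a topic: ignore it
                    topic = null;
                    text = null;
                    continue;
                }
                topic = line.substring(HEADER.length(), colon);
                text = new StringBuilder(line.substring(colon + 1).trim());
            } else if (line.startsWith(CONTINUATION) && text != null) {
                if (text.length() > 0) {
                    text.append(' ');                // join wrapped lines with one space
                }
                text.append(line.substring(CONTINUATION.length()).trim());
            } else {
                flush(blocks, topic, text);          // any other line ends the block
                topic = null;
                text = null;
            }
        }
        flush(blocks, topic, text);                  // close the last block at end of input
        return blocks;
    }

    private static void flush(Map<String, List<String>> blocks, String topic, StringBuilder text) {
        if (topic != null && text != null) {
            blocks.computeIfAbsent(topic, k -> new ArrayList<>()).add(text.toString());
        }
    }

    public static void main(String[] args) throws IOException {
        String sample = String.join("\n",
                "CC   -!- FUNCTION: Adapter protein implicated in the regulation of a large",
                "CC       spectrum of both general and specialized signaling pathways.",
                "CC   -!- SUBUNIT: Homodimer. Interacts with SAMSN1 and PRKCE (By",
                "CC       similarity).");
        parse(new BufferedReader(new StringReader(sample)))
                .forEach((topic, texts) -> System.out.println(topic + " : " + texts));
    }
}
```

Because the header line that ends one block is the same line that starts the next, nothing has to be un-read. The resulting lists could then be passed to setters such as `prot.setFunction(...)` and `prot.setSubunit(...)` from the question, joined with whatever separator is needed.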

Did I understand you correctly?
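If you would rather keep the mark()/reset() approach from the question, the mark has to be set before each line is read, so that reset() can hand the next header line back to the reader. The sketch below is a minimal illustration of that idea, not a drop-in fix: the class `MarkResetDemo`, the method `readContinuations`, and the read-ahead limit of 8192 characters are assumptions made for the example. The limit must be larger than the longest line in the file, otherwise reset() may fail with an IOException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical demo class; names and the read-ahead limit are assumptions.
class MarkResetDemo {
    private static final String CONTINUATION = "CC       ";
    // Assumed upper bound on the length of a single line, including its terminator.
    private static final int READ_AHEAD_LIMIT = 8192;

    /**
     * Reads the continuation lines that follow a header and joins them with spaces.
     * The first line that is not a continuation is left unread for the caller.
     */
    static String readContinuations(BufferedReader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        while (true) {
            reader.mark(READ_AHEAD_LIMIT);   // remember the position BEFORE reading the line
            String line = reader.readLine();
            if (line == null) {
                break;                       // end of input, nothing to un-read
            }
            if (!line.startsWith(CONTINUATION)) {
                reader.reset();              // un-read the line so the caller sees it next
                break;
            }
            if (text.length() > 0) {
                text.append(' ');
            }
            text.append(line.substring(CONTINUATION.length()).trim());
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader(String.join("\n",
                "CC       spectrum of both general and specialized signaling pathways.",
                "CC   -!- SUBUNIT: Homodimer.")));
        System.out.println(readContinuations(reader));
        System.out.println(reader.readLine()); // the SUBUNIT header is still available
    }
}
```

In the question's code the mark is set after a continuation line has been processed and reset() is called unconditionally, which is a different pattern from the peek-and-un-read shown here. The single-pass loop is usually simpler, since it never needs to go back.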

For "java - How do I mark a position while reading a file in Java?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22514695/
