c++ - How do I select a range of lines from a large text file?

Tags: c++ algorithm

I want to know how to select a range of lines from a text file. For example, I have a text file containing the following lines:

branch 27 : rect id 23400
rect:   -115.475609 -115.474907
    31.393650   31.411301
branch 28 : rect id 23398
rect:   -115.474907 -115.472282
    31.411301   31.417351
branch 29 : rect id 23396
rect:   -115.472282 -115.468033
    31.417351   31.427151
branch 30 : rect id 23394
rect:   -115.468033 -115.458733
    31.427151   31.438181
Non-Leaf Node:  level=1  count=31  address=53
branch 0 : rect id 42
rect:   -115.768539 -106.251556
    31.425039   31.717550
branch 1 : rect id 50
rect:   -109.559479 -106.009361
    31.296721   31.775299
branch 2 : rect id 51
rect:   -110.937401 -106.226143
    31.285870   31.771971
branch 3 : rect id 54
rect:   -109.584412 -106.069092
    31.285240   31.775230
branch 4 : rect id 56
rect:   -109.570961 -106.000954
    31.296721   31.780769
branch 5 : rect id 58
rect:   -115.806213 -106.366188
    31.400450   31.687519
branch 6 : rect id 59
rect:   -113.173859 -106.244057
    31.297440   31.627750
branch 7 : rect id 60
rect:   -115.811478 -106.278252
    31.400450   31.679470
branch 8 : rect id 61
rect:   -109.953888 -106.020111
    31.325319   31.775270
branch 9 : rect id 64
rect:   -113.070969 -106.015968
    31.331841   31.704750
branch 10 : rect id 68
rect:   -113.065689 -107.034576
    31.326300   31.770809
branch 11 : rect id 71
rect:   -112.333344 -106.059860
    31.284081   31.662920
branch 12 : rect id 73
rect:   -115.071083 -106.309677
    31.267879   31.466850
branch 13 : rect id 74
rect:   -116.094414 -106.286308
    31.236290   31.424770
branch 14 : rect id 75
rect:   -115.423264 -106.286308
    31.229691   31.415510
branch 15 : rect id 76
rect:   -116.111656 -106.313110
    31.259390   31.478300
branch 16 : rect id 77
rect:   -116.247467 -106.309677
    31.240231   31.451799
branch 17 : rect id 78
rect:   -116.170792 -106.094543
    31.156429   31.391781
branch 18 : rect id 79
rect:   -116.225723 -106.292709
    31.239960   31.442850
branch 19 : rect id 80
rect:   -116.268013 -105.769913
    31.157240   31.378111
branch 20 : rect id 82
rect:   -116.215424 -105.827202
    31.198441   31.383421
branch 21 : rect id 83
rect:   -116.095734 -105.826439
    31.197460   31.373819
branch 22 : rect id 84
rect:   -115.423264 -105.815018
    31.182640   31.368891
branch 23 : rect id 85
rect:   -116.221527 -105.776512
    31.160931   31.389830
branch 24 : rect id 86
rect:   -116.203369 -106.473831
    31.168350   31.367611
branch 25 : rect id 87
rect:   -115.727631 -106.501587
    31.189100   31.395941
branch 26 : rect id 88
rect:   -116.237289 -105.790756
    31.164780   31.358959
branch 27 : rect id 89
rect:   -115.791344 -105.990044
    31.072620   31.349529
branch 28 : rect id 90
rect:   -115.736847 -106.495079
    31.187969   31.376900
branch 29 : rect id 91
rect:   -115.721710 -106.000130
    31.160351   31.354601
branch 30 : rect id 92
rect:   -115.792236 -106.000793
    31.166620   31.378811
Leaf Node:  level=0  count=21  address=42
branch 0 : rect id 18312
rect:   -106.412270 -106.401367
    31.704750   31.717550
branch 1 : rect id 18288
rect:   -106.278252 -106.253387
    31.520321   31.548361

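I only want the lines between the "Non-Leaf Node: level=1" line and the "Leaf Node: level=0" line. The file contains many such sections, and I need all of them.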

Best answer

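The simplest approach is to read as much of the file as possible into memory and then scan for the start marker. Copy or process all the data that follows until you find the terminating token. Some platforms provide functions that map a file into memory for you, such as mmap(), although this is not part of standard C++.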
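As a rough illustration of the marker scan, here is a minimal sketch. It streams the input line by line with std::getline instead of loading the whole file at once, which is a simplification of the approach described above. The function names is_marker and extract_sections are hypothetical, and the sketch assumes the markers look exactly like the sample: a line beginning with "Non-Leaf Node:" that carries the word "level=1" opens a section, and a line beginning with "Leaf Node:" that carries "level=0" closes it.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// True if `line` begins with `prefix` and contains `token` as a whole
// whitespace-separated word (so "level=1" does not match "level=10").
bool is_marker(const std::string& line, const std::string& prefix,
               const std::string& token) {
    if (line.compare(0, prefix.size(), prefix) != 0) return false;
    std::istringstream words(line);
    std::string word;
    while (words >> word) {
        if (word == token) return true;
    }
    return false;
}

// Collects every run of lines that follows a "Non-Leaf Node: level=1"
// header, stopping at the next "Leaf Node: level=0" header.
// The marker lines themselves are not stored.
std::vector<std::vector<std::string>> extract_sections(std::istream& in) {
    std::vector<std::vector<std::string>> sections;
    bool inside = false;
    std::string line;
    while (std::getline(in, line)) {
        if (is_marker(line, "Non-Leaf Node:", "level=1")) {
            sections.emplace_back();   // start a new, empty section
            inside = true;
        } else if (is_marker(line, "Leaf Node:", "level=0")) {
            inside = false;            // terminating token reached
        } else if (inside) {
            sections.back().push_back(line);
        }
    }
    return sections;
}
```

To use it, open the file with std::ifstream and pass the stream to extract_sections; each element of the result holds the lines of one section. Other header lines that might appear inside a section (for example a different level) are treated as ordinary content in this sketch.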

If the file does not change, you can save the offsets of the marker lines.

If you really need to index by line number, create a std::map<line number, offset> variable. Read the file line by line, storing each line number and its offset as you go.

Source: a similar question on Stack Overflow about selecting lines from a large text file in C++: https://stackoverflow.com/questions/11283045/
