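python - How to find the line number of a line with a specific word within a fixed number of lines?

With the code below I can only count how many lines contain the word FBC, i.e. how many FBC entries there are. What I want instead is to find the line number of a line with a specific word within a fixed number of lines.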

    def lcount(keyword, fname):
        """Count the lines in fname that contain keyword."""
        with open(fname, 'r') as fin:
            return sum(1 for line in fin if keyword in line)

    F = lcount('FBC', 'BLK100-199C1-J-1000-K-10.txt')
    print(F)

Below is the data I want to read from the text file:

-----------------------------------------------------------------------------
`PagesPerBlock= 64                      
Block = 100                     
Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=0,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=2,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=3,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=4,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=11,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=20,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=32,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=45,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=54,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=71,

PagesPerBlock= 64                       
Block = 101                     
Read time= 690, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=0,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=0,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=3,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=7,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=11,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=15,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=24

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=34,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=42,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=50,`

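First, I want to read the first 10 lines that contain the word FBC. Among those, I want to find the line number of the first line with a non-zero FBC. Then I want to repeat the process for the next 10 lines that contain the word FBC.

For the data and query given above, the answer should be 2, 3: line 2 holds the first non-zero FBC among the first 10 FBC lines, and line 3 holds the first non-zero FBC among the last 10 FBC lines.

Unfortunately, I don't know how to do this in Python. Please help me solve this.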

Best answer

It is hard to understand what the question is. Simplify your input, for example:

header1
header2
some_data_n   FBC=0,
some_data_b   FBC=1,
some_data_v   FBC=2,
some_data_c   FBC=0,
some_data_x   FBC=3,

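Then write down what output you want to get.

[Edit:] So you should read all the lines, keep only the lines that contain an FBC entry, and then find the index of the first line that contains FBC but not FBC=0.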

    def line_index(keyword, fname):
        with open(fname, 'r') as fin:
            # Read all lines from the file
            lines = fin.readlines()

        # Keep only the lines that contain the keyword
        lines = [ln for ln in lines if keyword in ln]
        # Flag the lines where the keyword has the value 0
        zero_value_str = "%s=%d" % (keyword, 0)
        presence = [zero_value_str in ln for ln in lines]

        # Position of the first line without a zero-valued FBC
        # (list.index raises ValueError if no such line exists)
        index1 = presence.index(False)
        # That element is no longer needed, so flip it to skip it next time
        presence[index1] = True
        # Now search for the second one
        index2 = presence.index(False)

        # Number the lines from 1 rather than 0, so increment both
        return index1 + 1, index2 + 1

    F = line_index('FBC', 'BLK100-199C1-J-1000-K-10.txt')
    print(F)

You can easily operate on the list of bools to find further indexes.
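As a minimal sketch of that idea, the positions of every non-zero entry can be collected in one pass with `enumerate`, instead of flipping entries and calling `index` repeatedly. The helper name `nonzero_positions` and the sample list are illustrative assumptions, not part of the original answer.

```python
def nonzero_positions(presence):
    """Return 1-based positions of every False entry in a list of bools.

    presence[i] is True when line i holds a zero value (as in the
    answer's list), so False marks a non-zero line.
    """
    return [pos for pos, is_zero in enumerate(presence, start=1) if not is_zero]


# Hypothetical flags for five FBC lines: zero, non-zero, non-zero, zero, non-zero
presence = [True, False, False, True, False]
print(nonzero_positions(presence))  # [2, 3, 5]
```

Unlike `list.index`, this returns an empty list rather than raising `ValueError` when no non-zero line exists.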

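Note that `list.index` returns positions counted from 0, so the second line has index 1; the function adds 1 to each result so that the returned numbers start at 1.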
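The question asks for the first non-zero FBC within each consecutive group of 10 FBC lines, whereas `line_index` returns the first two non-zero lines overall. A sketch of a per-group version follows. It assumes the FBC value is always an integer written as `FBC=<digits>`; the names `first_nonzero_per_group` and `group_size` are hypothetical, and the value is parsed with a regular expression rather than matched as the substring `FBC=0`.

```python
import re


def first_nonzero_per_group(lines, keyword="FBC", group_size=10):
    """For each consecutive group of `group_size` keyword lines, return the
    1-based position (within the group) of the first non-zero value,
    or None when every value in the group is zero."""
    pattern = re.compile(r"%s=(\d+)" % re.escape(keyword))
    # Keep only the integer values from lines that carry the keyword
    values = [int(m.group(1)) for m in map(pattern.search, lines) if m]

    result = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        position = next((i for i, v in enumerate(group, start=1) if v != 0), None)
        result.append(position)
    return result


# FBC values taken from the two blocks shown in the question
sample = ["Page=6400   FBC=%d," % v for v in (0, 2, 3, 4, 11, 20, 32, 45, 54, 71)]
sample += ["Page=6464   FBC=%d," % v for v in (0, 0, 3, 7, 11, 15, 24, 34, 42, 50)]
print(first_nonzero_per_group(sample))  # [2, 3]
```

Because the function accepts any iterable of strings, an open file handle can be passed directly, for example `with open(fname) as fin: first_nonzero_per_group(fin)`. Header lines and blank lines carry no `FBC=` entry and are skipped.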

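Regarding "python - How to find the line number of a line with a specific word within a fixed number of lines?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/47972665/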
