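I'm new to Python. I want to count how many times Python keywords (['def', 'in', 'if', ...]) occur in the code below.
However, keywords found inside any string constant in the code need to be ignored.
How can I count keyword occurrences without counting the ones inside strings?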
def grade(result):
    '''
    if if (<--- example to test if the word "if" will be ignored in the counts)
    :param result: none
    :return:none
    '''

    if result >= 80:
        grade = "HD"
    elif 70 <= result:
        grade = "DI"
    elif 60 <= result:
        grade = "CR"
    elif 50 <= result:
        grade = "PA"
    else:
        #else (ignore this word)
        grade = "NN"
    return grade

result = float(raw_input("Enter a final result: "))
while result < 0 or result > 100:
    print "Invalid result. Result must be between 0 and 100."
    result = float(raw_input("Re-enter final result: "))
print "The corresponding grade is", grade(result)
Best answer
Use the tokenize, keyword, and collections modules.
tokenize.generate_tokens(readline)
The generate_tokens() generator requires one argument, readline, which must be a callable object which provides the same interface as the readline() method of built-in file objects (see section File Objects). Each call to the function should return one line of input as a string. Alternately, readline may be a callable object that signals completion by raising StopIteration.
The generator produces 5-tuples with these members: the token type; the token string; a 2-tuple (srow, scol) of ints specifying the row and column where the token begins in the source; a 2-tuple (erow, ecol) of ints specifying the row and column where the token ends in the source; and the line on which the token was found. The line passed (the last tuple item) is the logical line; continuation lines are included.
New in version 2.2.
import tokenize
with open('source.py') as f:
    print list(tokenize.generate_tokens(f.readline))
Partial output:
[(1, 'def', (1, 0), (1, 3), 'def grade(result):\n'),
(1, 'grade', (1, 4), (1, 9), 'def grade(result):\n'),
(51, '(', (1, 9), (1, 10), 'def grade(result):\n'),
(1, 'result', (1, 10), (1, 16), 'def grade(result):\n'),
(51, ')', (1, 16), (1, 17), 'def grade(result):\n'),
(51, ':', (1, 17), (1, 18), 'def grade(result):\n'),
(4, '\n', (1, 18), (1, 19), 'def grade(result):\n'),
(5, '    ', (2, 0), (2, 4), "    '''\n"),
(3,
'\'\'\'\n    if if (<--- example to test if the word "if" will be ignored in the counts)\n    :param result: none\n    :return:none\n    \'\'\'',
(2, 4),
(6, 7),
'    \'\'\'\n    if if (<--- example to test if the word "if" will be ignored in the counts)\n    :param result: none\n    :return:none\n    \'\'\'\n'),
(4, '\n', (6, 7), (6, 8), "    '''\n"),
(54, '\n', (7, 0), (7, 1), '\n'),
(1, 'if', (8, 4), (8, 6), '    if result >= 80:\n'),
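The numeric token types in this output (1, 3, 4, 5, 51, 54) are easier to read as names, and the numbers can differ between Python versions. A minimal sketch, assuming Python 3 and a small made-up snippet in place of source.py (SOURCE and named_tokens are hypothetical names, not part of the original answer), that maps each type to its name with tokenize.tok_name:

```python
import io
import tokenize

# Hypothetical sample source, standing in for the contents of source.py.
SOURCE = 'def grade(result):\n    """if if"""\n    return result  # else\n'


def named_tokens(source):
    """Return (type name, token string) pairs for the given source text."""
    # generate_tokens() wants a readline callable that yields str lines.
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok[0]], tok[1])
            for tok in tokenize.generate_tokens(readline)]


for name, text in named_tokens(SOURCE):
    print(name, repr(text))
```

The docstring shows up as a single STRING token and the comment as a single COMMENT token, which is why the keywords inside them are never seen as separate NAME tokens.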
You can retrieve the list of keywords from the keyword module:
import keyword
print keyword.kwlist
print keyword.iskeyword('def')
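As a rough illustration of how keyword.iskeyword separates keywords from ordinary identifiers, here is a small sketch written for Python 3 (the words list and flags dictionary are made up for this example). Note that the keyword set depends on the interpreter version: print is a keyword in Python 2 but an ordinary name in Python 3.

```python
import keyword

# Hypothetical mix of keywords and identifiers taken from the question's code.
words = ["def", "if", "elif", "while", "grade", "result"]

# Map each word to whether the running interpreter treats it as a keyword.
flags = {word: keyword.iskeyword(word) for word in words}

for word in words:
    print(word, flags[word])
```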
An integrated solution using collections.Counter:
import tokenize
import keyword
import collections
with open('source.py') as f:
    # tokens is lazy generator
    tokens = (token for _, token, _, _, _ in tokenize.generate_tokens(f.readline))
    c = collections.Counter(token for token in tokens if keyword.iskeyword(token))
print c # Counter({'elif': 3, 'print': 2, 'return': 1, 'else': 1, 'while': 1, 'or': 1, 'def': 1, 'if': 1})
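The code above targets Python 2. For readers on Python 3, the same idea might look like the sketch below. It is not the original answer's code: count_keywords and SAMPLE are hypothetical names, the input is a string rather than a file, and an explicit check for NAME tokens is added so that only identifier-like tokens are tested against the keyword list.

```python
import collections
import io
import keyword
import tokenize


def count_keywords(source):
    """Count Python keywords in source text, skipping strings and comments."""
    readline = io.StringIO(source).readline
    return collections.Counter(
        tok.string
        for tok in tokenize.generate_tokens(readline)
        # Keywords are tokenized as NAME; strings and comments have other types.
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string)
    )


# Hypothetical sample, a shortened variant of the question's function.
SAMPLE = (
    "def grade(result):\n"
    "    '''if if (ignored: inside a docstring)'''\n"
    "    if result >= 80:\n"
    "        return 'HD'\n"
    "    else:\n"
    "        # else (ignored: inside a comment)\n"
    "        return 'NN'\n"
)

counts = count_keywords(SAMPLE)
print(counts)
```

Because print is not a keyword in Python 3, running this kind of count on the question's code under Python 3 would not report print, unlike the Python 2 output shown above.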
This answer comes from a similar question on Stack Overflow, "python - How to count occurrences of keywords in code while ignoring those in comments/docstrings?": https://stackoverflow.com/questions/30232478/