Python regex for extracting Java comments

I am parsing Java source code with Python and need to extract the comment text from the source. Here is what I have tried so far.

Take 1:

cmts = re.findall(r'/\*\*(.|[\r\n])*?\*/',lines)

Returns only blanks: [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']

Take 2 (added grouping parentheses around the whole regex):

cmts = re.findall(r'(/\*\*(.|[\r\n])*?\*/)',lines)

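Single-line comment (example only):

('/**\n\n * Initialise the tag with the colour and the tag name\n\n */', ' ')

Multi-line comment (example only):

('/**\n\n * Get the color related to a specified tag\n\n * @param tag the tag that we want to get the colour for\n\n * @return color of the tag in String\n\n */', ' ')

I am only interested in the text itself: "Initialise the tag with the colour and the tag name", or "Get the color related to a specified tag, @param tag the tag that we want to get the colour for, @return color of the tag in String". I cannot work out how to get it. Any pointers would be appreciated!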

Best answer

To extract the comments (everything between /** and */), you can use:

re.findall(r'/\*\*(.*?)\*/', text, re.S)

(Note how the capturing group becomes simpler with re.S/re.DOTALL, because the dot then matches newlines as well.)
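The blanks in Take 1 follow from how re.findall treats groups: when the pattern contains a capturing group, it returns the group's text rather than the whole match, and a repeated group keeps only its last repetition (here, the space just before */). A minimal sketch of the difference, using a made-up java_source string that is not taken from the question's real file:

```python
import re

# Hypothetical sample input, used only for illustration.
java_source = """
/**
 * Initialise the tag
 */
class Tag {}
"""

# Repeated capturing group: findall reports only the last repetition,
# which is the single space in front of the closing */.
last_repetition = re.findall(r'/\*\*(.|[\r\n])*?\*/', java_source)

# One group around the whole body, with re.S so that '.' also matches newlines.
bodies = re.findall(r'/\*\*(.*?)\*/', java_source, re.S)

print(last_repetition)
print(bodies)
```

If the group is needed only for alternation, writing it as a non-capturing group, (?:...), would also make findall return the whole match.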

Then, for each match, you can collapse the runs of spaces and asterisks and replace the newlines with commas:

import re

def comments(text):
    for comment in re.findall(r'/\*\*(.*?)\*/', text, re.S):
        # Collapse runs of spaces/asterisks, trim, then join the lines with commas.
        yield re.sub('\n+', ',', re.sub(r'[ *]+', ' ', comment).strip())

For example:

>>> list(comments('/**\n\n     * Get the color related to a specified tag\n\n     * @param tag the tag that we want to get the colour for\n\n     * @return color of the tag in String\n\n     */'))
['Get the color related to a specified tag, @param tag the tag that we want to get the colour for, @return color of the tag in String']
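One caveat with the [ *]+ substitution is that it also removes asterisks that are part of the comment text itself. If that matters, a possible variation is to strip only the decorative asterisk at the start of each line and keep the lines as a list. This is a sketch, not part of the original answer; the name javadoc_lines and the sample string are hypothetical.

```python
import re

def javadoc_lines(text):
    """Yield, for each /** ... */ block, a list of its non-empty lines."""
    for body in re.findall(r'/\*\*(.*?)\*/', text, re.S):
        lines = []
        for line in body.splitlines():
            # Remove only the leading whitespace and decorative asterisks.
            cleaned = re.sub(r'^\s*\*+\s?', '', line).strip()
            if cleaned:
                lines.append(cleaned)
        yield lines

# Hypothetical sample input, used only for illustration.
sample = '/**\n * Get the color\n * @param tag the tag\n */'
result = list(javadoc_lines(sample))
print(result)
```

The lines can still be joined afterwards, for instance with ', '.join(lines), to get the comma-separated form shown above.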

For more on a Python regex for extracting Java comments, see the similar question on Stack Overflow: https://stackoverflow.com/questions/47481588/
