So far I have tried the following, but it does not print the result I want. Thanks in advance.
$ sed -n -e "/\(*\)/g" c_comments | sed -n '/\/\*/p; /^ \*/p' c_comments |sed -n '/[[:blank:]]/p' c_comments
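Here is the text file c_comments; I want to extract the C-style comments and the C++ comments from it.
// File containing examples of various C and C++ style comments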
/* simple C style comment on one line with no code */
x = 5*3; /* Example comment following code */
/* comments do not have to begin at the line beginning */
/* And you can have empty comments like the next one */
/**/
/* comments with code following the comment (not possible with C++ style)
*/ x = w * u/y/z;
// As shown below you can have what appear to comments embedded in a string
// The line below should be counted as code
printf(" This output string looks like a /* comment */ doesn't it?\n");
/* ---- Example of a multiline
C style comment */
c++; // C ++ style comment following code
c=a/b*c; /* comment between two pieces of code */ w = a*b/e;
/* This is a multiline c style comment. This
comment covers several
lines.
------*/
a = b / c * d;
/* -----------End of the file ---------------*/
Best answer
The following will handle your particular file.
#! /bin/sed -f
# using ':' for regex delimiter, as we are matching a lot of
# literal '/' in the following expressions
# remove quoted strings
# TODO: allow quotes in comments and detect multiline quoted strings
s:["][^"]*["]::g
# detect '/* ... */'
\:/\*.*\*/: {
# handle leading '//'
\://.*/\*: {
s:.*//://:
p;d
}
s:.*/\*:/*:
s:\*/.*:*/:
p;d
}
# detect '/* ... \n ... */'
# TODO: fix the '// ... /*' case
\:/\*:,\:\*/: {
s:.*/\*:/*:
s:\*/.*:*/:
p;d
}
# detect //
\://: {
s:.*//://:
p;d
}
d
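The above is more of a non-example than an example: it shows a few things that are genuinely hard to do in sed (note the TODOs in particular).
So in general, extracting C comments with a single sed script is not a good choice in my view. Getting it fully correct is very hard, and the result quickly turns into rather obscure code.
Here is an alternative that decorates the C source with sed, filters it with awk (taking the multi-level C syntax rules into account), and then removes the decoration with sed again: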
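To make the difficulty concrete, here is a minimal sketch (not part of the original answer) of what goes wrong with line-oriented matching. The sample lines and variable names are made up for illustration. It shows that a plain pattern match also fires on a string literal, and that the "remove quoted strings first" workaround from the script above damages comments that contain quotes.

```shell
#!/bin/sh
# Two input lines: one real comment, one string literal that only looks like one.
input='x = 5*3; /* real comment */
printf("looks like a /* comment */\n");'

# Naive approach: print every line containing "/* ... */" and count the hits.
naive_count=$(printf '%s\n' "$input" | sed -n '\:/\*.*\*/:p' | wc -l | tr -d ' ')

# Workaround used in the script above: delete quoted strings before matching.
stripped_count=$(printf '%s\n' "$input" |
  sed -n -e 's:"[^"]*"::g' -e '\:/\*.*\*/:p' | wc -l | tr -d ' ')

# The workaround has its own cost: quotes inside a comment are deleted too.
mangled=$(printf '%s\n' '/* nor "is" this. */' | sed 's:"[^"]*"::g')

echo "naive matches:    $naive_count"     # 2, the printf line is a false positive
echo "stripped matches: $stripped_count"  # 1
echo "mangled comment:  $mangled"         # the quoted word "is" has been removed
```

Each fix of this kind tends to introduce a new corner case, which is why the decorate/filter/cleanup pipeline below keeps explicit state instead.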
c_decorate
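#! /bin/sed -f
# drop carriage returns already in the input; \r is reserved as the tag marker
s:\r::g
# tag the start of a C comment (C) and the position just past its end (E)
s:/\*:\rC/*:g
s:\*/:*/\rE:g
# tag C++ line comments (L) and double quotes (Q)
s://:\rL//:g
s:":\rQ":g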
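To see what the decoration step produces, the following sketch runs the same substitutions on a single made-up line and replaces each carriage return with `|` so the tags become visible. It assumes GNU sed, which understands `\r` in both patterns and replacements. The sample line and the `decorated` variable are illustrative only.

```shell
#!/bin/sh
# Assumes GNU sed (for \r in patterns and replacements).
line='c=a/b*c; /* note */ w = "s"; // tail'

# Same substitutions as c_decorate, then show every carriage return as '|'.
decorated=$(printf '%s\n' "$line" |
  sed -e 's:\r::g' -e 's:/\*:\rC/*:g' -e 's:\*/:*/\rE:g' \
      -e 's://:\rL//:g' -e 's:":\rQ":g' |
  tr '\r' '|')

echo "$decorated"
# c=a/b*c; |C/* note */|E w = |Q"s|Q"; |L// tail
```

Every chunk between two markers starts with a one-letter tag (C, E, L or Q), which is what the awk filter dispatches on.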
c_filter
#! /usr/bin/awk -f
BEGIN {
RS = ORS = "\r"
lc=0 # State variable for continuing a C++ style Line comment
cc=0 # State variable for continuing a C style comment
qt=0 # quote-count
}
NR == 1 { print ""; next }
/^C/ { # Begin C-Style Comment
if (qt % 2)
next
if (lc) {
if ($0 ~ /\n/) {
lc = 0
sub(/\n.*/, "\n")
}
} else {
cc = 1
}
print
next
}
/^E/ { # End C-Style Comment
if (qt % 2)
next
if (lc) {
if ($0 ~ /\n/) {
lc = 0
sub(/\n.*/, "\n")
}
print
} else if (cc) {
cc = 0
if ($0 ~ /\n/)
print "\n"
else
print "E"
}
next
}
/^L/ { # Begin C++ Style Line Comment
if (qt % 2)
next
if (!cc) {
lc = 0
if ($0 ~ /\n/)
sub(/\n.*/, "\n")
else
lc = 1
}
print
next
}
/^Q/ { # Quote
if (lc || cc)
print
else
qt++
next
}
c_cleanup
#! /bin/sed -f
$ {
/^$/ d
}
s:\r[CELQ]\?::g
Then invoke it like this:
$ c_decorate c_comments | c_filter | c_cleanup
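Awk lends itself to this kind of filtering more naturally than sed, because it natively supports changing the record separator, and arbitrary logical relationships are easier to specify and reason about.
To strip the comment delimiters as well, here is an alternative version of c_decorate: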
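The record-separator point can be seen in isolation with a small sketch. The input string here is a hand-written stand-in for decorated output, not real pipeline data. With `RS` set to a carriage return, each tagged chunk arrives as its own record, and its first character is the tag.

```shell
#!/bin/sh
# Hand-written stand-in for a decorated stream: chunks separated by \r.
records=$(printf 'code \rC/* a */\rE more \rL// b\n' |
  awk 'BEGIN { RS = "\r" } { print NR ": tag=" substr($0, 1, 1) }')

echo "$records"
# 1: tag=c
# 2: tag=C
# 3: tag=E
# 4: tag=L
```

Record 1 is the untagged text that precedes the first marker, which is why c_filter handles `NR == 1` separately before looking at any tags.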
#! /bin/sed -f
s:\r::g
s:/\*:\rC:g
s:\*/:\rE:g
s://:\rL:g
s:":\rQ":g
Update 9/2019 (@Russ): This does not seem to handle "quotes" inside comments well, nor C-style comment openers embedded in C++ single-line comments, such as
//* this is not handled well.
/* nor "is" this. */
So I use this for c_decorate instead:
#! /bin/sed -f
s:\r::g
# trouble is //* first matches //, then matches /*
# keep the character in front of /* (captured as \1) instead of consuming it
s:\([^/]\)/\*:\1\rC/*:g
s:^/\*:\rC/*:g
s:\*/:*/\rE:g
s://\+:\rL&:g
# does not handle quotes w/in comments
# s:":\rQ":g
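The effect of the revised rules on the `//*` case can be checked with a short sketch, again assuming GNU sed and showing carriage returns as `|`. The variable names are illustrative. The `[^/]` rule is written here with a capture group so that the character in front of `/*` is kept, which is an adjustment made for this sketch.

```shell
#!/bin/sh
# Assumes GNU sed (for \r and \+).
line='//* this is not handled well.'

# Original decoration: /* is tagged first, so the leading // is never seen.
old=$(printf '%s\n' "$line" |
  sed -e 's:/\*:\rC/*:g' -e 's:\*/:*/\rE:g' -e 's://:\rL//:g' |
  tr '\r' '|')

# Revised decoration: /* only counts when it is not preceded by a slash.
new=$(printf '%s\n' "$line" |
  sed -e 's:\([^/]\)/\*:\1\rC/*:g' -e 's:^/\*:\rC/*:g' \
      -e 's:\*/:*/\rE:g' -e 's://\+:\rL&:g' |
  tr '\r' '|')

echo "$old"   # /|C/* this is not handled well.
echo "$new"   # |L//* this is not handled well.
```

With the original rules the line is tagged as the start of a C comment; with the revised rules it is tagged as a line comment, as intended.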
Regarding "linux - Displaying C-style comments and C++ comments with sed", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49521304/