javascript - Regular expression is capturing the whole string

Tags: javascript regex

I am using the following regular expression:

(public|private +)?function +([a-zA-Z_$][0-9a-zA-Z_$]*) *\\(([0-9a-zA-Z_$, ]*)\\) *{(.*)}

to match the following string:

public function messenger(text){
sendMsg(text);
}
private function sendMsg(text){
alert(text);
}

(There are no line breaks in the string; they are converted to spaces before the regex runs.)

I want it to capture both functions, but instead it is capturing:
$1: ""
$2: "messenger"
$3: "text"
$4: "sendMsg(text); } private function sendMsg(text){ alert(text); "

By the way, I am using JavaScript.
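
For reference, the behaviour described above can be reproduced with a short sketch. It assumes the input is the six lines joined by single spaces, and it writes the pattern as a regex literal with the braces escaped. The second pattern (`lazy`) is a hypothetical variant that is not part of the question: it makes the body group lazy and moves the trailing spaces out of the `public|private` alternation.

```javascript
// Input from the question, with the line breaks already converted to spaces.
const input = [
    "public function messenger(text){",
    "sendMsg(text);",
    "}",
    "private function sendMsg(text){",
    "alert(text);",
    "}"
].join(" ");

// The pattern from the question as a regex literal (braces escaped).
const greedy = /(public|private +)?function +([a-zA-Z_$][0-9a-zA-Z_$]*) *\(([0-9a-zA-Z_$, ]*)\) *\{(.*)\}/;

const m = input.match(greedy);
console.log(m[1]); // undefined: "public" is never directly followed by "function"
console.log(m[2]); // "messenger"
console.log(m[4]); // greedy (.*) runs on to the last "}" in the input

// Hypothetical variant: lazy body, access modifier grouped separately.
const lazy = /(?:(public|private) +)?function +([a-zA-Z_$][0-9a-zA-Z_$]*) *\(([0-9a-zA-Z_$, ]*)\) *\{(.*?)\}/g;
const names = Array.from(input.matchAll(lazy), (hit) => hit[1] + " " + hit[2]);
console.log(names); // [ 'public messenger', 'private sendMsg' ]
```

The lazy variant only works while function bodies contain no nested braces, because `(.*?)` stops at the first `}` it meets.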

Best answer

Since you accepted my (wrong) answer in another thread, I feel obliged to post a proper solution. It won't be quick or short, but I hope it helps.

Here is how I would write a regexp-based parser for a C-like language, if I had to.

<script>
/* 
Let's start with this simple utility function. It's a
kind of stubborn version of String.replace() - it
checks the string over and over again, until nothing
more can be replaced
*/

function replaceAll(str, regexp, repl) {
    str = str.toString();
    while(str.match(regexp))
        str = str.replace(regexp, repl);
    return str;
}

/*
Next, we need a function that removes specific
constructs from the text and replaces them with
special "markers", which are "invisible" for further
processing. The matches are collected in a buffer so
that they can be restored later.
*/

function isolate(type, str, regexp, buf) {
    return replaceAll(str, regexp, function($0) {
        buf.push($0);
        return "<<" + type + (buf.length - 1) + ">>";
    });
} 

/*
The following restores "isolated" strings from the
buffer:
*/

function restore(str, buf) {
    return replaceAll(str, /<<[a-z]+(\d+)>>/g, function($0, $1) {
        return buf[parseInt($1)];
    });
}

/*
Write down the grammar. Javascript regexps are
notoriously hard to read (there is no "comment"
option like in perl), therefore let's use more
readable format with spacing and substitution
variables. Note that "$string" and "$block" rules are
actually "isolate()" markers.
*/

var grammar = {
    $nothing: "",
    $space:  "\\s",
    $access: "public $space+ | private $space+ | $nothing",
    $ident:  "[a-z_]\\w*",
    $args:   "[^()]*",
    $string: "<<string [0-9]+>>",
    $block:  "<<block [0-9]+>>",
    $fun:    "($access) function $space* ($ident) $space* \\( ($args) \\) $space* ($block)"
}

/*
This compiles the grammar to pure regexps - one for
each grammar rule:
*/

function compile(grammar) {
    var re = {};
    for(var p in grammar)
        re[p] = new RegExp(
            replaceAll(grammar[p], /\$\w+/g, 
                    function($0) { return grammar[$0] }).
            replace(/\s+/g, ""), 
        "gi");
    return re;
}

/*
Let's put everything together
*/

function findFunctions(code, callback) {
    var buf = [];

    // isolate strings
    code = isolate("string", code, /"(\\.|[^\"])*"/g, buf);

    // isolate blocks in curly brackets {...}
    code = isolate("block",  code, /{[^{}]*}/g, buf);

    // compile our grammar
    var re = compile(grammar);

    // and perform an action for each function we can find
    code.replace(re.$fun, function() {
        var p = [];
        for(var i = 1; i < arguments.length; i++)
            p.push(restore(arguments[i], buf));
        return callback.apply(this, p)
    });
}
</script>
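
To make the isolate/restore idea concrete, here is a minimal sketch that can be run outside the browser. The three helpers are restated from the code above so the block is self-contained; the sample input `source` is an assumption chosen to contain a brace inside a string literal and one nested block.

```javascript
// Helpers restated from the answer so that this sketch runs on its own.
function replaceAll(str, regexp, repl) {
    str = String(str);
    while (str.match(regexp)) str = str.replace(regexp, repl);
    return str;
}

function isolate(type, str, regexp, buf) {
    return replaceAll(str, regexp, function (match) {
        buf.push(match);
        return "<<" + type + (buf.length - 1) + ">>";
    });
}

function restore(str, buf) {
    return replaceAll(str, /<<[a-z]+(\d+)>>/g, function (_, index) {
        return buf[parseInt(index, 10)];
    });
}

// Hypothetical sample: a "}" hidden inside a string, plus a nested block.
const source = 'function f(a) { if (a) { say("}"); } }';
const buf = [];

// Step 1: hide string literals so their braces cannot confuse step 2.
const noStrings = isolate("string", source, /"(\\.|[^"])*"/g, buf);
console.log(noStrings); // function f(a) { if (a) { say(<<string0>>); } }

// Step 2: collapse blocks from the innermost outwards, one level per pass.
const noBlocks = isolate("block", noStrings, /\{[^{}]*\}/g, buf);
console.log(noBlocks); // function f(a) <<block2>>

// Step 3: restore() expands markers repeatedly until none are left.
console.log(restore(noBlocks, buf) === source); // true
```

Because each pass only matches blocks without inner braces, nesting of any depth is reduced to a single marker, which is what lets a flat regular expression such as `$fun` match a whole function body.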

Now we are ready to test. Our parser must be able to handle escaped strings and arbitrarily nested blocks.

<code>
public function blah(arg1, arg2) {
    if("some string" == "public function") {
        callAnother("{hello}")
        while(something) {
            alert("escaped \" string");
        }
    }
}

function yetAnother() { alert("blah") }
</code>

<script>
window.onload = function() {
    var code = document.getElementsByTagName("code")[0].innerHTML;
    findFunctions(code, function(access, name, args, body) {
        document.write(
            "<br>" + 
            "<br> access= " + access +
            "<br> name= "   + name +
            "<br> args= "   + args +
            "<br> body= "   + body
        )
    });
}
</script> 
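
The same test can also be sketched without a browser, for example under Node.js. This version restates the code above in condensed form and departs from it in labelled ways: `findFunctions` returns an array of records instead of taking a callback, the access modifier is trimmed, and the sample is held in a `String.raw` template so that the `\"` escape survives. These choices are assumptions made for the illustration, not part of the original answer.

```javascript
function replaceAll(str, regexp, repl) {
    str = String(str);
    while (str.match(regexp)) str = str.replace(regexp, repl);
    return str;
}

function isolate(type, str, regexp, buf) {
    return replaceAll(str, regexp, function (match) {
        buf.push(match);
        return "<<" + type + (buf.length - 1) + ">>";
    });
}

function restore(str, buf) {
    return replaceAll(str, /<<[a-z]+(\d+)>>/g, function (_, index) {
        return buf[parseInt(index, 10)];
    });
}

const grammar = {
    $nothing: "",
    $space:   "\\s",
    $access:  "public $space+ | private $space+ | $nothing",
    $ident:   "[a-z_]\\w*",
    $args:    "[^()]*",
    $block:   "<<block [0-9]+>>",
    $fun:     "($access) function $space* ($ident) $space* \\( ($args) \\) $space* ($block)"
};

// Expand "$name" references until none remain, then drop the layout spaces.
function compile(grammar) {
    const re = {};
    for (const rule in grammar) {
        const expanded = replaceAll(grammar[rule], /\$\w+/g, function (name) {
            return grammar[name];
        });
        re[rule] = new RegExp(expanded.replace(/\s+/g, ""), "gi");
    }
    return re;
}

const re = compile(grammar);

// Assumption: collect records instead of invoking a callback per function.
function findFunctions(code) {
    const buf = [];
    code = isolate("string", code, /"(\\.|[^"])*"/g, buf);
    code = isolate("block", code, /\{[^{}]*\}/g, buf);
    const found = [];
    code.replace(re.$fun, function (whole, access, name, args, body) {
        found.push({ access: access.trim(), name: name, args: args, body: restore(body, buf) });
        return whole;
    });
    return found;
}

const sample = String.raw`public function blah(arg1, arg2) {
    if("some string" == "public function") {
        callAnother("{hello}")
        while(something) {
            alert("escaped \" string");
        }
    }
}

function yetAnother() { alert("blah") }`;

const found = findFunctions(sample);
console.log(found.map((f) => f.name)); // [ 'blah', 'yetAnother' ]
console.log(found[1].body);            // { alert("blah") }
```

Note that the text "public function" inside the string literal is not reported as a function, because strings are replaced by markers before `$fun` is applied.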

Regarding "javascript - Regular expression is capturing the whole string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/2588994/
