javascript - Regex selection not working

Tags: javascript, regex

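I have a script that automatically formats teleprompter scripts. It is supposed to uppercase everything (with certain exceptions). However, it should also leave anything inside angle brackets, square brackets, or parentheses untouched.

Here is the code I created: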

<script>
String.prototype.smartUpperCase = function(){
    var pattern = /(.*?[a-z][A-Z])(.*)/g;
    if(pattern.test(this)){
        return this.replace(pattern,function(t,a,b){
            return a+b.toUpperCase();
        });
    }
    else{
        return this.toUpperCase();
    }
}
String.prototype.regexEscape = function(){ return this.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&"); }
String.prototype.removeBrackets = function(){ return this.replace(/[\<\>\[\]\(\)]/g, ""); }
String.prototype.format = function(returnValNoShow){
    text = this;
    orig = text; // for use in multi-line regex pattern
    text = text.replace(/(\w+)/g,function(t,w){ return w.smartUpperCase(); }); // smart uppercase everything
    text = text.replace(/\d{1,2}[st|nd|rd|th]{2}/gi, function(m){ return m.toLowerCase(); } ); // for dates (1st, 2nd, etc. will be lowercase)
    // complicated regex -> find anything inside <>, [], () and inject the original string back in
    var pattern = /.*(?=[^\<]*\>|[^\[]*\]|[^\(]*\)).*/g;
    text = text.replace( pattern, function(match){
        console.log(match);
        if(match==""){ return ""; }
        var pattern2 = new RegExp(".*(?="+match.regexEscape()+").*", "gi");
        //console.log(orig.match(pattern2));
        return orig.match(pattern2)[0];
    });

    text = text.replace(/\&/g, "AND"); // switch & for and

    text = text.replace(/ +/g, " "); // replace multiple spaces with one
    text = text.replace(/\n{3,}/g, "\n\n"); // replace 3+ line breaks with two
    text = text.replace(/\}\n{2,}/g, "}\n"); // don't allow empty line after name
    text = text.replace(/\n{2,}-+\n{2,}/g, "\n---\n"); // don't allow blank line between break (---)

    text = text.replace(/\n /g, "\n").replace(/ \n/g, "\n"); // trim() each line

    text = text.trim(); // trim whitespace on ends
    return text;
}
function f() {
    document.getElementById("in").value = document.getElementById("in").value.format();
}
</script>

And the HTML is simple enough:

<textarea id="in" rows="40" cols="80">{NAME}
THANKS ____ AND ____. AS WE REPORTED LAST MONDAY, BATMAN VS SUPERMAN: DAWN OF JUSTICE CAME OUT THIS PAST WEEKEND AND IT SET SOME BOX OFFICE RECORDS.

{NAME}
(DDR) That's right ____. 'Batman v Superman' took huge $170 million at the box office. Audiences flocked to see the pairing of Batman (Ben Affleck) versus Superman (Henry Cavill) in the DC Comics film, which also introduced Wonder Woman (Gal Gadot).

{NAME}
IT'S THE BIGGEST MARCH OPENING WEEKEND EVER, EVEN BEATING 2012'S THE HUNGER GAMES' WHO BROUGHT IN $152.5 MILLION.

{NAME}
IN OTHER NEWS - SYRACUSE IS THE FIRST 10 SEED TO MAKE IT TO THE FINAL FOUR.

(ad lib)
</textarea>
<br/>
<input type="button" onclick="f()" value="Format"/>

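This works as expected 99% of the time. However, as the second paragraph shows, it sometimes does nothing.

(The text in the textarea has already been run through the formatter.)

Best answer

The first problem is your "find content in brackets" regex: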

var pattern = /.*(?=[^\<]*\>|[^\[]*\]|[^\(]*\)).*/g; //wrong

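This matches the entire string (more precisely, the entire line, since . does not match newlines): the relevant part of the pattern sits inside a zero-width lookahead assertion, which acts only as a boolean yes/no test. You need to match those sequences actively, in a consuming pattern, and also remove the .* so that it does not eat the rest of the string. Only then can they be replaced correctly: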

var pattern = /(\([^\(]*\)|\{[^\{]*\}|\[[^\[]*\])/g;
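
To see the difference, here is a minimal sketch comparing the two patterns on a single line. The sample line is made up for illustration (loosely based on the question's second paragraph), and the variable names are not from the original code.

```javascript
// Hypothetical sample line, shortened from the question's second paragraph.
const line = "(DDR) THAT'S RIGHT. BATMAN (Ben Affleck) VERSUS SUPERMAN (Henry Cavill).";

// Original pattern: the lookahead consumes nothing, so the two .* take the whole line.
const lookaheadPattern = /.*(?=[^\<]*\>|[^\[]*\]|[^\(]*\)).*/g;
const lookaheadMatches = line.match(lookaheadPattern).filter(function (m) { return m !== ""; });

// Pattern from the answer: consumes only the bracketed groups themselves.
const consumingPattern = /(\([^\(]*\)|\{[^\{]*\}|\[[^\[]*\])/g;
const consumingMatches = line.match(consumingPattern);

console.log(lookaheadMatches); // one element: the whole line
console.log(consumingMatches); // the three parenthesized groups
```

The first log prints a single element, the entire line, which is why the replacement callback restored the whole line from the original text. The second prints only the three parenthesized groups. Note that this pattern covers (), {} and []; if angle brackets also need protecting, as the question mentions, an extra alternative along the lines of <[^<]*> would presumably be needed.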

You run into the same problem again when you build the replacement pattern that is matched against the original text:

var pattern2 = new RegExp(".*(?="+match.regexEscape()+").*", "gi"); //wrong

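This again uses a lookahead for the match, but it is surrounded by .* wildcard sequences, so if there is a match, it will be the whole string. Change it to: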

var pattern2 = new RegExp(match.regexEscape(), "gi")

Now, when you do the replacement, it works the way you want it to. The original answer linked to a demo showing your code working as intended.
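
The two fixes together can be sketched as a standalone function. This is only an illustration under some assumptions: restoreBracketed and the free-standing regexEscape are hypothetical helpers (the question's code extends String.prototype instead), and the fallback for a missing match is an addition that is not in the original code.

```javascript
// Escape regex metacharacters so a literal string can be embedded in a RegExp.
// Same character class as the question's String.prototype.regexEscape.
function regexEscape(s) {
    return s.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}

// Hypothetical helper: put the original-case bracketed text back
// into the already uppercased text.
function restoreBracketed(upper, orig) {
    // Consuming pattern from the answer: matches only the bracketed groups.
    const pattern = /(\([^\(]*\)|\{[^\{]*\}|\[[^\[]*\])/g;
    return upper.replace(pattern, function (match) {
        // Look the group up in the original text, ignoring case.
        const pattern2 = new RegExp(regexEscape(match), "gi");
        const found = orig.match(pattern2);
        // Assumption: keep the uppercased group if nothing is found.
        return found ? found[0] : match;
    });
}

const orig = "Batman (Ben Affleck) versus Superman (Henry Cavill)";
const restored = restoreBracketed(orig.toUpperCase(), orig);
console.log(restored); // BATMAN (Ben Affleck) VERSUS SUPERMAN (Henry Cavill)
```

One caveat: the lookup returns the first case-insensitive occurrence in the original text, so if the same bracketed text appears more than once with different capitalization, every occurrence gets the casing of the first one.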

Regarding javascript - Regex selection not working, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36287489/
