c# - Parsing conditional logic strings with regular expressions

Tags: c# regex string parsing string-parsing

I have to tokenize a conditional string expression:

Arithmetic operators: +, -, *, /, %

Boolean operators: &&, ||

Conditional operators: ==, >=, >, <, <=, !=

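An example expression: (x+3>5*y)&&(z>=3 || k!=x)

What I want is to tokenize this string into operators and operands.

Because ">" and ">=", as well as "=" and "!=", share characters, I'm having trouble with the tokenization.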

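PS1: I don't want to do complex lexical analysis, just simple parsing, with a regex if possible.

PS2: In other words, I'm looking for a regex that takes the sample expression

(x+3>5*y)&&(z>=3 || k!=x)

and produces each token separated by a space, like this:

( x + 3 > 5 * y ) && ( z >= 3 || k != x )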

Best answer

Not a regex, but a basic tokenizer that may well do the job (note that you don't need the string.Join; you can consume the IEnumerable<string> directly with foreach):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
static class Program
{
    static void Main()
    {
        // Tokenize the sample expression and rejoin the tokens with single spaces.
        string recombined = string.Join(" ", Tokenize("(x+3>5*y)&&(z>=3 || k!=x)"));
        Console.WriteLine(recombined);
        // output: ( x + 3 > 5 * y ) && ( z >= 3 || k != x )
    }
    public static IEnumerable<string> Tokenize(string input)
    {
        var buffer = new StringBuilder();
        foreach (char c in input)
        {
            if (char.IsWhiteSpace(c))
            {
                if (buffer.Length > 0)
                {
                    yield return Flush(buffer);
                }
                continue; // just skip whitespace
            }

            if (IsOperatorChar(c))
            {
                if (buffer.Length > 0)
                {
                    // we have back-buffer; could be a>b, but could be >=
                    // need to check if there is a combined operator candidate
                    if (!CanCombine(buffer, c))
                    {
                        yield return Flush(buffer);
                    }
                }
                buffer.Append(c);
                continue;
            }

            // so here, the new character is *not* an operator; if we have
            // a back-buffer that *is* operators, yield that
            if (buffer.Length > 0 && IsOperatorChar(buffer[0]))
            {
                yield return Flush(buffer);
            }

            // append
            buffer.Append(c);
        }
        // out of chars... anything left?
        if (buffer.Length != 0)
            yield return Flush(buffer);
    }
    static string Flush(StringBuilder buffer)
    {
        string s = buffer.ToString();
        buffer.Clear();
        return s;
    }
    static readonly string[] operators = { "+", "-", "*", "/", "%", "=", "&&", "||", "==", ">=", ">", "<", "<=", "!=", "(",")" };
    static readonly char[] opChars = operators.SelectMany(x => x.ToCharArray()).Distinct().ToArray();

    static bool IsOperatorChar(char newChar)
    {
        return Array.IndexOf(opChars, newChar) >= 0;
    }
    static bool CanCombine(StringBuilder buffer, char c)
    {
        foreach (var op in operators)
        {
            // only operators longer than the buffer can still be extended by c
            if (op.Length <= buffer.Length) continue;
            // check that op starts with the buffer contents, followed by c
            bool startsWith = true;
            for (int i = 0; i < buffer.Length; i++)
            {
                if (op[i] != buffer[i])
                {
                    startsWith = false;
                    break;
                }
            }
            if (startsWith && op[buffer.Length] == c) return true;
        }
        return false;
    }

}
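
Since the question asked for a regex, here is a minimal sketch of that route for comparison. It is not part of the answer above. It assumes operands are either identifiers (letters, digits, underscore) or plain integers, which the question does not specify, and the names RegexTokenizer and TokenPattern are made up for illustration. .NET tries alternatives from left to right, so the two-character operators are listed before the single-character ones; that ordering is what keeps ">=" from being split into ">" and "=".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class RegexTokenizer
{
    // Two-character operators come first so that ">=" is tried before ">".
    // Operand shapes (identifier or integer) are an assumption, not from the question.
    static readonly Regex TokenPattern = new Regex(
        @"&&|\|\||==|!=|>=|<=|[-+*/%<>=()]|[A-Za-z_]\w*|\d+");

    public static IEnumerable<string> Tokenize(string input)
    {
        // Matches scans left to right; characters that match nothing
        // (such as whitespace) are simply skipped.
        return TokenPattern.Matches(input).Cast<Match>().Select(m => m.Value);
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", Tokenize("(x+3>5*y)&&(z>=3 || k!=x)")));
        // prints: ( x + 3 > 5 * y ) && ( z >= 3 || k != x )
    }
}
```

One trade-off to be aware of: this version silently drops any character the pattern does not cover (a lone "!" or "&", for example), whereas the hand-written tokenizer above emits such characters as tokens.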

This is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/17543071/
