c# - Which regular expression can select Excel column names in a formula in C#?

Tags: c# regex excel

I need to implement Excel formula autofill in C#.

Suppose this formula is located at B100:

=SUM($B$99:B99)

I want the formula to become this at C100:

=SUM($B$99:C99)

This formula is only an example. Some real examples are:

=(SUM($B${0}:B{0})/SUM({1}!$B${0}:{1}!B{0}) -1)

=SUM(B{0}:B{1})

=B{0} + B{1}

=C{0}+ B{1}

=$B${0}+ AC{1}

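(Treat {0} and {1} as actual numbers.)

What I need to do, in general, is pick out these column names and "increment" them. Column names prefixed with $ in a formula should not be updated.

How can I identify these fields with a regular expression?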

Best answer

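Here is a regex solution that deals solely with the formulas. I'll leave the Excel side to you. Provided you have a collection of strings representing your formulas, you can run them through this to increment the column names.

Some comments: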

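  • **Test this thoroughly!** Perhaps build a sheet by hand and compare it with the generated results.
  • This shouldn't accidentally alter function names that fit the cell naming pattern. If you know your formulas use Excel function names that contain digits, keep an eye out for them and, again, **verify the results**.
  • The regex does not verify that what you feed it is a formula. I assume you are only passing formulas; in other words, I did not make it check that the string begins with an "=" sign. If you plan to feed non-formula cell values through it, add a formula.StartsWith("=") check in the if branch where IsMatch is used. To see what I mean, add an extra test string to my sample, such as "Check out T4 generation": without the StartsWith("=") check it will match, and T4 will become U4.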
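The last two caveats can be sketched in code. This is a minimal illustration rather than part of the original answer: `FormulaGuard`, `ShiftRight`, and `NextColumn` are hypothetical names, and the trailing `(?!\()` lookahead is an added assumption that a cell-like token immediately followed by `(` is a function call such as `LOG10(` and not a cell reference.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormulaGuard
{
    // Same idea as the answer's pattern, plus (?!\() so that tokens
    // like LOG10( are treated as function names and skipped (assumption).
    private static readonly Regex RxCell =
        new Regex(@"(?<![$])\b(?<col>[A-Z]+)(?<row>\d+)\b(?!\()");

    // Returns the input unchanged unless it starts with "=".
    public static string ShiftRight(string text)
    {
        if (!text.StartsWith("=", StringComparison.Ordinal))
            return text;

        return RxCell.Replace(
            text,
            m => NextColumn(m.Groups["col"].Value) + m.Groups["row"].Value);
    }

    // Adds one to an uppercase column name, carrying through trailing Z's.
    private static string NextColumn(string col)
    {
        char[] letters = col.ToCharArray();
        int i = letters.Length - 1;
        while (i >= 0 && letters[i] == 'Z')
        {
            letters[i] = 'A';   // Z rolls over, carry moves one letter left
            i--;
        }
        if (i < 0)
            return "A" + new string(letters);   // all Z's: the name grows by one letter

        letters[i] = (char)(letters[i] + 1);
        return new string(letters);
    }

    public static void Main()
    {
        Console.WriteLine(ShiftRight("Check out T4 generation")); // not a formula, left alone
        Console.WriteLine(ShiftRight("=LOG10(Z1) + T4"));         // only Z1 and T4 are shifted
    }
}
```

With these two guards the second call prints `=LOG10(AA1) + U4`. Without the lookahead, `LOG10` would itself match the cell pattern and be rewritten, which is exactly the kind of result worth checking by hand.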

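The regex pattern was actually the easy part. It simply matches any letters-then-digits sequence and ignores $A$1 and $A1 style cells. The tricky part was the logic to increment the column. I've added comments to clarify that bit, so grab a coffee and read through it :)

I'm sure this could be improved, but it's what I had time for.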
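One way to make the increment logic easier to reason about is to treat a column name as a number. The sketch below is an alternative illustration, not the answer's own method: `ColumnMath`, `ToNumber`, `ToName`, and `Shift` are hypothetical helper names, and it assumes uppercase column names. Column names form a bijective base-26 system (A = 1, Z = 26, AA = 27), so shifting a column is ordinary integer addition between two conversions.

```csharp
using System;
using System.Text;

public static class ColumnMath
{
    // "A" -> 1, "Z" -> 26, "AA" -> 27 (uppercase input assumed).
    public static int ToNumber(string col)
    {
        int n = 0;
        foreach (char ch in col)
            n = n * 26 + (ch - 'A' + 1);
        return n;
    }

    // Inverse of ToNumber for n >= 1.
    public static string ToName(int n)
    {
        var sb = new StringBuilder();
        while (n > 0)
        {
            n--;                                   // shift to 0-based so Z maps to remainder 25
            sb.Insert(0, (char)('A' + n % 26));
            n /= 26;
        }
        return sb.ToString();
    }

    // Shifts a column name by 'offset' columns; the result must stay >= column A.
    public static string Shift(string col, int offset)
    {
        return ToName(ToNumber(col) + offset);
    }

    public static void Main()
    {
        Console.WriteLine(Shift("Z", 1));    // AA
        Console.WriteLine(Shift("ABZ", 1));  // ACA
        Console.WriteLine(Shift("AZ", 3));   // BC
    }
}
```

A helper like `Shift` could be called from a `MatchEvaluator` in place of character-by-character carrying, and it also covers filling more than one column to the right in a single step.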

using System;
using System.Text.RegularExpressions;

static void Main(string[] args)
{
    string[] formulas = { "Z1", "ZZ1", "AZ1", "AZB1", "BZZ2",
                        "=SUM($B$99:B99)","=SUM($F99:F99)", "=(SUM($B$0:B0)/SUM(1!$B$11:22!B33) -1)",
                        "=SUM(X80:Z1)", "=A0 + B1 - C2 + Z5", "=C0+ B1",
                        "=$B$0+ AC1", "=AA12-ZZ34 + AZ1 - BZ2 - BX3 + BZX4",
                        "=SUMX2MY2(A2:A8,B2:B8)",   // ensure function SUMX2MY2 isn't mistakenly incremented
                        "=$B$40 + 50 - 20"          // no match
                        //,"Check out T4 generation!"  // not a formula but it'll still increment T4, use formula.StartsWith("=")
                        };

    // use this if you don't want to include regex comments
    //Regex rxCell = new Regex(@"(?<![$])\b(?<col>[A-Z]+)(?<row>\d+)\b");

    // regex comments in this style requires RegexOptions.IgnorePatternWhitespace
    string rxCellPattern = @"(?<![$])       # match if prefix is absent: $ symbol (prevents matching $A1 type of cells)
                                            # (if all you have is $A$1 type of references, and not $A1 types, this negative look-behind isn't needed)
                            \b              # word boundary (prevents matching Excel functions with a similar pattern to a cell)
                            (?<col>[A-Z]+)  # named capture group, match uppercase letter at least once
                                            # (change to [A-Za-z] if you have lowercase cells)
                            (?<row>\d+)     # named capture group, match a number at least once
                            \b              # word boundary
                            ";
    Regex rxCell = new Regex(rxCellPattern, RegexOptions.IgnorePatternWhitespace);

    foreach (string formula in formulas)
    {
        if (rxCell.IsMatch(formula))
        {
            Console.WriteLine("Formula: {0}", formula);
            foreach (Match cell in rxCell.Matches(formula))
                Console.WriteLine("Cell: {0}, Col: {1}", cell.Value, cell.Groups["col"].Value);

            // the magic happens here
            string newFormula = rxCell.Replace(formula, IncrementColumn);
            Console.WriteLine("Modified: {0}", newFormula);
        }
        else
        {
            Console.WriteLine("Not a match: {0}", formula);
        }
        Console.WriteLine();
    }
}


private static string IncrementColumn(Match m)
{
    char[] col = m.Groups["col"].Value.ToCharArray();
    int i = col.Length - 1;

    // work backwards from the last letter, like adding 1 with carry:
    // every trailing Z rolls over to A and the carry moves one letter to the left
    // (ie. AZ -> the Z becomes A, then the carry reaches the leading A)
    while (i >= 0 && col[i] == 'Z')
    {
        col[i] = 'A';
        i--;
    }

    string newCol;
    if (i < 0)
    {
        // every letter was a Z, so the name grows by one letter (ie. Z -> AA, ZZ -> AAA)
        newCol = "A" + new string(col);
    }
    else
    {
        // the first non-Z letter from the right absorbs the carry (ie. BZZ -> CAA);
        // letters to its left stay untouched (ie. ABZ -> ACA, not BCA)
        col[i] = (char)(col[i] + 1);
        newCol = new string(col);
    }

    // updated column and original row (from regex match)
    return newCol + m.Groups["row"].Value;
}

The results should look like this (I removed the cell breakdown for brevity):

Formula: Z1
Modified: AA1

Formula: ZZ1
Modified: AAA1

Formula: AZ1
Modified: BA1

Formula: AZB1
Modified: AZC1

Formula: BZZ2
Modified: CAA2

Formula: =SUM($B$99:B99)
Modified: =SUM($B$99:C99)

Formula: =SUM($F99:F99)
Modified: =SUM($F99:G99)

Formula: =(SUM($B$0:B0)/SUM(1!$B$11:22!B33) -1)
Modified: =(SUM($B$0:C0)/SUM(1!$B$11:22!C33) -1)

Formula: =SUM(X80:Z1)
Modified: =SUM(Y80:AA1)

Formula: =A0 + B1 - C2 + Z5
Modified: =B0 + C1 - D2 + AA5

Formula: =C0+ B1
Modified: =D0+ C1

Formula: =$B$0+ AC1
Modified: =$B$0+ AD1

Formula: =AA12-ZZ34 + AZ1 - BZ2 - BX3 + BZX4
Modified: =AB12-AAA34 + BA1 - CA2 - BY3 + BZY4

Formula: =SUMX2MY2(A2:A8,B2:B8)
Modified: =SUMX2MY2(B2:B8,C2:C8)

Not a match: =$B$40 + 50 - 20

Regarding "c# - Which regular expression can select Excel column names in a formula in C#?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/854827/
