c# - What is the fastest, case-insensitive way to see if a string contains another string in C#?

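Edit 2:

Confirmed that my performance problems were due to the static function call to the StringExtensions class. Once that was removed, the IndexOf method was indeed the fastest way of accomplishing this.

What is the fastest, case-insensitive way to see if a string contains another string in C#? I saw the accepted solution in the post Case insensitive 'Contains(string)', but I have done some preliminary benchmarking, and it seems that using that method results in calls that are orders of magnitude slower on larger strings (> 100 characters) whenever the test string cannot be found.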

Here are the methods I know of:

IndexOf:

public static bool Contains(this string source, string toCheck, StringComparison comp)
{
    if (string.IsNullOrEmpty(toCheck) || string.IsNullOrEmpty(source))
        return false;

    return source.IndexOf(toCheck, comp) >= 0;
}

ToUpper:

source.ToUpper().Contains(toCheck.ToUpper());

Regex:

bool contains = Regex.Match("StRiNG to search", "string", RegexOptions.IgnoreCase).Success;
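
One caveat with the Regex variant: the second argument is interpreted as a pattern, so a search term containing metacharacters such as `.` or `(` would not be matched literally. Below is a minimal sketch, assuming the term should be treated as plain text. The class and method names are hypothetical and not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexContainsSketch
{
    // Hypothetical helper: case-insensitive literal search via Regex.
    // Regex.Escape converts metacharacters in toCheck into literals.
    public static bool ContainsIgnoreCase(string source, string toCheck)
    {
        // Mirrors the null/empty guard used by the IndexOf helper above.
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(toCheck))
            return false;

        return Regex.IsMatch(source, Regex.Escape(toCheck), RegexOptions.IgnoreCase);
    }
}
```

With this helper, `RegexContainsSketch.ContainsIgnoreCase("A.C", "a.c")` matches, while a search for "a.c" in "abc" does not, because the dot is no longer a wildcard.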

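So my question is: which way really is the fastest on average, and why?

Edit:

Here is a simple test app I used to highlight the performance difference. Using it, I see 16 ms for ToLower(), 18 ms for ToUpper, and 140 ms for StringExtensions.Contains():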

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Globalization;

namespace ScratchConsole
{
    class Program
    {
    static void Main(string[] args)
    {
        string input = "";
        while (input != "exit")
        {
            RunTest();
            input = Console.ReadLine();
        }
    }

    static void RunTest()
    {
        List<string> s = new List<string>();
        string containsString = "1";
        bool found;
        DateTime now;
        for (int i = 0; i < 50000; i++)
        {
            s.Add("AAAAAAAAAAAAAAAA AAAAAAAAAAAA");
        }

        now = DateTime.Now;
        foreach (string st in s)
        {
            found = st.ToLower().Contains(containsString);
        }
        Console.WriteLine("ToLower(): " + (DateTime.Now - now).TotalMilliseconds);

        now = DateTime.Now;
        foreach (string st in s)
        {
            found = st.ToUpper().Contains(containsString);
        }
        Console.WriteLine("ToUpper(): " + (DateTime.Now - now).TotalMilliseconds);


        now = DateTime.Now;
        foreach (string st in s)
        {
            found = StringExtensions.Contains(st, containsString, StringComparison.OrdinalIgnoreCase);
        }
        Console.WriteLine("StringExtensions.Contains(): " + (DateTime.Now - now).TotalMilliseconds);

    }
}

public static class StringExtensions
{
    public static bool Contains(this string source, string toCheck, StringComparison comp)
    {
        return source.IndexOf(toCheck, comp) >= 0;
    }
}
} // closes namespace ScratchConsole
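
The test app above times each loop with DateTime.Now, whose resolution can be coarse (often on the order of 10 to 16 ms on Windows), so measurements in the tens of milliseconds carry a lot of noise. Below is a rough sketch of the same comparison using System.Diagnostics.Stopwatch instead. The `TimingSketch` class and its `TimeMs` helper are hypothetical names introduced for illustration. The hit counter is there only so the result of each call is actually used.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class TimingSketch
{
    // Hypothetical helper: applies the predicate to every input and
    // returns the elapsed time in milliseconds. The number of matches is
    // reported through 'hits' so the work is not left unused.
    public static double TimeMs(IReadOnlyList<string> inputs, Func<string, bool> contains, out int hits)
    {
        hits = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < inputs.Count; i++)
        {
            if (contains(inputs[i]))
                hits++;
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    // Reproduces the three measurements from the test app above.
    public static void Run()
    {
        var inputs = new List<string>();
        for (int i = 0; i < 50000; i++)
            inputs.Add("AAAAAAAAAAAAAAAA AAAAAAAAAAAA");

        const string needle = "1";
        int hits;

        double lower = TimeMs(inputs, s => s.ToLower().Contains(needle), out hits);
        Console.WriteLine("ToLower(): " + lower + " ms, hits=" + hits);

        double upper = TimeMs(inputs, s => s.ToUpper().Contains(needle), out hits);
        Console.WriteLine("ToUpper(): " + upper + " ms, hits=" + hits);

        double indexOf = TimeMs(inputs, s => s.IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0, out hits);
        Console.WriteLine("IndexOf(OrdinalIgnoreCase): " + indexOf + " ms, hits=" + hits);
    }
}
```

Calling `TimingSketch.Run()` from a console Main prints one line per approach. Running it a few times and discarding the first pass helps separate JIT warm-up from steady-state cost.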

Best answer

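Since ToUpper actually results in a new string being created, StringComparison.OrdinalIgnoreCase will be faster. Regex also carries a lot of overhead for a simple comparison like this. That said, String.IndexOf(String, StringComparison.OrdinalIgnoreCase) should be the fastest, since it does not involve creating new strings.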
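
As a minimal sketch of that recommendation, the call can be wrapped in an extension method. The class name and the `ContainsIgnoreCase` name are hypothetical. Unlike the helper in the question, this version treats an empty search term as found, which is what IndexOf itself reports.

```csharp
using System;

public static class OrdinalContainsSketch
{
    // Hypothetical extension: case-insensitive containment check that
    // allocates no intermediate upper- or lower-cased strings.
    public static bool ContainsIgnoreCase(this string source, string toCheck)
    {
        // IndexOf throws on a null argument, so guard both inputs here.
        if (source == null || toCheck == null)
            return false;

        // An empty toCheck yields index 0, so it counts as contained.
        return source.IndexOf(toCheck, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

Usage would look like `"StRiNG to search".ContainsIgnoreCase("string")`. On .NET Core 2.1 and later there is also a built-in `string.Contains(string, StringComparison)` overload, which makes a wrapper like this unnecessary on those runtimes.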

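I would also guess (there I go again) that RegEx might have the better worst case because of how it evaluates the string. IndexOf will always do a linear search, and I am guessing (again) that RegEx uses something a little better. RegEx should also have a best case that is likely close to, though not as good as, IndexOf (due to the additional complexity of its language).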

15,000 length string, 10,000 loop

00:00:00.0156251 IndexOf-OrdinalIgnoreCase
00:00:00.1093757 RegEx-IgnoreCase 
00:00:00.9531311 IndexOf-ToUpper 
00:00:00.9531311 IndexOf-ToLower

Placement in the string also makes a huge difference:

At start:
00:00:00.6250040 Match
00:00:00.0156251 IndexOf
00:00:00.9687562 ToUpper
00:00:01.0000064 ToLower

At End:
00:00:00.5781287 Match
00:00:01.0468817 IndexOf
00:00:01.4062590 ToUpper
00:00:01.4218841 ToLower

Not Found:
00:00:00.5625036 Match
00:00:01.0000064 IndexOf
00:00:01.3750088 ToUpper
00:00:01.3906339 ToLower
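
The harness behind these numbers is not shown in the answer. As a rough sketch of how the three placement cases could be set up for a re-run, the helper below builds a haystack of a fixed length with the search term at the start, at the end, or absent. The `PlacementSketch` class, the `Placement` enum, and the filler character are all assumptions made for illustration.

```csharp
using System;

public static class PlacementSketch
{
    public enum Placement { Start, End, NotFound }

    // Hypothetical helper: returns a string of exactly 'length' characters
    // made of 'a' filler, with 'needle' positioned according to 'placement'.
    // Assumes needle contains no 'a' or 'A', so the filler cannot match it.
    public static string BuildHaystack(int length, string needle, Placement placement)
    {
        if (needle == null || needle.Length > length)
            throw new ArgumentException("needle must be non-null and fit inside the haystack");

        string filler = new string('a', length - needle.Length);
        switch (placement)
        {
            case Placement.Start:
                return needle + filler;
            case Placement.End:
                return filler + needle;
            default:
                // NotFound: filler only, same total length as the other cases.
                return new string('a', length);
        }
    }
}
```

For example, `PlacementSketch.BuildHaystack(15000, "NEEDLE", PlacementSketch.Placement.End)` gives a 15,000-character input matching the string length quoted above, which can then be searched with each of the approaches in a timing loop.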

Source: the corresponding question on Stack Overflow: https://stackoverflow.com/questions/7759902/
