c# - List of all characters that String.Trim() removes?

Tags: c# .net

Obviously, the main use of Trim is to remove leading and trailing spaces from a string, for example:

"  hello  ".Trim(); // results in "hello"

But Trim also removes other characters such as \n, \r and \t, so:

"  \nhello\r\t  ".Trim(); // it also produces "hello"

Is there a definitive list of all the characters that Trim will remove (preferably in string-escape form, like \n)?

Edit: Thanks for the detailed answers - I now know the exact characters. The Wikipedia list that @RayKoopa left in the comments is probably the nicest-looking format for me.

Best answer

We can look at the source code of the String class.

The public Trim() method calls an internal helper method named TrimHelper():

        public String Trim() {
            Contract.Ensures(Contract.Result<String>() != null);
            Contract.EndContractBlock();

            return TrimHelper(TrimBoth);
        }

TrimHelper() looks like this:

        [System.Security.SecuritySafeCritical]  // auto-generated
        private String TrimHelper(int trimType) {
            //end will point to the first non-trimmed character on the right
            //start will point to the first non-trimmed character on the Left
            int end = this.Length-1;
            int start=0;

            //Trim specified characters.
            if (trimType !=TrimTail)  {
                for (start=0; start < this.Length; start++) {
                    if (!Char.IsWhiteSpace(this[start]) && !IsBOMWhitespace(this[start])) break;
                }
            }

            if (trimType !=TrimHead) {
                for (end= Length -1; end >= start;  end--) {
                    // Quoted as-is: the second check reads this[start], although this[end] appears to be the intent.
                    if (!Char.IsWhiteSpace(this[end])  && !IsBOMWhitespace(this[start])) break;
                }
            }

            return CreateTrimmedString(start, end);
        }
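
To make the two scanning loops easier to follow, here is a minimal standalone sketch of the same idea. TrimSketch and TrimBoth are hypothetical names, not part of the framework, and the sketch assumes that only Char.IsWhiteSpace matters (it ignores IsBOMWhitespace and uses Substring instead of CreateTrimmedString):

```csharp
using System;

public static class TrimSketch
{
    // Hypothetical helper, not part of the BCL: mirrors the two loops in TrimHelper.
    public static string TrimBoth(string s)
    {
        int start = 0;
        int end = s.Length - 1;

        // Advance start past leading white space.
        while (start <= end && char.IsWhiteSpace(s[start])) start++;

        // Move end back past trailing white space.
        while (end >= start && char.IsWhiteSpace(s[end])) end--;

        // For an empty or all-white-space input the length is 0, giving "".
        return s.Substring(start, end - start + 1);
    }

    public static void Main()
    {
        string input = "  \nhello\r\t  ";
        // Compare the sketch with the built-in method.
        Console.WriteLine(TrimBoth(input) == input.Trim()); // prints True
    }
}
```

The point of the sketch is only that the set of trimmed characters is whatever Char.IsWhiteSpace accepts; it is not meant as a replacement for Trim().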

So most of your question comes down to examining the Char.IsWhiteSpace method:

char.cs

   [Pure]
    public static bool IsWhiteSpace(char c) {

        if (IsLatin1(c)) {
            return (IsWhiteSpaceLatin1(c));
        }
        return CharUnicodeInfo.IsWhiteSpace(c);
    }

If it is a Latin-1 character, this is what counts as white space:

 private static bool IsWhiteSpaceLatin1(char c) {

            // There are characters which belong to UnicodeCategory.Control but are considered as white spaces.
            // We use code point comparisons for these characters here as a temporary fix.

            // U+0009 = <control> HORIZONTAL TAB
            // U+000a = <control> LINE FEED
            // U+000b = <control> VERTICAL TAB
            // U+000c = <control> FORM FEED
            // U+000d = <control> CARRIAGE RETURN
            // U+0085 = <control> NEXT LINE
            // U+00a0 = NO-BREAK SPACE
            if ((c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085') {
                return (true);
            }
            return (false);
        }
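
As a quick cross-check of that condition, the following sketch asks the runtime itself which Latin-1 code points it treats as white space and prints them in \uXXXX escape form. Latin1WhiteSpace and CodePoints are hypothetical names chosen for this example; the output should correspond to the characters listed in the comments above, but it depends on the runtime you run it on:

```csharp
using System;
using System.Collections.Generic;

public static class Latin1WhiteSpace
{
    // Hypothetical helper: collects every code point in U+0000..U+00FF
    // for which char.IsWhiteSpace returns true.
    public static List<int> CodePoints()
    {
        var result = new List<int>();
        for (int i = 0; i <= 0xFF; i++)
        {
            if (char.IsWhiteSpace((char)i)) result.Add(i);
        }
        return result;
    }

    public static void Main()
    {
        // Print each code point as a C# escape sequence, e.g. \u0009.
        foreach (int cp in CodePoints())
        {
            Console.WriteLine("\\u{0:X4}", cp);
        }
    }
}
```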

Otherwise we have to go to CharUnicodeInfo.cs, which uses the UnicodeCategory enum to check for white-space characters:

   internal static bool IsWhiteSpace(char c)
        {
            UnicodeCategory uc = GetUnicodeCategory(c);
            // In Unicode 3.0, U+2028 is the only character which is under the category "LineSeparator".
            // And U+2029 is the only character which is under the category "ParagraphSeparator".
            switch (uc) {
                case (UnicodeCategory.SpaceSeparator):
                case (UnicodeCategory.LineSeparator):
                case (UnicodeCategory.ParagraphSeparator):
                    return (true);
            }

            return (false);
        }
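
Putting the pieces together, one way to get a concrete list is to loop over every char value and keep the ones Char.IsWhiteSpace accepts. This is a sketch rather than an authoritative table: WhiteSpaceList and All are hypothetical names, and the exact set depends on the Unicode data shipped with your runtime, so the result can differ slightly between .NET versions:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class WhiteSpaceList
{
    // Hypothetical helper: every UTF-16 code unit the runtime treats as white space.
    public static List<char> All()
    {
        var result = new List<char>();
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (char.IsWhiteSpace(c)) result.Add(c);
        }
        return result;
    }

    public static void Main()
    {
        // Print the escape form together with the Unicode category of each character.
        foreach (char c in All())
        {
            Console.WriteLine("\\u{0:X4}  {1}", (int)c, CharUnicodeInfo.GetUnicodeCategory(c));
        }
    }
}
```

Characters in the Control category that appear in the output are the ones special-cased in IsWhiteSpaceLatin1; the rest come from the three separator categories in the switch.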

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/37333250/
