I hope someone can help me. I am using C# with .NET 4.0.
I want to validate a file structure such as:
const string dataFileScr = @"
Start 0
{
Next = 1
Author = rk
Date = 2011-03-10
/* Description = simple */
}
PZ 11
{
IA_return()
}
GDC 7
{
Message = 6
Message = 7
Message = 8
Message = 8
RepeatCount = 2
ErrorMessage = 10
ErrorMessage = 11
onKey[5] = 6
onKey[6] = 4
onKey[9] = 11
}
";
So far I have managed to build this regex pattern:
const string patternFileScr = @"
^
((?:\[|\s)*
(?<Section>[^\]\r\n]*)
(?:\])*
(?:[\r\n]{0,}|\Z))
(
(?:\{) ### !! improve for .ini file, dont take {
(?:[\r\n]{0,}|\Z)
( # Begin capture groups (Key Value Pairs)
(?!\}|\[) # Stop capture groups if a } is found; new section
(?:\s)* # Line with space
(?<Key>[^=]*?) # Any text before the =, matched few as possible
(?:[\s]*=[\s]*) # Get the = now
(?<Value>[^\r\n]*) # Get everything that is not an Line Changes
(?:[\r\n]{0,})
)* # End Capture groups
(?:[\r\n]{0,})
(?:\})?
(?:[\r\n\s]{0,}|\Z)
)*
";
and this C#:
Dictionary<string, Dictionary<string, string>> DictDataFileScr
    = (from Match m in Regex.Matches(dataFileScr,
                                     patternFileScr,
                                     RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline)
       select new
       {
           Section = m.Groups["Section"].Value,
           kvps = (from cpKey in m.Groups["Key"].Captures.Cast<Capture>().Select((a, i) => new { a.Value, i })
                   join cpValue in m.Groups["Value"].Captures.Cast<Capture>().Select((b, i) => new { b.Value, i }) on cpKey.i equals cpValue.i
                   select new KeyValuePair<string, string>(cpKey.Value, cpValue.Value)).OrderBy(_ => _.Key)
                  .ToDictionary(kvp => kvp.Key, kvp => kvp.Value)
       }).ToDictionary(itm => itm.Section, itm => itm.kvps);
It works for:
const string dataFileScr = @"
Start 0
{
Next = 1
Author = rk
Date = 2011-03-10
/* Description = simple */
}
GDC 7
{
Message = 6
RepeatCount = 2
ErrorMessage = 10
onKey[5] = 6
onKey[6] = 4
onKey[9] = 11
}
";
In other words:
Section1
{
key1=value1
key2=value2
}
Section2
{
key1=value1
key2=value2
}
However:
DictDataFileScr["GDC 7"]["Message"] = "6|7|8|8"
DictDataFileScr["GDC 7"]["ErrorMessage"] = "10|11"
....
[Section1]
key1 = value1
key2 = value2
[Section2]
key1 = value1
key2 = value2
...
....
PZ 11
{
IA_return()
}
.....
Best answer
Here is a complete rework of the regex in C#.
Assumptions (tell me if one or all of them are false):
- The body of an INI file section can contain only key/value pair lines
- In a non-INI file section, function calls cannot take any parameters
Regex flags:
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled | RegexOptions.Singleline
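As a rough illustration of what these flags do, the sketch below passes them to the Regex constructor and tries them on a small comment pattern. The class name FlagsSketch and the mini-pattern are made up for this example and are not part of the original answer; the pattern only mirrors the comment rule used in the reworked regex.

```csharp
using System;
using System.Text.RegularExpressions;

static class FlagsSketch
{
    static void Main()
    {
        // The same flag combination as listed in the answer.
        const RegexOptions options =
            RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace |
            RegexOptions.Compiled | RegexOptions.Singleline;

        // Hypothetical mini-pattern. IgnorePatternWhitespace lets the pattern be
        // laid out over several lines and annotated with inline comments.
        var comment = new Regex(@"
            /\*   (?# opening delimiter)
            .+?   (?# body; Singleline lets . cross line breaks)
            \*/   (?# closing delimiter)", options);

        // The comment spans two lines, so this only matches because of Singleline.
        Console.WriteLine(comment.IsMatch("/* two\nlines */")); // True
    }
}
```

Without RegexOptions.Singleline the dot would stop at the line break, so a comment spread over several lines would not match.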
Test input:
const string dataFileScr = @"
Start 0
{
Next = 1
Author = rk
Date = 2011-03-10
/* Description = simple */
}
PZ 11
{
IA_return()
}
GDC 7
{
Message = 6
Message = 7
Message = 8
Message = 8
RepeatCount = 2
ErrorMessage = 10
ErrorMessage = 11
onKey[5] = 6
onKey[6] = 4
onKey[9] = 11
}
[Section1]
key1 = value1
key2 = value2
[Section2]
key1 = value1
key2 = value2
";
Reworked regex:
const string patternFileScr = @"
(?<Section> (?# Start of a non ini file section)
(?<SectionName>[\w ]+)\s* (?# Capture section name)
{ (?# Match but don't capture beginning of section)
(?<SectionBody> (?# Capture section body. Section body can be empty)
(?<SectionLine>\s* (?# Capture zero or more line(s) in the section body)
(?: (?# A line can be either a key/value pair, a comment or a function call)
(?<KeyValuePair>(?<Key>[\w\[\]]+)\s*=\s*(?<Value>[\w-]*)) (?# Capture key/value pair. Key and value are sub-captured separately)
|
(?<Comment>/\*.+?\*/) (?# Capture comment)
|
(?<FunctionCall>[\w]+\(\)) (?# Capture function call. A function can't have parameters though)
)\s* (?# Match but don't capture white characters)
)* (?# Zero or more line(s), previously mentioned in comments)
)
} (?# Match but don't capture end of section)
)
|
(?<Section> (?# Start of an ini file section)
\[(?<SectionName>[\w ]+)\] (?# Capture section name)
(?<SectionBody> (?# Capture section body. Section body can be empty)
(?<SectionLine> (?# Capture zero or more line(s) in the section body. Only key/value pair allowed.)
\s*(?<KeyValuePair>(?<Key>[\w\[\]]+)\s*=\s*(?<Value>[\w-]+))\s* (?# Capture key/value pair. Key and value are sub-captured separately)
)* (?# Zero or more line(s), previously mentioned in comments)
)
)
";
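Discussion
The regex is built to match either a non-INI file section (1) or an INI file section (2).
(1) Non-INI file sections. These sections consist of a section name followed by a body enclosed in { and }. The section name can contain letters, digits, underscores (anything matched by \w), or spaces. The section body consists of zero or more lines. A line can be a key/value pair (key = value), a comment (/* Here is a comment */), or a function call without parameters (my_function()).
(2) INI file sections. These sections consist of a section name enclosed in [ and ], followed by zero or more key/value pairs. Each pair is on its own line.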
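The answer stops at the pattern itself, so here is a minimal sketch of how the captures could be turned into the nested dictionary the question asks for, with repeated keys joined by "|". It assumes a condensed version of the pattern (same group names, braces escaped, no inline comments) and a shortened test input; the names SectionParserSketch, Parse, Data and Pattern are hypothetical and not from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class SectionParserSketch
{
    // Shortened test input (an assumed subset of the data shown above).
    const string Data = @"
GDC 7
{
Message = 6
Message = 7
ErrorMessage = 10
onKey[5] = 6
}
[Section1]
key1 = value1
";

    // Condensed form of the reworked pattern: a brace-delimited section
    // or an INI-style section. Group names are reused across both branches.
    const string Pattern = @"
(?<SectionName>[\w ]+)\s*
\{
  (?:\s*(?:(?<Key>[\w\[\]]+)\s*=\s*(?<Value>[\w-]*)|/\*.+?\*/|\w+\(\))\s*)*
\}
|
\[(?<SectionName>[\w ]+)\]
(?:\s*(?<Key>[\w\[\]]+)\s*=\s*(?<Value>[\w-]+)\s*)*
";

    static Dictionary<string, Dictionary<string, string>> Parse(string data)
    {
        var regex = new Regex(Pattern,
            RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

        var result = new Dictionary<string, Dictionary<string, string>>();
        foreach (Match m in regex.Matches(data))
        {
            // Every Key capture has a Value capture at the same position.
            var keys = m.Groups["Key"].Captures.Cast<Capture>().Select(c => c.Value);
            var values = m.Groups["Value"].Captures.Cast<Capture>().Select(c => c.Value);

            // Group by key so that repeated keys become one "a|b|c" entry
            // instead of causing a duplicate-key exception in ToDictionary.
            result[m.Groups["SectionName"].Value.Trim()] = keys
                .Zip(values, (k, v) => new KeyValuePair<string, string>(k, v))
                .GroupBy(kvp => kvp.Key, kvp => kvp.Value)
                .ToDictionary(g => g.Key, g => string.Join("|", g));
        }
        return result;
    }

    static void Main()
    {
        var dict = Parse(Data);
        Console.WriteLine(dict["GDC 7"]["Message"]);      // 6|7
        Console.WriteLine(dict["GDC 7"]["ErrorMessage"]); // 10
        Console.WriteLine(dict["Section1"]["key1"]);      // value1
    }
}
```

Grouping before building the inner dictionary is one possible design choice here; a section that appears twice under the same name would simply overwrite the earlier entry in this sketch.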
Regarding c# - Regex help: My regex pattern will match invalid Dictionary, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/5514195/