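I have the following file, and I am using an iterator block to parse certain repeating nodes/sections in it. I initially used a regular expression to parse the whole file, but it failed to match when some fields were missing from a node, so I am trying the yield pattern instead. The file format and the code I am using are shown below. All I want from the file is each REPLICATE node as a separate part, so that I can fetch its fields by key string and store them in a collection of objects. I can start parsing at the first REPLICATE, but I cannot end the part where the REPLICATE node ends.
File format: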
X_HEADER
{
DATA_MANAGEMENT_FIELD_2 NA
DATA_MANAGEMENT_FIELD_3 NA
DATA_MANAGEMENT_FIELD_4 NA
SYSTEM_SOFTWARE_VERSION NA
}
Y_HEADER
{
DATA_MANAGEMENT_FIELD_2 NA
DATA_MANAGEMENT_FIELD_3 NA
DATA_MANAGEMENT_FIELD_4 NA
SYSTEM_SOFTWARE_VERSION NA
}
COMPLETION
{
NUMBER 877
VERSION 4
CALIBRATION_VERSION 1
CONFIGURATION_ID 877
}
REPLICATE
{
REPLICATE_ID 1985
ASSAY_NUMBER 656
ASSAY_VERSION 4
ASSAY_STATUS Research
DILUTION_ID 1
}
REPLICATE
{
REPLICATE_ID 1985
ASSAY_NUMBER 656
ASSAY_VERSION 4
ASSAY_STATUS Research
}
Code:
static IEnumerable<IDictionary<string, string>> ReadParts(string path)
{
    using (var reader = File.OpenText(path))
    {
        var current = new Dictionary<string, string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            if (line.StartsWith("REPLICATE"))
            {
                yield return current;
                current = new Dictionary<string, string>();
            }
            else
            {
                var parts = line.Split('\t');
            }
            if (current.Count > 0) yield return current;
        }
    }
}
public static void parseFile(string fileName)
{
    foreach (var part in ReadParts(fileName))
    {
        // part["FIELD1"] will retrieve certain values from the REPLICATE part here
    }
}
Best answer
Well, it sounds like you just need to "close" a section when you reach the closing brace, and only yield return at that point. For example:
// Note: BadDataException is not a framework type; it is assumed to be a
// custom exception defined elsewhere in your code.
static IEnumerable<IDictionary<string, string>> ReadParts(string path)
{
    using (var reader = File.OpenText(path))
    {
        string currentName = null;
        IDictionary<string, string> currentMap = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (string.IsNullOrWhiteSpace(line))
            {
                continue;
            }
            if (line == "{")
            {
                if (currentName == null || currentMap != null)
                {
                    throw new BadDataException("Open brace at wrong place");
                }
                currentMap = new Dictionary<string, string>();
            }
            else if (line == "}")
            {
                if (currentName == null || currentMap == null)
                {
                    throw new BadDataException("Closing brace at wrong place");
                }
                // Isolate the "REPLICATE-only" requirement to a single
                // line - if you ever need other bits, you can change this.
                if (currentName == "REPLICATE")
                {
                    yield return currentMap;
                }
                currentName = null;
                currentMap = null;
            }
            else if (!line.StartsWith("\t"))
            {
                // A line that is not tab-indented is a section name.
                if (currentName != null || currentMap != null)
                {
                    throw new BadDataException("Section name at wrong place");
                }
                currentName = line;
            }
            else
            {
                // A tab-indented line is a tab-separated name/value pair.
                if (currentName == null || currentMap == null)
                {
                    throw new BadDataException("Name/value pair at wrong place");
                }
                var parts = line.Substring(1).Split('\t');
                if (parts.Length != 2)
                {
                    throw new BadDataException("Invalid name/value pair");
                }
                currentMap[parts[0]] = parts[1];
            }
        }
    }
}
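Once each REPLICATE section arrives as its own dictionary, a field that is missing from a section (DILUTION_ID in the second sample block, for instance) can be handled with TryGetValue instead of the indexer, which would throw. The following is only a rough sketch of that step: the Replicate class, its property names, and the ReplicateMapper helper are hypothetical and do not appear in the question or the answer.

```csharp
using System.Collections.Generic;

// Hypothetical container type; the property names are assumptions for illustration.
public sealed class Replicate
{
    public string ReplicateId { get; set; }
    public string AssayNumber { get; set; }
    public string DilutionId { get; set; }   // stays null when the field is absent
}

public static class ReplicateMapper
{
    // Returns the value stored under key, or null if the section did not contain it.
    private static string GetOrNull(IDictionary<string, string> map, string key)
    {
        string value;
        return map.TryGetValue(key, out value) ? value : null;
    }

    // Converts each parsed section into a Replicate object.
    public static List<Replicate> ToReplicates(IEnumerable<IDictionary<string, string>> parts)
    {
        var result = new List<Replicate>();
        foreach (var part in parts)
        {
            result.Add(new Replicate
            {
                ReplicateId = GetOrNull(part, "REPLICATE_ID"),
                AssayNumber = GetOrNull(part, "ASSAY_NUMBER"),
                DilutionId = GetOrNull(part, "DILUTION_ID")
            });
        }
        return result;
    }
}
```

With this in place, something like ReplicateMapper.ToReplicates(ReadParts(fileName)) would produce the object collection the question asks for.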
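To be honest, ReadParts is a pretty horrible function as it stands. I suspect I would put it in its own class (possibly a nested one) to store the state, and give each kind of line its own handler method. Heck, this is actually a situation where the state pattern makes sense :)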
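One possible shape for that refactoring is sketched below, with the state held in fields and one handler method per kind of line. This is an illustration under assumptions, not the answer author's code: the SectionParser class, its method names, and the wantedSection parameter are all hypothetical, and System.IO.InvalidDataException stands in for the BadDataException used above. Because yield return is only legal in the iterator method itself, the closing-brace handler hands the finished section back to Parse rather than yielding it directly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical class: keeps the parser state in fields instead of locals.
public sealed class SectionParser
{
    private string currentName;
    private Dictionary<string, string> currentMap;

    // Reads from any TextReader, so a file or an in-memory string both work.
    public IEnumerable<IDictionary<string, string>> Parse(TextReader reader, string wantedSection)
    {
        currentName = null;
        currentMap = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            if (line == "{") HandleOpenBrace();
            else if (line == "}")
            {
                var finished = HandleCloseBrace(wantedSection);
                if (finished != null) yield return finished;
            }
            else if (!line.StartsWith("\t", StringComparison.Ordinal)) HandleSectionName(line);
            else HandlePair(line);
        }
    }

    private void HandleOpenBrace()
    {
        if (currentName == null || currentMap != null)
            throw new InvalidDataException("Open brace at wrong place");
        currentMap = new Dictionary<string, string>();
    }

    // Returns the completed section if it is the wanted one, otherwise null.
    private IDictionary<string, string> HandleCloseBrace(string wantedSection)
    {
        if (currentName == null || currentMap == null)
            throw new InvalidDataException("Closing brace at wrong place");
        var finished = currentName == wantedSection ? currentMap : null;
        currentName = null;
        currentMap = null;
        return finished;
    }

    private void HandleSectionName(string line)
    {
        if (currentName != null || currentMap != null)
            throw new InvalidDataException("Section name at wrong place");
        currentName = line;
    }

    private void HandlePair(string line)
    {
        if (currentName == null || currentMap == null)
            throw new InvalidDataException("Name/value pair at wrong place");
        var parts = line.Substring(1).Split('\t');
        if (parts.Length != 2)
            throw new InvalidDataException("Invalid name/value pair");
        currentMap[parts[0]] = parts[1];
    }
}
```

A caller might write new SectionParser().Parse(reader, "REPLICATE") with a reader obtained from File.OpenText(path); in this sketch the caller owns and disposes the reader, and a single parser instance should not be enumerated by two loops at once because the state is shared.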
Regarding "c# - yield pattern, state machine flow", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/11108491/