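I am trying to write a simple program that compares the files in different folders. I am currently using LINQ to Objects to parse the folder, and I would like my result set to include information extracted from the file name strings.
Here is what I have so far: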
FileInfo[] fileList = new DirectoryInfo(@"G:\Norton Backups").GetFiles();
var results = from file in fileList
              orderby file.CreationTime
              select new { file.Name, file.CreationTime, file.Length };
foreach (var x in results)
    Console.WriteLine(x.Name);
This produces:
AWS025.sv2i
AWS025_C_Drive038.v2i
AWS025_C_Drive038_i001.iv2i
AWS025_C_Drive038_i002.iv2i
AWS025_C_Drive038_i003.iv2i
AWS025_C_Drive038_i004.iv2i
AWS025_C_Drive038_i005.iv2i
...
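I would like to modify the LINQ query so that:
- It only includes actual "backup" files (you can recognize a backup file by the _C_Drive038 part in the examples above, although the 038 and possibly the drive letter may change).
- It includes a field indicating whether the file is the "main" backup file (that is, its name does not end in _i0XX).
- It includes the "image number" of the file (038 in this case).
- It includes the increment number if the file is an increment of a base file (for example, 001 would be the increment number).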
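I believe the basic layout of the query would look like the following, but I am not sure how best to complete it (I have some ideas for parts of it, but I am interested in hearing how others would do it):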
var results = from file in fileList
              let IsMainBackup = // ??
              let ImageNumber = // ??
              let IncrementNumber = // ??
              where // it is a backup file.
              orderby file.CreationTime
              select new { file.Name, file.CreationTime, file.Length,
                           IsMainBackup, ImageNumber, IncrementNumber };
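When finding ImageNumber and IncrementNumber, I would like to assume that the position of this data is not always fixed. In other words, I would like to know a good way to parse it (if that requires a regular expression, please explain how I would use it).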
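Note: most of my past experience with parsing text involved position-based string functions such as LEFT, RIGHT, or MID. I would rather not fall back on those if there is a better way.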
Best answer
Use a regular expression:
Regex regex = new Regex(@"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");
var results = from file in fileList
              let match = regex.Match(file.Name)
              let IsMainBackup = !match.Groups["Increment"].Success
              let ImageNumber = match.Groups["ImageNumber"].Value
              let IncrementNumber = match.Groups["IncrementNumber"].Value
              where match.Groups["Backup"].Success
              orderby file.CreationTime
              select new { file.Name, file.CreationTime, file.Length,
                           IsMainBackup, ImageNumber, IncrementNumber };
Here is a description of the regular expression:
^                 Start of string.
.*                Allow anything at the start.
(?<Backup>...)    Match a backup description (explained below).
\.                Match a literal period.
[^.]+             Match the extension (one or more characters other than a period).
$                 End of string.
The backup description is:
_\w_Drive              A literal underscore, one word character (a letter, digit, or underscore; here the drive letter), another underscore, then the string "Drive".
(?<ImageNumber>\d+)    At least one digit, saved as ImageNumber.
(?<Increment>...)?     An optional increment description.
The increment description is:
_i A literal underscore, then the letter i.
(?<IncrementNumber>\d+) At least one digit, saved as IncrementNumber.
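To see the named groups in isolation, the following minimal sketch applies the same pattern to individual file names. The class name BackupNameDemo and the helper Describe are hypothetical names introduced only for this illustration; they are not part of the answer's code.

```csharp
using System;
using System.Text.RegularExpressions;

static class BackupNameDemo
{
    // Same pattern as in the answer above.
    static readonly Regex Pattern = new Regex(
        @"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");

    // Hypothetical helper: reports what the pattern captured for one file name.
    public static string Describe(string fileName)
    {
        Match match = Pattern.Match(fileName);
        if (!match.Success)
            return fileName + ": not a backup file";

        // An optional group that did not participate in the match has
        // Success == false and an empty Value.
        return string.Format("{0}: image={1}, isIncrement={2}, increment={3}",
            fileName,
            match.Groups["ImageNumber"].Value,
            match.Groups["Increment"].Success,
            match.Groups["IncrementNumber"].Value);
    }

    static void Main()
    {
        Console.WriteLine(Describe("AWS025.sv2i"));
        Console.WriteLine(Describe("AWS025_C_Drive038.v2i"));
        Console.WriteLine(Describe("AWS025_C_Drive038_i001.iv2i"));
    }
}
```

Because the Backup group is a mandatory part of the pattern, checking match.Success here should filter the same files as checking match.Groups["Backup"].Success in the query.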
Here is the test code I used:
using System;
using System.IO;
using System.Text.RegularExpressions;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        // These FileInfo objects only carry names; the files need not exist on disk.
        FileInfo[] fileList = new FileInfo[] {
            new FileInfo("AWS025.sv2i"),
            new FileInfo("AWS025_C_Drive038.v2i"),
            new FileInfo("AWS025_C_Drive038_i001.iv2i"),
            new FileInfo("AWS025_C_Drive038_i002.iv2i"),
            new FileInfo("AWS025_C_Drive038_i003.iv2i"),
            new FileInfo("AWS025_C_Drive038_i004.iv2i"),
            new FileInfo("AWS025_C_Drive038_i005.iv2i")
        };

        Regex regex = new Regex(@"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");

        // file.Length is left out of the projection here because it throws
        // FileNotFoundException for files that do not exist.
        var results = from file in fileList
                      let match = regex.Match(file.Name)
                      let IsMainBackup = !match.Groups["Increment"].Success
                      let ImageNumber = match.Groups["ImageNumber"].Value
                      let IncrementNumber = match.Groups["IncrementNumber"].Value
                      where match.Groups["Backup"].Success
                      orderby file.CreationTime
                      select new { file.Name, file.CreationTime,
                                   IsMainBackup, ImageNumber, IncrementNumber };

        foreach (var x in results)
        {
            Console.WriteLine("Name: {0}, Main: {1}, Image: {2}, Increment: {3}",
                x.Name, x.IsMainBackup, x.ImageNumber, x.IncrementNumber);
        }
    }
}
Here is the output I got:
Name: AWS025_C_Drive038.v2i, Main: True, Image: 038, Increment:
Name: AWS025_C_Drive038_i001.iv2i, Main: False, Image: 038, Increment: 001
Name: AWS025_C_Drive038_i002.iv2i, Main: False, Image: 038, Increment: 002
Name: AWS025_C_Drive038_i003.iv2i, Main: False, Image: 038, Increment: 003
Name: AWS025_C_Drive038_i004.iv2i, Main: False, Image: 038, Increment: 004
Name: AWS025_C_Drive038_i005.iv2i, Main: False, Image: 038, Increment: 005
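The query above keeps ImageNumber and IncrementNumber as strings, with an empty string when there is no increment. If numeric values are more convenient, one possible variation (not part of the original answer) is sketched below. The class NumericBackupDemo and the helper ParseOptional are hypothetical names, and the sketch assumes the captured digits are plain ASCII digits that fit in an int.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class NumericBackupDemo
{
    // Same pattern as in the answer above.
    static readonly Regex Pattern = new Regex(
        @"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");

    // Hypothetical helper: null when the group did not take part in the match.
    public static int? ParseOptional(Group group)
    {
        return group.Success ? int.Parse(group.Value) : (int?)null;
    }

    static void Main()
    {
        string[] names =
        {
            "AWS025.sv2i",
            "AWS025_C_Drive038.v2i",
            "AWS025_C_Drive038_i001.iv2i"
        };

        var results = from name in names
                      let match = Pattern.Match(name)
                      where match.Success
                      select new
                      {
                          Name = name,
                          // int.Parse drops leading zeros: "038" becomes 38.
                          ImageNumber = int.Parse(match.Groups["ImageNumber"].Value),
                          IncrementNumber = ParseOptional(match.Groups["IncrementNumber"])
                      };

        foreach (var x in results)
        {
            Console.WriteLine("{0}: image {1}, increment {2}",
                x.Name,
                x.ImageNumber,
                x.IncrementNumber.HasValue ? x.IncrementNumber.Value.ToString() : "none");
        }
    }
}
```

With a nullable increment, a separate IsMainBackup flag becomes optional, since a main backup is simply a row whose IncrementNumber is null. Note that the leading zeros of the original file name are lost, so keep the string form if they matter.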
Regarding c# - how can I use LINQ and string parsing to complete this example?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/1955541/