For example, suppose I have the following two arrays:
string[] userSelect = new string[] {"the", "quick", "brown", "dog", "jumps", "over"};
string[] original = new string[] {"the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"};
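I am trying to compare the userSelect array against the original array and get all contiguous matches, by index. The userSelect array will always be made up of strings from the original array. So the output would look like this: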
int[] match0 = new int[] {0, 1, 2}; // indices for "the quick brown"
int[] match2 = new int[] {4, 5}; // indices for "jumps over"
int[] match1 = new int[] {3}; // index for "dog"
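The userSelect array will never be longer than the original array, but it can be shorter, and the words can be in any order. How would I go about doing this?
Best answer
Here's what I came up with: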
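var matches =
    (from l in userSelect.Select((s, i) => new { s, i })
     join r in original.Select((s, i) => new { s, i })
     on l.s equals r.s
     // Matching words with the same offset between the two arrays share an alignment.
     group l by r.i - l.i into g
     // Within an alignment, consecutive userSelect indices give the same (index - position) value.
     from m in g.Select((l, j) => new { l.i, j = l.i - j, k = g.Key })
     group m by new { m.j, m.k } into h
     select h.Select(t => t.i).ToArray())
    .ToArray();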
This will output:
matches[0] // { 0, 1, 2 } the quick brown
matches[1] // { 4, 5 } jumps over
matches[2] // { 0 } the
matches[3] // { 3 } dog
Using {"the", "quick", "brown", "the", "lazy", "dog"} as the userSelect input produces:
matches[0] // { 0, 1, 2 } the quick brown
matches[1] // { 0 } the
matches[2] // { 3 } the
matches[3] // { 3, 4, 5 } the lazy dog
Note that the calls to ToArray are optional. If you don't actually need the results in arrays, you can leave them out and save a little processing time.
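The query groups matching words by the offset between their positions in the two arrays, then splits each group into runs of consecutive indices. As a rough loop-based sketch of that same idea (the names ContiguousMatches and FindRuns are made up for illustration and are not part of the query above):

```csharp
using System;
using System.Collections.Generic;

public static class ContiguousMatches
{
    // Hypothetical helper: returns runs of consecutive userSelect indices
    // whose words also appear consecutively in original.
    public static List<int[]> FindRuns(string[] userSelect, string[] original)
    {
        // offset (original index - userSelect index) -> userSelect indices, ascending
        var byOffset = new Dictionary<int, List<int>>();
        var offsetOrder = new List<int>(); // offsets in order of first appearance

        for (int u = 0; u < userSelect.Length; u++)
        {
            for (int o = 0; o < original.Length; o++)
            {
                if (userSelect[u] != original[o]) continue;

                int offset = o - u;
                if (!byOffset.TryGetValue(offset, out var indices))
                {
                    indices = new List<int>();
                    byOffset[offset] = indices;
                    offsetOrder.Add(offset);
                }
                indices.Add(u);
            }
        }

        var runs = new List<int[]>();
        foreach (int offset in offsetOrder)
        {
            var indices = byOffset[offset];
            int start = 0;
            for (int n = 1; n <= indices.Count; n++)
            {
                // A run ends at the end of the list or where the indices stop being consecutive.
                if (n == indices.Count || indices[n] != indices[n - 1] + 1)
                {
                    runs.Add(indices.GetRange(start, n - start).ToArray());
                    start = n;
                }
            }
        }
        return runs;
    }
}
```

Called with the two arrays from the question, this should return the same four runs as the query, in the same order: { 0, 1, 2 }, { 4, 5 }, { 0 }, { 3 }.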
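To filter out any sequences that are completely contained within other, larger sequences, you can run this code (note the orderby in the modified query):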
var matches =
    (from l in userSelect.Select((s, i) => new { s, i })
     join r in original.Select((s, i) => new { s, i })
     on l.s equals r.s
     group l by r.i - l.i into g
     from m in g.Select((l, j) => new { l.i, j = l.i - j, k = g.Key })
     group m by new { m.j, m.k } into h
     orderby h.Count() descending
     select h.Select(t => t.i).ToArray());
// Keep a sequence only if no sequence before it (same length or longer) contains all of its indices.
int take = 0;
var filtered = matches.Where(m => !matches.Take(take++)
                                          .Any(n => m.All(i => n.Contains(i))))
                      .ToArray();
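One thing to be aware of with the filter: matches is a lazy query, so every matches.Take(...) call runs the whole query again, and the take++ counter only works because Where visits the elements in order. A possible variation, sketched here under the assumption that the runs fit comfortably in memory, sorts and stores the runs once and uses the Where overload that supplies the element's position (MatchFilter and RemoveContained are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MatchFilter
{
    // Hypothetical helper: drops every run whose indices are all contained
    // in a run that sorts before it (same length or longer).
    public static int[][] RemoveContained(IEnumerable<int[]> matches)
    {
        // Materialize once, longest first. OrderByDescending is a stable sort,
        // so runs of equal length keep their original relative order.
        var ordered = matches.OrderByDescending(m => m.Length).ToList();

        return ordered
            .Where((m, pos) => !ordered
                .Take(pos) // only the runs that sort before this one
                .Any(n => m.All(i => n.Contains(i))))
            .ToArray();
    }
}
```

For the first example's runs { 0, 1, 2 }, { 4, 5 }, { 0 }, { 3 }, this keeps { 0, 1, 2 }, { 4, 5 } and { 3 }, since { 0 } is contained in { 0, 1, 2 }.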
Regarding "c# - How do I compare 2 string arrays and find all contiguous matches and save the indices?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17074935/