I have a dictionary with data like this (the real dictionary will contain roughly 100,000 entries):
[1] -> 5
[7] -> 50
[30] -> 3
[1000] -> 1
[100000] -> 35
I also have a list of ranges (about 1,000 of them):
MyRanges
    Range
        LowerBoundInclusive -> 0
        UpperBoundExclusive -> 10
        Total
    Range
        LowerBoundInclusive -> 10
        UpperBoundExclusive -> 50
        Total
    Range
        LowerBoundInclusive -> 100
        UpperBoundExclusive -> 1000
        Total
    Range
        LowerBoundInclusive -> 1000
        UpperBoundExclusive -> 10000
        Total
    Range (the "other" range)
        LowerBoundInclusive -> null
        UpperBoundExclusive -> null
        Total
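I need to calculate the total of the dictionary values for each of these ranges. For example, the total for the range 0-10 would be 55 (5 + 50, from keys 1 and 7). The ranges can get very large, so I know it makes no sense to look up every value between the two bounds in the dictionary. My hunch is that I should take the list of keys from the dictionary, sort it, then loop through my ranges and do some kind of search to find all the keys that fall within each range. Is this the right approach? Is there a simple way to do it?
Edit: Thanks for the replies. Really clever stuff. I forgot one very important caveat, though. The ranges are not guaranteed to be contiguous, and the final range is everything that is not in any of the other ranges.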
Best answer
You could do something like this:
// Associate each value with the range of its key
var lookup = dictionary.ToLookup(
    kvp => ranges.FirstOrDefault(r => r.LowerBoundInclusive <= kvp.Key
                                   && r.UpperBoundExclusive > kvp.Key),
    kvp => kvp.Value);

// Compute the total of values for each range
foreach (var r in ranges)
{
    r.Total = lookup[r].Sum();
}
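(Note: this solution does not take your edit into account. Non-contiguous ranges are not a problem for it, but entries that match no range are grouped under a null key and never added to the "others" range.)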
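If you want the lookup approach to cover the edit as well, one option is to fall back to the "others" range whenever no bounded range matches. Below is a minimal sketch of that idea. It assumes a `KeyRange` class shaped like the listing in the question; the names `KeyRange`, `IsOthers` and `LookupTotals.Compute` are made up for illustration, `Total` is assumed to be a `long`, and the keys and values are assumed to be `int`.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a range; property names follow the question's listing.
public class KeyRange
{
    public int? LowerBoundInclusive { get; set; }
    public int? UpperBoundExclusive { get; set; }
    public long Total { get; set; }

    // Assumption: the catch-all range is the one with both bounds left null.
    public bool IsOthers => LowerBoundInclusive == null && UpperBoundExclusive == null;
}

public static class LookupTotals
{
    public static void Compute(IDictionary<int, int> dictionary, IList<KeyRange> ranges)
    {
        var others = ranges.FirstOrDefault(r => r.IsOthers);
        var bounded = ranges.Where(r => !r.IsOthers).ToList();

        // Group each value under the first bounded range that contains its key;
        // keys that match no bounded range are grouped under "others".
        var lookup = dictionary.ToLookup(
            kvp => bounded.FirstOrDefault(r => r.LowerBoundInclusive <= kvp.Key
                                            && r.UpperBoundExclusive > kvp.Key) ?? others,
            kvp => (long)kvp.Value);

        // A range with no matching entries yields an empty group, so its total is 0.
        foreach (var r in ranges)
        {
            r.Total = lookup[r].Sum();
        }
    }
}
```

With the sample data from the question, this should give 55, 3, 0 and 1 for the four bounded ranges and 35 for "others" (only key 100000 falls outside every range). It still scans the ranges once per dictionary entry.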
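However, if you have a lot of ranges, this is not very efficient, because the ranges are enumerated again for every entry in the dictionary. You can get better results if you first sort the dictionary by key.
Here's a possible implementation: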
// We're going to need finer control over the enumeration than foreach,
// so we manipulate the enumerator directly instead.
using (var dictEnumerator = dictionary.OrderBy(e => e.Key).GetEnumerator())
{
    // True once every entry has been consumed
    // (immediately, if the dictionary is empty)
    bool endOfDict = !dictEnumerator.MoveNext();
    long othersTotal = 0; // total for items that don't fall in any range

    // The ranges need to be in ascending order
    // We want the "others" range at the end
    foreach (var range in ranges.OrderBy(r => r.LowerBoundInclusive ?? int.MaxValue))
    {
        if (range.LowerBoundInclusive == null && range.UpperBoundExclusive == null)
        {
            // this is the "others" range: start from the precalculated total
            // of previous items that didn't fall in any other range
            range.Total = othersTotal;
        }
        else
        {
            range.Total = 0;
        }

        int lower = range.LowerBoundInclusive ?? int.MinValue;
        int upper = range.UpperBoundExclusive ?? int.MaxValue;

        // keys that are below the current range don't belong to any range
        // (or they would have been included in the previous range)
        while (!endOfDict && dictEnumerator.Current.Key < lower)
        {
            othersTotal += dictEnumerator.Current.Value;
            endOfDict = !dictEnumerator.MoveNext();
        }

        // while the key is in the range, we keep adding the values
        while (!endOfDict && lower <= dictEnumerator.Current.Key && upper > dictEnumerator.Current.Key)
        {
            range.Total += dictEnumerator.Current.Value;
            endOfDict = !dictEnumerator.MoveNext();
        }

        // The current entry (if any) is now outside the range, so carry on
        // to the next range. We don't break out when the dictionary runs out:
        // the remaining ranges, including "others", still need their totals set.
    }
}
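(Updated to take your edit into account: it works with non-contiguous ranges, and adds the items that don't fall in any range to the "others" range.)
I didn't run any benchmarks, but it is probably fast, because once they are sorted the dictionary and the ranges are each enumerated only once.
Obviously, if the ranges are already sorted, you don't need the OrderBy on ranges.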
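For comparison, the idea from the question (sort the keys, then search for each range's bounds) can be sketched with prefix sums and a binary search. This is only an illustration under some assumptions: `PrefixSumTotals`, `Compute` and `FirstIndexAtOrAbove` are hypothetical names, ranges are passed as plain (lower, upper) tuples rather than your `Range` objects, keys and values are `int`, and the ranges do not overlap (otherwise the "others" total would be undercounted).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixSumTotals
{
    // Returns one total per (lower, upper) pair, plus the total of all
    // entries that fall in none of the ranges ("others").
    public static (long[] Totals, long Others) Compute(
        IDictionary<int, int> dictionary,
        IReadOnlyList<(int LowerInclusive, int UpperExclusive)> ranges)
    {
        int[] keys = dictionary.Keys.OrderBy(k => k).ToArray();

        // prefix[i] = sum of the values of the i smallest keys
        long[] prefix = new long[keys.Length + 1];
        for (int i = 0; i < keys.Length; i++)
            prefix[i + 1] = prefix[i] + dictionary[keys[i]];

        var totals = new long[ranges.Count];
        long covered = 0;
        for (int i = 0; i < ranges.Count; i++)
        {
            int from = FirstIndexAtOrAbove(keys, ranges[i].LowerInclusive);
            int to = FirstIndexAtOrAbove(keys, ranges[i].UpperExclusive);
            // An empty or inverted range contributes nothing.
            totals[i] = to > from ? prefix[to] - prefix[from] : 0;
            covered += totals[i];
        }

        // Whatever no range covered belongs to "others".
        return (totals, prefix[keys.Length] - covered);
    }

    // Index of the first key >= bound, or keys.Length if there is none.
    private static int FirstIndexAtOrAbove(int[] keys, int bound)
    {
        int index = Array.BinarySearch(keys, bound);
        return index >= 0 ? index : ~index;
    }
}
```

With the sample data from the question and the ranges (0, 10), (10, 50), (100, 1000), (1000, 10000), this returns totals of 55, 3, 0 and 1, and 35 for "others". Each range costs two binary searches after the initial sort, and the ranges themselves don't need to be sorted.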
Source: "c# - How to search a C# dictionary using a range of keys?" on Stack Overflow: https://stackoverflow.com/questions/24090762/