我有以下问题,我不确定如何处理。我想要一些帮助/提示来设计满足以下要求的高效算法
输入 输入的第一行包含一个整数 N,它是序列的长度。 接下来是 N 行,每行包含一个仅包含小写字符的字符串。 1<=N<=100000。 每个字符串的长度在 1 到 10 之间(含)。
输出 输出包含所有不同字符串的连续子系列的最小长度。
示例输入
6 随它去 米洪 米洪 近身 近身 顺其自然
示例输出
18
解释:最后4个连续的字符串包含所有具有最小长度(最少字符数)的唯一字符串
最佳答案
如果我没理解错的话,你想要的子系列是:
- 包含至少 1 个“letitbe”、“mihon”和“omi”实例
- 具有尽可能小的字符串长度总和
以下是如何有效地执行此操作,使用 C# 编写代码,在注释中解释算法:
static void Main(string[] args)
{
// Input
var elements = new List<string> { "letitbe", "mihon", "mihon", "omi", "omi", "letitbe" };
// Find distinct elements
var distinctElements = elements.Distinct().ToList();
// Create a dictionary that tells us how many copies of each element we have in the current subseries, initialize all values to 0
var copiesOfElementInCurrentSubseries = distinctElements.ToDictionary(key => key, value => 0);
// The sum of lengths of strings in the current subseries
// Our goal is to minimize this
var lengthOfCurrentSubseries = 0;
// How many distinct elements are covered by the current subseries
// The condition under which we minimize lengthOfCurrentSubseries is that numberOfElementsCoveredByCurrentSubseries equals distinctElements
var numberOfElementsCoveredByCurrentSubseries = 0;
// We remember the solution in these
var bestStartIndex = 0;
var bestLength = elements.Sum(e => e.Length);
var bestNum = elements.Count;
// Start with startIndex and endIndex at 0, increase endIndex until we cover all distinct elements
// The subseries from startIndex to endIndex (inclusive) is our current subseries
for (int startIndex = 0, endIndex = 0; endIndex < elements.Count; endIndex++)
{
// We add the element at endIndex to our current subseries:
// If we found an element that previously wasn't covered, increase the count of covered elements
// Note that we never decrease this, because once we find a solution that covers all elements, we never make a change which "loses" some element
if (copiesOfElementInCurrentSubseries[elements[endIndex]] == 0)
{
numberOfElementsCoveredByCurrentSubseries++;
}
// Increase the number of copies of the element we added
copiesOfElementInCurrentSubseries[elements[endIndex]]++;
// Increase the total length of subseries by this element's length
lengthOfCurrentSubseries += elements[endIndex].Length;
// Initially, we will just loop increasing endIndex until all elements are covered
// Once we are covering all elements, try to improve the solution
if (numberOfElementsCoveredByCurrentSubseries == distinctElements.Count)
{
// Move startIndex to the right as far as possible while still covering all elements
while (copiesOfElementInCurrentSubseries[elements[startIndex]] > 1)
{
lengthOfCurrentSubseries -= elements[startIndex].Length;
copiesOfElementInCurrentSubseries[elements[startIndex]]--;
startIndex++;
}
// If the new solution is better, remember it
if (lengthOfCurrentSubseries < bestLength)
{
bestLength = lengthOfCurrentSubseries;
bestStartIndex = startIndex;
bestNum = endIndex - startIndex + 1;
}
}
// Now we add another element by moving endIndex one place to the right, then try improving the solution by moving startIndex to the right, and we repeat this process...
}
Console.WriteLine(string.Join(" ", elements.Skip(bestStartIndex).Take(bestNum)));
}
请注意,即使这有嵌套循环,内部 while
循环在内部循环的所有传递组合中最多可以有 length of input
总步数,因为 startIndex
保持其值并始终向右移动。
如果您不熟悉 C# - Dictionary
基本上是一个哈希表 - 它可以根据键有效地查找值(只要键具有良好的哈希函数,字符串就可以)。
关于algorithm - 查找包含所有不同输入字符串的连续子系列的最小长度,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/33822447/