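I am trying to extract the main words from a large number of very long strings to simplify their display, so...
Suppose we have this array of strings as output: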
Something One
Something [ABC] Two
Something [ABC] Three
Something Four Section 1
Something Four Section 2
Something Five
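How can I remove the non-constant repeated words, such as Something and [ABC], so that only the unique identifier of each string is left (like One, Two, Three) and this list is output: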
One
Two
Three
Four Section 1
Four Section 2
Five
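Knowing that:
A duplicate is any word that is repeated more than once in the list.
{"One", "Two", "Three", ..} are, as stated, not constants. They are only examples and can be changed to anything else, such as {"Alpha", "Bravo", "Charlie"} or {"Nu", "Xi", "Pi"}, as long as they do not repeat.
If a certain phrase is present (in this case "Section 1"), the word before it is kept in front of it, so "Something Four Section 1" becomes "Four Section 1".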
Best answer
This solution assumes you know nothing (like Jon Snow) apart from certain words (for example "Section 1"). It works for arbitrary string input. It has 2 main points.
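1) FindRepeatedWords is a method that fills the UniqueWords hash set and the Repeats hash set. UniqueWords, as the name suggests, holds every distinct word in the list, and Repeats holds the words that occur more than once.
2) CleanUpWordsAndDoNotChangeList is the main method that does what you want. It decides which words to remove, taking the certain words into account.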
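The idea behind point 1 can be sketched in isolation. HashSet<T>.Add returns false when the element is already in the set, so a second set can collect everything that is seen more than once. The names seen and repeated and the sample words are illustrative only and are not part of the answer's code.

```csharp
using System;
using System.Collections.Generic;

internal static class RepeatSketch {
    private static void Main() {
        // Illustrative input only; any sequence of words works.
        var words = new[] { "Something", "One", "Something", "[ABC]", "Two", "[ABC]" };
        var seen = new HashSet<string>();
        var repeated = new HashSet<string>();
        foreach (var word in words) {
            // Add returns false if the word was already in the set.
            if (!seen.Add(word)) { repeated.Add(word); }
        }
        // repeated now contains "Something" and "[ABC]".
        Console.WriteLine(string.Join(", ", repeated));
    }
}
```

After the loop, seen holds every distinct word and repeated holds only the ones that occurred at least twice.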
namespace StackOverfFLow {
    using System;
    using System.Collections.Generic;
    using System.Linq;

    internal class Program {
        private static readonly HashSet<string> UniqueWords = new HashSet<string>();
        private static readonly HashSet<string> Repeats = new HashSet<string>();
        private static readonly List<string> CertainWords = new List<string> { "Section 1", "Section 2" };
        private static readonly List<string> Words = new List<string> {
            "Something One",
            "Something [ABC] Two",
            "Something [ABC] Three",
            "Something Four Section 1",
            "Something Four Section 2",
            "Something Five"
        };

        private static void Main(string[] args) {
            FindRepeatedWords();
            var result = CleanUpWordsAndDoNotChangeList();
            result.ForEach(Console.WriteLine);
            Console.ReadKey();
        }
        /// <summary>
        /// Cleans up the words and does not change the original list.
        /// </summary>
        /// <returns>A new list with the repeated words removed.</returns>
        private static List<string> CleanUpWordsAndDoNotChangeList() {
            var newList = new List<string>();
            foreach (var t in Words) {
                var sp = SeperateStringByString(t);
                for (var index = 0; index < sp.Count; index++) {
                    // Only repeated words are candidates for removal.
                    if (!Repeats.Contains(sp[index])) { continue; }
                    // Keep the word if it is the last one or if a certain word follows it.
                    var next = sp.ElementAtOrDefault(index + 1);
                    if (next == null || CertainWords.Contains(next)) { continue; }
                    sp.RemoveAt(index);
                    // Step back so the element that moved into this slot is checked too.
                    index = index - 1;
                }
                newList.Add(string.Join(" ", sp));
            }
            return newList;
        }
        /// <summary>
        /// Finds unique and repeated words.
        /// </summary>
        private static void FindRepeatedWords() {
            foreach (var eachWord in Words) {
                foreach (var element in SeperateStringByString(eachWord)) {
                    // Add returns false when the word was already seen, so it is a repeat.
                    if (!UniqueWords.Add(element)) { Repeats.Add(element); }
                }
            }
        }
        /// <summary>
        /// Separates a string into words, keeping a certain word as a single element.
        /// </summary>
        /// <param name="source">Source string</param>
        /// <returns>The words of the source string.</returns>
        private static List<string> SeperateStringByString(string source) {
            var seperatedStringByString = new List<string>();
            foreach (var certainWord in CertainWords) {
                var indexOf = source.IndexOf(certainWord, StringComparison.Ordinal);
                if (indexOf < 0) { continue; }
                // Split the text before the certain word; anything after it is discarded.
                var a = source.Substring(0, indexOf).Trim().Split(' ');
                seperatedStringByString.AddRange(a);
                seperatedStringByString.Add(certainWord);
                // Stop at the first match so the prefix is not added twice.
                break;
            }
            // No certain word found: fall back to a plain split on spaces.
            if (seperatedStringByString.Count < 1) { seperatedStringByString.AddRange(source.Split(' ')); }
            return seperatedStringByString;
        }
    }
}
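For comparison, here is a more compact LINQ-based sketch of the same rules. It is not the answer's method, only one possible variant under stated assumptions: the phrases to protect are known up front (as with CertainWords), a protected phrase only appears at the end of a string, and a repeated word is dropped even when it is the last word. The names LinqSketch, KeepPhrases and Tokenize are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class LinqSketch {
    // Assumption: phrases that must stay intact and that protect the word before them.
    private static readonly string[] KeepPhrases = { "Section 1", "Section 2" };

    // Splits a line into words, keeping a trailing protected phrase as one token.
    private static List<string> Tokenize(string line) {
        foreach (var phrase in KeepPhrases) {
            if (line.EndsWith(" " + phrase, StringComparison.Ordinal)) {
                var head = line.Substring(0, line.Length - phrase.Length).Trim();
                var tokens = head.Split(' ').ToList();
                tokens.Add(phrase);
                return tokens;
            }
        }
        return line.Split(' ').ToList();
    }

    private static void Main() {
        var lines = new[] {
            "Something One", "Something [ABC] Two", "Something [ABC] Three",
            "Something Four Section 1", "Something Four Section 2", "Something Five"
        };
        var tokenized = lines.Select(Tokenize).ToList();

        // A token is "repeated" if it occurs more than once across all lines.
        var repeated = new HashSet<string>(
            tokenized.SelectMany(t => t)
                     .GroupBy(t => t)
                     .Where(g => g.Count() > 1)
                     .Select(g => g.Key));

        foreach (var tokens in tokenized) {
            // Keep a token if it is not repeated, or if a protected phrase follows it.
            var kept = tokens.Where((t, i) =>
                !repeated.Contains(t) ||
                (i + 1 < tokens.Count && KeepPhrases.Contains(tokens[i + 1])));
            Console.WriteLine(string.Join(" ", kept));
        }
    }
}
```

For the sample input this prints One, Two, Three, Four Section 1, Four Section 2 and Five, one per line. Counting with GroupBy avoids the shared static sets, at the cost of tokenizing everything before any line is cleaned.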
Regarding c# - removing non-constant repeated words from strings in an array\list, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44096055/