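Hi everyone, I have the following code, which searches for a substring in a file of roughly 700,000 characters. It seems to work fine with an ArrayList, but with a LinkedList it takes forever to finish. Can anyone see why it takes so long? =S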
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;
public class CountSubstrings {
    private static int sumAL = 0;
    private static int sumLL = 0;
    private static List<Character> sAList = new ArrayList<Character>();
    private static List<Character> sLList = new LinkedList<Character>();
    private static List<Character> pattAL = new ArrayList<Character>();
    private static List<Character> pattLL = new LinkedList<Character>();
    private static int index = 0;
    private static double timer = 0;
    private static double Otimer = 0;

    /*
     * Returns the lowest index at or after position at which substring
     * pattern begins in text (or else -1).
     */
    private static int findBrute(List<Character> text, List<Character> pattern, int position) {
        int n = text.size();
        int m = pattern.size();
        for (int i = position; i <= n - m; i++) { // try every starting index within text
            int k = 0; // k is index into pattern
            // Note: == compares boxed Character references; it only works for
            // characters in the cached (ASCII) range, equals() would be safer.
            while (k < m && (text.get(i + k) == pattern.get(k))) { // kth character of pattern matches
                k++;
                if (k == m) { // substring text[i..i+m-1] is a match
                    index = i;
                    return i;
                }
            }
        }
        return -1; // search failed
    }
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in); // one Scanner is enough for System.in
        System.out.print("Please enter the path for the input file: ");
        String fileName = sc.next();
        System.out.print("Enter the pattern to look for: ");
        String subString = sc.next();
        for (char c : subString.toCharArray()) {
            pattAL.add(c);
            pattLL.add(c);
        }
        System.out.println("current time "+System.currentTimeMillis()+" milliseconds");
        try (BufferedReader OpenFile = new BufferedReader(new FileReader(fileName))) {
            // file is opened here and we can access everything in there.
            String sSLine;
            // readLine() strips the line terminator, so the lines are stored back to back
            while ((sSLine = OpenFile.readLine()) != null) {
                for (char c : sSLine.toCharArray()) {
                    sAList.add(c);
                    sLList.add(c);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Array List by pointer
        // starting ARRAY LIST
        Otimer = System.currentTimeMillis();
        while (findBrute(sAList, pattAL, index) != -1) {
            index = index + pattAL.size();
            sumAL++;
        }
        timer = System.currentTimeMillis() - Otimer;
        Otimer = System.currentTimeMillis();
        index = 0; // resetting the index OR we can make 2 other variables indexAL indexLL if magic numbers were so bad
        System.out.println("Using ArrayList: "+sumAL+" matches, derived in "+timer+" milliseconds");
        while (findBrute(sLList, pattLL, index) != -1) {
            System.out.println("index"+index+" char: "+sLList.get(index));
            index = index + pattLL.size();
            sumLL++;
            // guard the look-ahead so a match at the very end of the text does not throw
            if (index + 1 < sLList.size()) {
                System.out.println("index"+index+" char: "+sLList.get(index+1));
            }
        }
        timer = System.currentTimeMillis() - Otimer;
        System.out.println("Using LinkedList: "+sumLL+" matches, derived in "+timer+" milliseconds");
    }
}
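Best answer
I think you need to understand how a linked list works. Each item in a linked list holds a reference to the next item in the list (and, in Java's case, to the previous one as well). So to get the item at a particular index, the list has to be traversed item by item from one of its ends until the right index is reached.
By contrast, ArrayList is built on an array, so it allows very fast access to any index.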
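To make the traversal cost concrete, here is a minimal sketch that counts how many node-to-node steps an indexed lookup needs in a doubly linked list that starts from whichever end is nearer. The names `HopCount`, `hopsForGet` and `hopsForScan` are made up for illustration, and the model is a simplification of the idea, not the actual java.util.LinkedList source.

```java
public class HopCount {
    // Steps needed to reach `index` in a doubly linked list of `size` nodes,
    // walking from the head or the tail, whichever is nearer (simplified model).
    public static long hopsForGet(int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return Math.min(index, size - 1 - index);
    }

    // Total steps if get(i) is called once for every index 0..size-1,
    // which is roughly what an index-based scan over the whole text does.
    public static long hopsForScan(int size) {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += hopsForGet(size, i);
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 700_000; // about the size of the file in the question
        System.out.println("get(0):     " + hopsForGet(n, 0) + " steps");
        System.out.println("get(n / 2): " + hopsForGet(n, n / 2) + " steps");
        System.out.println("one get per index: " + hopsForScan(n) + " steps in total");
    }
}
```

Under this model, a lookup near the middle of a 700,000-element list costs about 350,000 steps, and one lookup per index adds up to roughly 1.2 × 10^11 steps. An array-backed list needs a single array access per lookup.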
Let's look at the documentation for LinkedList:
All of the operations perform as could be expected for a doubly-linked list. Operations that index into the list will traverse the list from the beginning or the end, whichever is closer to the specified index.
And for ArrayList:
The size, isEmpty, get, set, iterator, and listIterator operations run in constant time.
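The standard library also exposes this difference in code: ArrayList implements the marker interface java.util.RandomAccess, while LinkedList does not. The sketch below uses that to tell whether get(i) is expected to be cheap on a given list. The names `AccessCheck` and `hasFastGet` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class AccessCheck {
    // True when the list advertises fast (generally constant-time) indexed access.
    public static boolean hasFastGet(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println("ArrayList:  " + hasFastGet(new ArrayList<Character>()));
        System.out.println("LinkedList: " + hasFastGet(new LinkedList<Character>()));
    }
}
```

A check like this could be used to copy a non-random-access list into an ArrayList once before running an index-based search over it.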
In your code, you use the get method inside the loop of the findBrute method:
...                   ↓                     ↓
while (k < m && (text.get(i + k) == pattern.get(k)))
...
And also in the while loop of the main method:
...                                               ↓
System.out.println("index"+index+" char: "+sLList.get(index));
...
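So, because of the way a linked list works, this code takes much longer with a LinkedList than with an ArrayList: every get(i) may have to walk up to half the list, so scanning the whole text this way takes time roughly proportional to the square of its length.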
Regarding java - searching for a substring in a string, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28399454/