java - Searching for a substring in a string

Tags: java arrays linked-list substring

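Hey everyone, I have the following code, which searches for a substring in a file of about 700,000 characters. I believe it works well with an ArrayList, but with a LinkedList it takes a very long time to finish. Can anyone see why it takes that long? =S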

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;

public class CountSubstrings {
    private static int sumAL=0;
    private static int sumLL=0;
    private static List<Character> sAList= new ArrayList<Character>();
    private static List<Character> sLList= new LinkedList<Character>();
    private static List<Character> pattAL= new ArrayList<Character>();
    private static List<Character> pattLL= new LinkedList<Character>();
    private static int index=0;
    private static double timer=0;
    private static double Otimer=0;

    /*
     * Returns the lowest index at which substring pattern begins in text (or
     * else -1).
     */

    private static int findBrute(List<Character> text, List<Character> pattern,  int position) {
        int n = text.size();
        int m = pattern.size();
        for (int i = position; i <= n - m; i++) { // try every starting index 
                                 // within text
            int k = 0; // k is index into pattern
            while (k < m && (text.get(i + k) == pattern.get(k)))
            {   // kth character of pattern matches
                k++;
                if (k == m )
                {   index=i;
                    return i;} // substring text[i..i+m-1] is a match
                }
            }

        return -1; // search failed
    }

    public static void main (String[] args)
    {
        // One Scanner is enough; two Scanners on System.in can swallow each other's buffered input.
        Scanner sc = new Scanner(System.in);

        System.out.print("Please enter the path for the input file: ");
        String fileName = sc.next();

        System.out.print("Enter the pattern to look for: ");
        String subString = sc.next();

        for(char c: subString.toCharArray())
        {
            pattAL.add(c);
            pattLL.add(c);
        }

        System.out.println("current time "+System.currentTimeMillis()+" milliseconds");
        try (BufferedReader OpenFile = new BufferedReader(new FileReader(fileName)))
        {
            // file is opened here and we can access everything in there.
            String sSLine;

            // readLine() already strips line terminators, so each line's characters are appended as-is.
            while ((sSLine = OpenFile.readLine()) != null) {
                for(char c: sSLine.toCharArray())
                {
                    sAList.add(c);
                    sLList.add(c);
                }
            }
        } catch (IOException e) 
        {
            e.printStackTrace();
        } 
        //Array List by pointer

        //starting ARRAY LIST
        Otimer=System.currentTimeMillis();
         while(findBrute(sAList,pattAL,index)!=-1)
         {
                index=index+pattAL.size();
                sumAL++;
         }

         timer=System.currentTimeMillis()-Otimer;
         Otimer=System.currentTimeMillis();
         index=0; // resetting the index  OR  we can make 2 other variables indexAL  indexLL  if magic numbers were so bad
         System.out.println("Using ArrayList: "+sumAL+" matches, derived in "+timer+ " milliseconds");
         while(findBrute(sLList,pattLL,index)!=-1)
         {
            System.out.println("index"+index+" char: "+sLList.get(index));

            index=index+pattLL.size();
            sumLL++;
            // Guard the debug print: index may now be at or past the end of the list.
            if (index < sLList.size()) {
                System.out.println("index"+index+" char: "+sLList.get(index));
            }
         }
         timer=System.currentTimeMillis()-Otimer;
         System.out.println("Using LinkedList: "+sumLL+" matches, derived in "+timer+ " milliseconds");
      }
}

Best answer

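I think you need to understand how a linked list works. Each item in a linked list holds a reference to the next item in the list (and, in Java's case, to the previous item as well). So to get the item at a particular index, the list has to be traversed item by item from one of its ends until the right index is reached. An ArrayList, by contrast, is backed by an array, so it allows very fast access to any index.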
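To make the traversal cost concrete, here is a minimal sketch of a doubly linked list whose get walks from the nearer end. TinyLinkedList and its steps counter are hypothetical names made up for this illustration; this is not the source of java.util.LinkedList, only the same general idea.

```java
/** Minimal doubly linked list of chars, written only to illustrate the cost of get(index). */
class TinyLinkedList {
    private static final class Node {
        final char value;
        Node prev, next;
        Node(char value) { this.value = value; }
    }

    private Node head, tail;
    private int size;
    long steps; // hypothetical counter: node hops performed by get()

    /** Appends a character at the tail in constant time. */
    void add(char c) {
        Node n = new Node(c);
        if (tail == null) {
            head = tail = n;
        } else {
            tail.next = n;
            n.prev = tail;
            tail = n;
        }
        size++;
    }

    /** Walks from whichever end is closer to index, counting every hop. */
    char get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node cur;
        if (index < size / 2) {
            cur = head;
            for (int i = 0; i < index; i++) { cur = cur.next; steps++; }
        } else {
            cur = tail;
            for (int i = size - 1; i > index; i--) { cur = cur.prev; steps++; }
        }
        return cur.value;
    }

    public static void main(String[] args) {
        TinyLinkedList list = new TinyLinkedList();
        for (int i = 0; i < 1000; i++) list.add('a');
        list.get(0);
        System.out.println("hops for get(0):   " + list.steps);
        list.steps = 0;
        list.get(499);
        System.out.println("hops for get(499): " + list.steps);
    }
}
```

An element near the middle is the worst case: reaching it costs roughly size / 2 hops, and a loop that calls get for every index pays that cost again on each call.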

Let's take a look at the documentation for LinkedList:

All of the operations perform as could be expected for a doubly-linked list. Operations that index into the list will traverse the list from the beginning or the end, whichever is closer to the specified index.

And for ArrayList:

The size, isEmpty, get, set, iterator, and listIterator operations run in constant time.
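The same distinction is visible in code through the java.util.RandomAccess marker interface, which ArrayList implements and LinkedList does not. The sketch below is only an illustration; AccessCheck and hasFastIndexedAccess are hypothetical names, not part of the question's code.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class AccessCheck {
    /** Returns true when the list advertises fast indexed access via RandomAccess. */
    static boolean hasFastIndexedAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println("ArrayList:  " + hasFastIndexedAccess(new ArrayList<Character>()));
        System.out.println("LinkedList: " + hasFastIndexedAccess(new LinkedList<Character>()));
    }
}
```

Code that accepts a plain List can use a check like this to decide between an indexed loop and an iterator-based one.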

In your code, you call the get method inside the loop of the findBrute method:

...                    ↓                     ↓
while (k < m && (text.get(i + k) == pattern.get(k)))
...

And also inside the while loop in the main method:

...                                                ↓
System.out.println("index"+index+" char: "+sLList.get(index));
...

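So, because of the way a linked list works, this code takes far longer with a LinkedList than with an ArrayList: every get call walks the list from one of its ends, and get is called once per character comparison.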
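One possible way around this, sketched under assumptions rather than taken from the original answer, is to stop indexing into the text and drive the same brute-force search with a ListIterator, whose next() and previous() calls each move a single node. IteratorSearch and countMatches are hypothetical names; the method counts non-overlapping matches from left to right, which is what the question's main loop does.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class IteratorSearch {
    /** Counts non-overlapping occurrences of pattern in text without calling text.get(i). */
    static int countMatches(List<Character> text, List<Character> pattern) {
        int m = pattern.size();
        if (m == 0 || text.size() < m) {
            return 0;
        }
        // The pattern is short, so copy it once into an array for cheap indexed reads.
        Character[] patt = pattern.toArray(new Character[0]);

        int count = 0;
        ListIterator<Character> it = text.listIterator();
        while (it.hasNext()) {
            int consumed = 0;       // elements read since the current window start
            boolean matched = true;
            while (consumed < m) {
                if (!it.hasNext()) {            // text ended inside the window
                    matched = false;
                    break;
                }
                Character c = it.next();
                consumed++;
                if (!c.equals(patt[consumed - 1])) {
                    matched = false;
                    break;
                }
            }
            if (matched) {
                count++;                        // resume right after the match
            } else {
                // Rewind so the next window starts one element after this one.
                for (int back = consumed - 1; back > 0; back--) {
                    it.previous();
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Character> text = new LinkedList<>();
        for (char c : "abababa".toCharArray()) text.add(c);
        List<Character> pattern = new LinkedList<>();
        for (char c : "aba".toCharArray()) pattern.add(c);
        System.out.println("matches: " + countMatches(text, pattern));
    }
}
```

The number of character comparisons is the same as in the indexed brute-force version, but each step is a single hop on a LinkedList instead of a fresh walk from one end. The method also works unchanged on an ArrayList. Comparing with equals rather than == avoids relying on boxed Character identity.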

Regarding java - searching for a substring in a string, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/28399454/
