java - Remove all non-alphanumeric characters in Java

Tags: java regex

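This program displays the number of times each word appears in a text file. The problem is that it also picks up characters like ?, and I only want it to pick up letters. This is just part of the result: {"1"=1, "Cheers"=1, "Fanny"=1, "I=1, "biscuits"=1, "chairz")=1, "cheeahz"=1, “crisps”=1, “jumpers”=1, ?=20, work:=1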

import java.io.File;
import java.io.FileReader;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
import java.util.TreeMap;
import java.util.StringTokenizer;

public class Unigrammodel {

public static void main(String [] args){

    //Creating BufferedReader to accept the file name from the user
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

    String fileName = null;
    System.out.print("Please enter the file name with path: ");
    try{
        fileName = (String) br.readLine();

        //Creating the BufferedReader to read the file
        File textFile = new File(fileName);
        BufferedReader input = new BufferedReader(new FileReader(textFile));

        //Creating the Map to store the words and their occurrences
        TreeMap<String, Integer> frequencyMap = new TreeMap<String, Integer>();
        String currentLine = null;

        //Reading line by line from the text file
        while((currentLine = input.readLine()) != null){

            //Parsing the words from each line
            StringTokenizer parser = new StringTokenizer(currentLine); 
            while(parser.hasMoreTokens()){
                String currentWord = parser.nextToken();




                //remove all non-alphanumeric from this word
                currentWord.replaceAll(("[^A-Za-z0-9 ]"), "");

                Integer frequency = frequencyMap.get(currentWord); 
                if(frequency == null){
                    frequency = 0;                      
                }
                //Putting each word and its occurrence into Map 
                frequencyMap.put(currentWord, frequency + 1);
            }

        }

        //Displaying the Result

        System.out.println(frequencyMap +"\n");

    }catch(IOException ie){
        ie.printStackTrace();
        System.err.println("Your entered path is wrong");
    }       

}

}

Best answer

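Strings are immutable, so replaceAll returns a new string and leaves currentWord unchanged. You need to assign the modified string to a variable and then add that variable to the map:

    String wordCleaned = currentWord.replaceAll("[^A-Za-z0-9 ]", "");
    ...
    frequencyMap.put(wordCleaned, frequency + 1);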
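To show the fix end to end, here is a minimal sketch of the counting loop with the cleaned word used both for the lookup and as the map key. The class name WordCleaner and the helpers clean and countWords are hypothetical names introduced for illustration; they are not part of the original program. Two choices here are assumptions rather than part of the answer: tokens that are empty after cleaning (such as a lone ?) are skipped, and the space is dropped from the character class because StringTokenizer tokens never contain whitespace.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCleaner {

    // Returns a new string with every non-alphanumeric character removed.
    // The input string itself is never modified, because strings are immutable.
    static String clean(String token) {
        return token.replaceAll("[^A-Za-z0-9]", "");
    }

    // Counts cleaned words in the given text (hypothetical helper).
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> frequencyMap = new TreeMap<>();
        StringTokenizer parser = new StringTokenizer(text);
        while (parser.hasMoreTokens()) {
            // Keep the result of replaceAll: this is the actual fix.
            String wordCleaned = clean(parser.nextToken());
            // Assumption: skip tokens that were pure punctuation, such as "?".
            if (wordCleaned.isEmpty()) {
                continue;
            }
            // Insert 1 for a new word, otherwise add 1 to the existing count.
            frequencyMap.merge(wordCleaned, 1, Integer::sum);
        }
        return frequencyMap;
    }

    public static void main(String[] args) {
        System.out.println(countWords("\"Cheers\" chairz\") ? cheers! ?"));
        // prints {Cheers=1, chairz=1, cheers=1}
    }
}
```

Note that "Cheers" and "cheers" are still counted separately. If case-insensitive counts are wanted, calling toLowerCase() on the cleaned word before the map update would be one way to merge them.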

Regarding java - Remove all non-alphanumeric characters in Java, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33131258/
