我正在尝试创建我自己的 Java Trie 版本,以便拥有一个 Java Trie,并获得制作它所需的知识,但这个项目让我感到困惑。我这里有一个非常基本的损坏的特里树。
我向 Trie 添加 3 个单词(使用单词的第一个字符作为键,值是附加的 TrieNode)。如果您注意到,我在 TrieNode 类内有 print 语句,用于在程序运行时检查 isWord 的值。我希望单词中的最后一个字符将 isWord 变量设置为 True。从而识别单词的结尾(稍后我将用它来获取整个单词)。
当我第一次放入三个单词时,输出会在每个单词进入 Trie 时打印每个单词的每个字母,并正确识别哪些字符是单词结尾。
但是,如果在输入单词后立即遍历 Trie 并重新打印 Trie 中的所有字符及其 isWord 状态,那么“Hello”中的“e”现在突然被识别为单词结尾?
我已经为此倾注了几个小时,但我只是不明白为什么会发生这种情况。下面是工作代码:
package testcode;
import java.util.*;
public class TestCode {
public static Trie t;
public static void main (String[] args){
t = new Trie();
t.addWord("hello");
t.addWord("hi");
t.addWord("soup");
//at this point the output correctly identifies word endings.
t.findWords();
/* but when iterating through the hash map it becomes evident that
* when entering the word 'hi' the 'e' in 'hello' had its isWord variable
* changed to true. I followed the logic and I do not see how or why this
* is happening.
*/
}
}
//This Trie class handles the root trie, and Trie commands.
class Trie{
private TrieNode root;
public Trie(){
root = new TrieNode();
}
public void addWord(String word){
root.addWord(word.toLowerCase());
}
public void findWords(){
root.findWords();
}
}
//Trie Node handles the nodes and words within the trie
class TrieNode{
private TrieNode parent;
private boolean isWord;
private boolean hasChildren;
private char character;
private Map<Character, TrieNode> children = new HashMap<>();
public TrieNode(){
hasChildren = false;
isWord = false;
}
public TrieNode(String word){
this();
addWord(word);
}
public void addWord(String word){
char firstChar = word.charAt(0);
if (children.get(firstChar) == null){
if(word.length() > 1){
hasChildren = true;
children.put(firstChar, new TrieNode(word.substring(1)));
children.get(firstChar).parent = this;
System.out.print(firstChar + "--");
System.out.println(isWord);
}
else{
children.put(firstChar, new TrieNode());
if(character == 'e'){
System.out.println("shits about to go down");
}
isWord = true;
System.out.print(firstChar + "--");
System.out.println(isWord);
}
children.get(firstChar).character = firstChar;
}
else {
children.get(firstChar).addWord(word.substring(1));
}
}
public void findWords(){
for(Character key : children.keySet()){
children.get(key).findWords();
System.out.println(children.get(key).character + " -- " + isWord);
}
}
}
此代码生成以下输出:
o--true
l--false
l--false
e--false
h--false
i--true
p--true
u--false
o--false
s--false
p -- true
u -- false
o -- false
s -- false
o -- true
l -- false
l -- false
e -- true //notice the e here is now suddenly a word ending with isWord = true
i -- true
h -- false
最佳答案
存在一系列可能的问题..父/子混淆,在父节点处理叶案例,无论是在构建还是打印输出等中。
我注意到在您旧的“findWords”代码中,您正在打印子字符,但父“isWord”标志。构建特里树在“子节点存在”和“创建子节点路径”之间存在不希望的分歧——这样“isWord”只能在新路径上标记,而不能在现有路径上标记。构建特里树似乎还在父节点而不是叶节点上设置“isWord”。
一般来说,由嵌套 IF 情况组成的意大利面条式代码很可能不可靠。代码应该尽可能通用——将其保留在方法的主流程中,除非它确实属于 IF 内部。
这是干净且正确的代码:
class TrieNode{
private TrieNode parent;
private boolean isWord;
private boolean hasChildren;
private char character;
private Map<Character, TrieNode> children = new HashMap<>();
public TrieNode(){
this.hasChildren = false;
this.isWord = false;
}
public TrieNode (char ch) {
this.character = ch;
this.hasChildren = false;
this.isWord = false;
}
public void addWord (String word){
if (word.length() == 0) {
this.isWord = true;
System.out.println( character + " -- " + isWord);
return;
}
// represent the Child Node;
// --
char firstChar = word.charAt(0);
TrieNode child = children.get( firstChar);
if (child == null){
child = new TrieNode( firstChar);
children.put( firstChar, child);
child.parent = this;
hasChildren = true;
}
// add Remaining Word;
// -- call for 1-length words, as 0-length at Child sets 'IsWord'!
child.addWord( word.substring(1));
// print building here.
System.out.println( character + " -- " + isWord);
}
public void findWords(){
for(Character key : children.keySet()){
children.get(key).findWords();
}
System.out.println( character + " -- " + isWord);
}
}
关于java - 使用 Java Trie 无法识别单词结尾。递归某处失败,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16583294/