c++ - 我如何找到部分单词匹配/查找 C++

标签 c++ string set partial

我在制作一个类似拼字游戏的游戏时遇到了问题。我的目标是获得 5 个随机字母,这些字母可能至少与我制作的字典中的一个单词匹配,这样游戏就可以开始了。我已经这样做了,但由于这 5 个字母是随机的,因此它与字典中的某个单词匹配的可能性很小。我已经完成了测试运行并收到了随机字母,但每次都不匹配(但是它是硬编码的)。我想如果我能找到部分匹配,我可以保留构成单词一部分的字母,并为不构成单词的字母获取不同的字母。问题是,我不知道该怎么做。

这是我的五个随机字母的代码:

void fiveLetters()
{
    srand(time(NULL));

    for (int i =0; i<=4; i++) // display 5 random letters
    {
        int n = rand()%26;
        char letter = (char)(n+97);
        letters[i] = letter;
        cout << letters[i]<< " ";
    }
    letters[5] = '\0'; // last value in char array null value so I can convert to string
    attempt = letters; // the 4 letters

    checkWords();
}

检查它是否与我使用的单词匹配:

void checkWords()
{
    if (find(words.begin(),words.end(), attempt) != words.end()) // if matches word in set
    {
        cout << " Words found" << endl;
    }
    else // if it doesn't match
    {
        cout << "cannot find word " << endl;
        attempt.clear();
        fiveLetters();
    }
}

字典是一个包含很多单词的文本文件,但是由于我只是想在实现它之前让事情正常运行,所以我使用了 5-10 个单词并将它们放在 set<string> 中。 。抱歉,阅读时间较长,感谢您的帮助!

最佳答案

此示例演示如何使用 Trie 递归搜索加载到其中的单词,使用搜索条件,例如可用字母的计数、最小单词长度和“缺失”字母的最大数量(使用但不包含在其中的字母)一封“可用”信。

这个 Trie 是一棵 26 叉树,因此每个节点有 26 个子节点,每个子节点对应字母表中的一个字母。使用单词中的第一个字母选择根节点的子节点,第二个字母选择该子节点的子节点之一,依此类推。节点包含一个 bool 值,指示单词以该节点结尾。然而,这样的节点不是“叶子”,因为终止的单词可能是较长单词的前缀(节点终止“ball”,但仍然有一个“balloon”的子分支)。

Trie 以及递归搜索对于您的特定任务来说速度惊人。 (顺便说一下,递归搜索不使用递归函数,它将每个级别的上下文存储在基于 vector 的堆栈上)。

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
#include <fstream>
#include <random>
#include <algorithm>

class Trie {
    // branch_count defines the number of characters which are combined to make words
    // 26 is the number of letters, interpreted case-insensitively
    static const int branch_count = 26;
    // index_map takes a character and returns a number between 0 and branch_count-1
    static int index_map(char c) {
        if((c >= 'a') & (c <= 'z')) return c - 'a';
        if((c >= 'A') & (c <= 'Z')) return c - 'A';
        return -1;
    }
    // index_unmap takes an index, between 0 and branch_count-1, and returns
    // the canonical character representation for that index
    static char index_unmap(int index) {
        return char('a' + index);
    }
    // verify_word returns true if all characters in the string map to an index
    // returns false if at least one character is not recognized
    static bool verify_word(const std::string& word) {
        for(auto&& c : word) {
            if(index_map(c) == -1) return false;
        }
        return true;
    }
    // Node is the Trie's branch_count-ary tree node class
    struct Node {
        Node* child_nodes[branch_count];
        bool terminates_word;
        Node() : terminates_word(false) {
            for(auto& p : child_nodes) { p = nullptr; }
        }
    };
    // make_lower(str) changes upper-case letters in str to lower-case (in-place)
    static void make_lower(std::string& str) {
        for(auto& c : str) {
            if((c >= 'A') & (c <= 'Z')) {
                c += 'a' - 'A';
    }   }   }
    // is_space_char(x) returns true when c is some kind of 
    // unprintable or blank, but nonzero, character
    static bool is_space_char(char c) { return (c > 0) & (c <= ' '); }

    // trim(str) removes whitespace from the left and right sides
    // of str (str is modified in-place)
    static void trim(std::string& str) {
        const auto len = str.length();
        if(!len) return;
        auto i = len-1;
        if(is_space_char(str[i])) {
            for(--i; i+1; --i) {
                if(!is_space_char(str[i])) {
                    str.resize(i+1);
                    break;
        }   }   }
        if(!(i+1)) {
            str.clear();
            return;
        }
        i=0;
        if(is_space_char(str[i])) {
            for(++i;; ++i) {
                if(!is_space_char(str[i])) {
                    str = str.substr(i);
                    return;
    }   }   }   }
    Node *root;
    int node_count;
    int word_count;

public:
    // Trie::AddWord(word) stores a string in the Trie
    void AddWord(std::string word) {
        if(word.empty()) return;
        make_lower(word);
        if(!verify_word(word)) return;
        Node *p = root;
        for(const auto c : word) {
            const int child_index = index_map(c);
            if(child_index == -1) {
                // verify_word(word) should have caught this.
                // Well-behaved, but might use excess memory.
                return;
            }
            Node *pchild = p->child_nodes[child_index];
            if(!pchild) {
                p->child_nodes[child_index] = pchild = new Node;
                ++node_count;
            }
            p = pchild;
        }
        if(!p->terminates_word) {
            p->terminates_word = true;
            ++word_count;
    }   }

    // LoadWords(input_stream) loads all line-delimited words from
    // the stream into the Trie
    int LoadWords(std::istream& stream_in) {
        const int start_count = word_count;
        std::string line;
        while(std::getline(stream_in, line)) {
            trim(line);
            AddWord(line);
        }
        return word_count - start_count;
    }
    // LoadWords(filename) loads all line-delimited words from
    // the file at the given path
    int LoadWords(const std::string& file_path) {
        std::ifstream stream_in(file_path.c_str());
        return LoadWords(stream_in);
    }
    // WordCount() returns the number of words loaded so far
    int WordCount() const { return word_count; }

    // select_type is a helper for specifying iterator behavior
    template <bool select_A, typename A, typename B>
    struct select_type { typedef A type; };
    template <typename A, typename B>
    struct select_type<false, A, B> { typedef B type; };
    template <bool select_A, typename A, typename B>
    using select_type_t = typename select_type<select_A, A, B>::type;

    // The iterator class is used for begin() and end(), as a minimal
    // implementation compatible with range-based for,
    // as well as by the destructor when destroying the
    // tree's Node objects.
    template <bool is_const=true, bool delete_iter=false>
    class iterator {
        friend class Trie;
        typedef select_type_t<is_const, const Node*, Node*> pnode_t;
        struct context {
            pnode_t node;
            int     child_index;
        };
        std::vector<context> stack;

        pnode_t advance() {
            for(;;) {
                if(stack.empty()) return nullptr;
                pnode_t p = stack.back().node;
                int &child_index = stack.back().child_index;
                while(++child_index < branch_count) {
                    pnode_t pchild = p->child_nodes[child_index];
                    if(pchild) {
                        stack.push_back({pchild, -1});
                        if(!delete_iter && pchild->terminates_word) return nullptr;
                        break;
                }   }
                if(stack.back().child_index == branch_count) {
                    stack.pop_back();
                    if(delete_iter) return p;
        }   }   }
    public:
        iterator(pnode_t root) {
            stack.push_back({root, -1});
            if(!delete_iter) advance();
        }
        iterator() {}
        std::string operator * () const {
            if(stack.empty()) return std::string();
            std::string word;
            for(int i=0; stack[i].child_index != -1; ++i) {
                word += index_unmap(stack[i].child_index);
            }
            return word;
        }
        iterator& operator ++ () {
            advance();
            return *this;
        }
        bool operator != (const iterator& iter) const {
            if(stack.size() != iter.stack.size()) return true;
            const int size = static_cast<int>(stack.size());
            for(int i=0; i<size; ++i) {
                if(stack[i].node != iter.stack[i].node) return true;
            }
            return false;
        }
    };
    // ctor
    Trie() : root(new Node), node_count(1), word_count(0) {}
    // dtor
    ~Trie() {
        iterator<false, true> iter(root);
        int count = 0;
        while(auto pn = iter.advance()) {
            delete pn;
            ++count;
        }
        //std::cout << "Final word count: " << word_count << '\n';
        //std::cout << count << " of " << node_count << " Node objects deleted\n";
    }

    // const_iterator defined from iterator with template parameter 
    // for selecting const Node pointers
    typedef iterator<true> const_iterator;
    const_iterator  begin() const { return const_iterator(root); }
    const_iterator  end()   const { return const_iterator(); }

    // FindWords:
    //         * takes an unordered map with char keys and int values
    //           (counts[ch] = <how many ch may be used>)
    //         * takes a "max_missing" count (number of substituted characters which 
    //           may be used when none remain in "counts")
    //         * takes a "min_length", the minimum length a word 
    //           must have to be added to the results
    std::vector<std::string> FindWords(const std::unordered_map<char, int>& counts, int max_missing=0, int min_length=0) {
        struct context {
            const Node*                   node;
            int                           child_index;
            std::unordered_map<char, int> counts;
            int                           missing_count;
            bool                          missing_letter;

            context(const Node* node, const std::unordered_map<char, int>& counts, int missing_count) : 
                node(node), 
                child_index(-1),
                counts(counts),
                missing_count(missing_count),
                missing_letter(false)
            {}
        };

        std::vector<context> stack;
        stack.push_back(context(root, counts, 0));

        std::vector<std::string> match_list; // results are added to this

        // This walks the tree just like the iterator's advance() function
        // however, this function's context includes more info, like
        // each level's available letter counts and whether a letter
        // was used by taking one of the available "missing" letters.
        while(!stack.empty()) {
            context& ctx = stack.back();
            while(++ctx.child_index < branch_count) {
                const Node* pchild = ctx.node->child_nodes[ctx.child_index];
                if(!pchild) continue;
                const char c = index_unmap(ctx.child_index);
                if(ctx.counts[c] > 0) {
                    ctx.missing_letter = false;
                    context child_ctx(pchild, ctx.counts, ctx.missing_count);
                    --child_ctx.counts[c];
                    stack.push_back(child_ctx); // ctx made invalid here
                    break;
                }
                else if(ctx.missing_count < max_missing) {
                    ctx.missing_letter = true;
                    context child_ctx(pchild, ctx.counts, ctx.missing_count + 1);
                    stack.push_back(child_ctx); // ctx made invalid here
                    break;
            }   }
            context& fresh_ctx = stack.back();
            if(fresh_ctx.child_index == branch_count) {
                stack.pop_back();
                continue;
            }
            // After a new level is pushed on the stack, check if the last node
            // completes a word from the tree, then check various conditions
            // required for adding it to the results.
            if(static_cast<int>(stack.size()) > min_length && fresh_ctx.node->terminates_word) {
                std::string word;
                for(const auto& entry : stack) {
                    if(entry.child_index != -1) {
                        char c = index_unmap(entry.child_index);
                        if(entry.missing_letter) {
                            // modify the character to show it was substituted
                            if(c >= 'a' && c <= 'z') {
                                // capitalize missing lower-case letters
                                word += c + 'A' - 'a';
                            } else {
                                // put funky square brackets around other types of missing characters
                                word += '[';
                                word += c;
                                word += ']';
                            }
                        } else {
                            word += c;
                }   }   }
                match_list.push_back(word);
        }   }
        return match_list;
    }

    // FindWords(letters, max_missing, min_length) creates a "counts" map 
    // from the "letters" string and uses it to call FindWords(counts...)
    std::vector<std::string> FindWords(std::string letters, int max_missing=0, int min_length=0) {
        std::unordered_map<char, int> counts;
        for(auto c : letters) {
            switch(c) {
                case '?': ++max_missing; break;  // '?' is a wildcard (blank tile)
                default:  ++counts[c];   break;
        }   }
        return FindWords(counts, max_missing, min_length);
    }

    // DumpAllWords dumps all words contained in the Trie to cout (in alphabetical order)
    void DumpAllWords() {
        for(auto&& s : *this) {
            std::cout << s << '\n';
    }   }
};

class DrawPool {
    std::mt19937 rng;
    std::string letters;
    int last_count = 1;
    struct arg_is_int { char i[4]; }; 
    static arg_is_int  argtype(int);
    static char        argtype(char);
    void Add() {}
    template <typename T, typename ...Args> 
    void Add(T arg, Args ...args) {
        if(sizeof(argtype(arg)) == sizeof(arg_is_int)) {
            last_count = arg;
        } else {
            letters += std::string(last_count, arg);
        }
        Add(args...);
    } 

public:
    void Shuffle() {
        letters.clear();
        Add(2, '?',
            12,'e', 9, 'a', 'i', 8, 'o', 6, 'n', 'r', 't', 
            4, 'l', 's', 'u', 'd', 3, 'g', 
            2, 'b', 'c', 'm', 'p', 'f', 'h', 'v', 'w', 'y', 
            1, 'k', 'j', 'x', 'q', 'z');
        std::shuffle(letters.begin(), letters.end(), rng);
    }
    int Count() const { return static_cast<int>(letters.length()); }
    std::string Get(int count) {
        if(count > Count()) count = Count();
        const std::string draw = letters.substr(0, count);
        letters.erase(0, count);
        return draw;
    }

    DrawPool() {
        std::random_device rd;
        std::seed_seq seed = {rd(), rd(), rd(), rd(), rd()};
        rng.seed(seed);
        Shuffle();
    }
};

int main() {
    Trie trie;

    // Call trie.LoadWords(filename) with each filename in the list
    // The names here are files from the SCOWL word lists.
    // These may be replaced with your own file name(s).
    for(auto s : {
        "english-words.10",
        "english-words.20",
        "english-words.35",
        "english-words.40",
        "english-words.50",
        "english-words.55",
        "english-words.60",
        "english-words.70",
        "english-words.80",
        "english-words.95"
    }) {
        int count = trie.LoadWords(s);
        std::cout << "Loaded " << count << " new words from " << s << '\n';
    }
    std::cout << "\nLoaded a total of " << trie.WordCount() << " words\n";

    //trie.DumpAllWords(); // send all words to cout

    // match a set of 7 randomly-drawn letters
    // draw one more letter each time no matches found
    DrawPool draw_pool;
    std::string rack;
    std::vector<std::string> word_list;
    do {
        rack += draw_pool.Get(rack.empty() ? 7 : 1);
        std::cout << "\nRack: " << rack << '\n';
        // find words at least 3 letters long, using the letters
        // from "rack".  The only "missing" matches allowed are
        // 1 for each wildcard ('?') in "rack".
        word_list = trie.FindWords(rack, 0, 3);
    } while(word_list.empty() && draw_pool.Count());

    // Dump the results to cout
    std::cout << "\nFound " << word_list.size() << " matches\n";
    for(auto&& word : word_list) {
        std::cout << "    " << word << '\n';
    }
}

关于c++ - 我如何找到部分单词匹配/查找 C++,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/36684995/

相关文章:

c++ - 重新定义格式参数错误

fork() 和 execl() 调用后无法识别 C++ 命令 (Linux)

c++ - 错误 : prototype does not match any in class

c++ - 如何让 DLL 项目更新 visual studio 中 C++ 项目中的每个构建

python - 将包含特定字符串的行值移动到 Python 中的新列

java - 替换java中字符串的一部分

javascript - 如何从字符串中删除某个索引后的所有字符

ios - 类型 'Any' 不符合协议(protocol) 'Equatable' 尝试符合可等式泛型集时

python - 设置大小正在改变,即使它不应该 python

reactjs - 如何声明Set()类型的状态变量?