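c++ - Avoiding or improving a brute-force approach: Counting character repetition from all words in a dictionary text file

I wrote this utility function that takes the contents of an alphabetical dictionary file and adds up the repetition count of each letter of the alphabet.

This is what I have so far: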

#include <algorithm>
#include <cctype>   // ::tolower
#include <cstdlib>  // EXIT_SUCCESS
#include <fstream>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// this function just generates a map of each of the alphabet's
// character position within the alphabet. 
void initCharIndexMap( std::map<unsigned, char>& index ) {
    char c = 'a';
    for ( unsigned i = 1; i < 27; i++ ) {
        index[i] = c;
        c++;
    }
} 

void countCharacterRepetition( std::vector<std::string>& words, const std::map<unsigned, char>& index, std::map<char, unsigned>& weights ) {
    unsigned count = 0;

    for ( auto& s : words ) {
        std::transform(s.begin(), s.end(), s.begin(), ::tolower );

        for ( std::size_t i = 0; i < s.length(); i++ ) {
            using It = std::map<unsigned, char>::const_iterator;
            for ( It it = index.cbegin(); it != index.cend(); ++it ) {
                if ( s[i] == it->second ) {
                    count++;
                    weights[it->second] += count;
                }
                count = 0;
            }
        }
    }
}

int main() {
    std::vector<std::string> words;
    std::string line;

    std::ifstream file;
    file.open( "words_alpha.txt" );

    while( std::getline( file, line ) )
        words.push_back(line);

    std::map<unsigned, char> index;
    initCharIndexMap(index);

    std::map<char, unsigned> weights;
    countCharacterRepetition(words, index, weights);

    for (auto& w : weights)
        std::cout << w.first << ' ' << w.second << '\n';

    return EXIT_SUCCESS;
}

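It gives me this output, which at first glance appears to be valid:

The dictionary text file I'm using can be found on this GitHub page.

This seems to work. It takes about 3 minutes to process on my current machine, which isn't terrible, but it feels like a brute-force approach. Is there a more efficient way to accomplish a task like this?

Best Answer

If you are only counting how many times each character occurs, then all you need is: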

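// Assumes every word contains only the letters a-z or A-Z.
int frequency[26] = {};
for (auto const& str : words) {
  for (std::size_t i = 0; i < str.size(); i++) {
    // Cast to unsigned char first: tolower is undefined for negative values.
    frequency[std::tolower(static_cast<unsigned char>(str[i])) - 'a']++;
  }
}

// Print each letter followed by its total count.
for (int i = 0; i < 26; i++) {
  std::cout << char(i + 'a') << " " << frequency[i] << '\n';
}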

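If you want to count uppercase and lowercase characters separately, enlarge the array so that the character code itself can serve as the index (128 entries cover all of ASCII), remove the tolower call and the - 'a' offset, and change the printing loop so that it prints only when i is between 'a' and 'z' or between 'A' and 'Z'.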
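A minimal sketch of that case-sensitive variant, assuming the input is ASCII. The names countCaseSensitive and printLetterCounts are hypothetical and do not come from the original answer. The table is indexed by the raw character code and sized at 128 entries so that every ASCII letter fits.

```cpp
#include <array>
#include <cctype>
#include <iostream>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical helper: counts every ASCII character, keeping 'a' and 'A' distinct.
std::array<unsigned long, 128> countCaseSensitive(const std::vector<std::string>& words) {
    std::array<unsigned long, 128> frequency{};  // value-initialized to zero
    for (const auto& word : words) {
        for (unsigned char ch : word) {
            if (ch < frequency.size()) {  // ignore bytes outside the ASCII range
                ++frequency[ch];
            }
        }
    }
    return frequency;
}

// Hypothetical helper: prints only the entries whose index is a letter.
void printLetterCounts(const std::array<unsigned long, 128>& frequency, std::ostream& out) {
    for (int i = 0; i < 128; ++i) {
        if (std::isalpha(i)) {
            out << static_cast<char>(i) << ' ' << frequency[i] << '\n';
        }
    }
}
```

Calling printLetterCounts(countCaseSensitive(words), std::cout) would list the uppercase letters first and then the lowercase ones, since that is their order in ASCII.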

Regarding "c++ - Avoiding or improving a brute-force approach: Counting character repetition from all words in a dictionary text file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56498637/
