Ruby - How can I improve the performance of an array scan?

Based in part on "ruby - how to generate the possible sequential combination of letters from an array of strings?", I now have a program that matches words, using:

class Dictionary
  attr :words

  def words
    @words.map(&:upcase).uniq
  end 

  @@MAPPINGS= {A: 2, B: 2, C: 2, D: 3, E: 3, F: 3, G: 4, H: 4, I: 4, J: 5, K: 5, L: 5,
  M: 6, N: 6, O: 6, P: 7, Q: 7, R: 7, S: 7, T: 8, U: 8, V: 8, W: 9, X: 9, Y: 9, Z: 9}
  @@PHONE_NUMBER_LENGTH=10

  def initialize
    @words=[]
  end 

  def add_word(word)
    word.length < @@PHONE_NUMBER_LENGTH ? (@words << word) : nil 
  end 

  def load_system_dictionary(words_file='/usr/share/dict/american-english')
    File.open(words_file).each {|word| add_word(word)}
    true
  rescue Errno::ENOENT
    false
  end 

  def word_combinations(letters)
    possibles=[]
    letters.each_char do |one_letter|
      possibles << letter_mappings(one_letter)
    end 
    possibles.map(&:chars).map(&:to_a).reduce(&:product).map(&:join)
  end 

  def contains_word(word)
    @words.join.include?(word.upcase)
  end 

  def word_from_word_combinations(number_string)
    returns=[]
    word_combinations(number_string).each do |word|
      returns << word if @words.include?(word)
    end 
    returns
  end 

  private

  def letter_mappings(letter)
    @@MAPPINGS.select{ |key,val| val==letter.to_i }.keys.join
  end 

end

This works quite well for short and medium-length words, in under a tenth of a second, e.g. for animal. For longer words, however, e.g. MUMMIFICATION, i.e.

it "should see that the valid words for 6866434228466 is 'MUMMIFICATION'" do
  expect(dictionary.word_from_word_combinations('6866434228466')).to match_array(['MUMMIFICATION'])
end

the test takes 30 seconds.
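For a sense of scale, here is a rough sketch that counts how many candidate strings word_combinations builds for a given number. The constant and method names are illustrative and not part of the original class; the letter groups mirror the @@MAPPINGS table above. Each candidate is then checked with Array#include?, which scans the word list linearly.

```ruby
# Letters per keypad digit, mirroring the @@MAPPINGS table in the question.
# LETTERS_PER_DIGIT and combination_count are illustrative names only.
LETTERS_PER_DIGIT = {
  "2" => "ABC", "3" => "DEF", "4" => "GHI", "5" => "JKL",
  "6" => "MNO", "7" => "PQRS", "8" => "TUV", "9" => "WXYZ"
}.freeze

# Number of letter strings the Cartesian product yields for a digit string.
def combination_count(number_string)
  number_string.each_char.reduce(1) do |total, digit|
    total * LETTERS_PER_DIGIT.fetch(digit).length
  end
end

puts combination_count("264625")        # ANIMAL, 6 digits  => 729
puts combination_count("6866434228466") # 13 digits         => 1594323
```

Every digit in 6866434228466 maps to three letters, so the count is 3 to the 13th power, and each of those strings triggers its own scan of the dictionary array.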

I tried adding .uniq at each stage of

possibles.map(&:chars).map(&:to_a).reduce(&:product).map(&:join)

and I also switched to ruby 2.0 as my default, but that just added 6 seconds to the run time :(

I have switched to sawa's approach:

first, *rest = possibles.map{|s| s.each_char.to_a}
first.product(*rest).map(&:join)

i.e.

def word_combinations(letters)
  possibles=[]
  letters.each_char do |one_letter|
    possibles << letter_mappings(one_letter)
  end 
  first, *rest = possibles.map{|s| s.each_char.to_a}
  first.product(*rest).map(&:join) 
end

That helped a lot and brought it down to 15 seconds.

The .map(&:chars) suggestion from Marshal, i.e.

def word_combinations(letters)
  possibles=[]
  letters.each_char do |one_letter|
    possibles << letter_mappings(one_letter)
  end 
  first, *rest = possibles.map(&:chars)
  first.product(*rest).map(&:join) 
end

was interesting, but did not improve performance.

Is there anything else I can do?

Best answer

words = File.read("/usr/share/dict/american-english")
  .split.map{|w| w.chomp.upcase}
mapping = {A: 2, B: 2, C: 2, D: 3, E: 3, F: 3, G: 4, H: 4, I: 4, J: 5, K: 5,
  L: 5, M: 6, N: 6, O: 6, P: 7, Q: 7, R: 7, S: 7, T: 8, U: 8, V: 8, W: 9, X: 9,
  Y: 9, Z: 9}
better_mapping = mapping.map{|k, v| [k.to_s, v]}.to_h

t = Time.now
p words.select{|w| w.chars.map{|c| better_mapping[c]}.join == "6866434228466"}
puts Time.now - t

Result:

["MUMMIFICATION"]
0.847988125
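The snippet above appears to invert the search: rather than expanding the digits into every possible letter string, it converts each dictionary word into its digit string and compares that with the target. A minimal self-contained sketch of the same idea follows, using a small inline word list in place of the system dictionary; DIGIT_FOR, digits_for and sample_words are hypothetical names chosen for illustration.

```ruby
# Letter => digit lookup, built from the keypad groups used in the question.
DIGIT_FOR = {
  "2" => "ABC", "3" => "DEF", "4" => "GHI", "5" => "JKL",
  "6" => "MNO", "7" => "PQRS", "8" => "TUV", "9" => "WXYZ"
}.each_with_object({}) { |(digit, letters), table|
  letters.each_char { |letter| table[letter] = digit }
}.freeze

# Convert a word to the digit string it spells on a phone keypad.
# Characters without a mapping (e.g. apostrophes) contribute nothing.
def digits_for(word)
  word.upcase.each_char.map { |c| DIGIT_FOR[c] }.join
end

# Hypothetical sample data standing in for the dictionary file.
sample_words = %w[animal mummification cat bat act]

# One pass over the words, one cheap conversion per word.
matches = sample_words.select { |w| digits_for(w) == "228" }.map(&:upcase)
p matches # => ["CAT", "BAT", "ACT"]
```

The work is now proportional to the size of the dictionary rather than to the number of letter combinations, which is why a 13-digit number is no more expensive than a 6-digit one.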

mapping = {A: 2, B: 2, C: 2, D: 3, E: 3, F: 3, G: 4, H: 4, I: 4, J: 5, K: 5,
  L: 5, M: 6, N: 6, O: 6, P: 7, Q: 7, R: 7, S: 7, T: 8, U: 8, V: 8, W: 9, X: 9,
  Y: 9, Z: 9}
better_mapping = mapping.map{|k, v| [k.to_s, v]}.to_h
words = File.read("/usr/share/dict/american-english")
  .split.map{|w|
    w = w.chomp.upcase
    [w, w.chars.map{|c| better_mapping[c]}.join]
  }.group_by(&:last)
.map{|k, a| [k, a.map(&:first)]}.to_h

t = Time.now
p words["6866434228466"]
puts Time.now - t

Result:

["MUMMIFICATION"]
8.5981e-05
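The second version pays the conversion cost once, grouping words by their digit string so that each later lookup is a single hash access. One way this might be folded back into a class like the question's Dictionary is sketched below. PhoneWordIndex and its method names are assumptions made for illustration, not code from the question or the answer, and skipping words that contain non-letters is a choice made here.

```ruby
# Hypothetical index class: words are grouped by keypad digit string as
# they are added, so lookups do not scan the word list.
class PhoneWordIndex
  DIGIT_FOR = {
    "2" => "ABC", "3" => "DEF", "4" => "GHI", "5" => "JKL",
    "6" => "MNO", "7" => "PQRS", "8" => "TUV", "9" => "WXYZ"
  }.each_with_object({}) { |(digit, letters), table|
    letters.each_char { |letter| table[letter] = digit }
  }.freeze

  def initialize
    # digit string => array of upper-cased words spelling it
    @by_digits = Hash.new { |hash, key| hash[key] = [] }
  end

  # Add one word; blank entries and words with non-letters are skipped.
  def add_word(word)
    upper = word.strip.upcase
    return if upper !~ /\A[A-Z]+\z/
    bucket = @by_digits[upper.each_char.map { |c| DIGIT_FOR[c] }.join]
    bucket << upper unless bucket.include?(upper)
    self
  end

  # All known words for a digit string, or an empty array.
  def words_for(number_string)
    @by_digits.fetch(number_string, [])
  end
end

index = PhoneWordIndex.new
["animal\n", "Mummification", "cat's", "cat"].each { |w| index.add_word(w) }
p index.words_for("6866434228466") # => ["MUMMIFICATION"]
```

Calling strip in add_word also handles the trailing newline that File.open(...).each yields for every line, so the same method could be fed directly from a words file.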

A similar question on Stack Overflow: https://stackoverflow.com/questions/22432751/
