ruby - How does this Ruby app know to select the middle third of sentences?

Tags: ruby slice

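I am currently following Beginning Ruby by Peter Cooper and have put together my first app, a text analyzer. However, while I understand all of the concepts and how they work, I cannot for the life of me understand how the app knows to select the middle third of the sentences, sorted by length, from this line: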

ideal_sentences = sentences_sorted.slice(one_third, one_third + 1)


#analyzer.rb --Text Analyzer

stopwords = %w{the a by on for of are with just but and to the my I has some in do}
lines = File.readlines(ARGV[0]) 
line_count = lines.size 
text = lines.join 

#Count the characters
character_count = text.length 
character_count_nospaces = text.gsub(/\s+/, '').length

#Count the words, sentences, and paragraphs
word_count = text.split.length 
paragraph_count = text.split(/\n\n/).length 
sentence_count = text.split(/\.|\?|!/).length

#Make a list of words in the text that aren't stop words,
#count them, and work out the percentage of non-stop words
#against all words
all_words = text.scan(/\w+/)
good_words = all_words.select { |word| !stopwords.include?(word) }
good_percentage = ((good_words.length.to_f / all_words.length.to_f)*100).to_i

#Summarize the text by cherry picking some choice sentences
sentences = text.gsub(/\s+/, ' ').strip.split(/\.|\?|!/)
sentences_sorted = sentences.sort_by { |sentence| sentence.length }
one_third = sentences_sorted.length / 3
ideal_sentences = sentences_sorted.slice(one_third, one_third + 1)
ideal_sentences = ideal_sentences.select { |sentence| sentence =~ /is|are/ }

#Give analysis back to user

puts "#{line_count} lines" 
puts "#{character_count} characters" 
puts "#{character_count_nospaces} characters excluding spaces" 
puts "#{word_count} words" 
puts "#{paragraph_count} paragraphs" 
puts "#{sentence_count} sentences" 
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)" 
puts "#{word_count / sentence_count} words per sentence (average)"
puts "#{good_percentage}% of words are non-fluff words"
puts "Summary:\n\n" + ideal_sentences.join(". ")
puts "-- End of analysis."



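It takes a third of the number of sentences with one_third = sentences_sorted.length / 3. The line you posted, ideal_sentences = sentences_sorted.slice(one_third, one_third + 1), then says "grab a slice of the sentences starting at the index equal to one third of the length, and continue for one third of the length plus one elements". Because sentences_sorted is already ordered by length, that slice is roughly the middle third: sentences that are neither the shortest nor the longest.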
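The key detail is that the second argument to Array#slice is a count of elements, not an end index. A minimal sketch of that behaviour, using a made-up array of nine strings in place of real sentences (the sample data and the variable name middle are assumptions for illustration only):

```ruby
# Hypothetical sample data: nine "sentences" already sorted by length.
sentences_sorted = %w[a bb ccc dddd eeeee ffffff ggggggg hhhhhhhh iiiiiiiii]

# Integer division: 9 / 3 => 3
one_third = sentences_sorted.length / 3

# Array#slice(start, length): start at index 3 and take up to 4 elements.
middle = sentences_sorted.slice(one_third, one_third + 1)

p middle
# prints ["dddd", "eeeee", "ffffff", "ggggggg"]
```

The three shortest and the two longest strings are left out, which is why the slice lands on the middle of the sorted list.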


