ruby – 打印文件中最常用的n个单词(字符串)
目标:
我的想法是,在最坏的情况下,函数的输入可以是文档中的单词总数,从而减少了按照频率对单词进行排序的问题.这使我认为如果我使用比较排序方法,时间复杂度的下限将是O(n log n).所以,我的想法是,最好的方法是实现计数排序.这是我的代码. 我想告诉我,我的分析是否正确,我已经用我对时间复杂度的概念注释了代码,但它肯定是不正确的.这段代码的实际时间和空间复杂度是多少?如果有任何替代方法可以在实践中使用,我还想听听这是否是一个好方法. ### n is number of characters in string,k is number of words ### def word_frequencies(string,n) words = string.split(/s/) # O(n) max = 0 min = Float::INFINITY frequencies = words.inject(Hash.new(0)) do |hash,word| # O(k) occurrences = hash[word] += 1 # O(1) max = occurrences if occurrences > max # O(1) min = occurrences if occurrences < min # O(1) hash; # O(1) end ### perform a counting sort ### sorted = Array.new(max + words.length) delta = 0 frequencies.each do |word,frequency| #O(k) p word + "--" + frequency.to_s index = frequency if sorted[index] sorted[index] = sorted[index].push(word) # ??? I think O(1). else sorted[index] = [word] # O(1) end end return sorted.compact.flatten[-n..-1].reverse ### Compact is O(k). Flatten is O(k). Reverse is O(k). So O(3k) end ### Total --- O(n + 5k) = O(n). Correct? ### And the space complexity is O(n) for the hash + O(2k) for the sorted array. ### So total O(n). text = "hi hello hi my name is what what hi hello hi this is a test test test test hi hi hi what hello these are some words these these" p word_frequencies(text,4) 解决方法
两种方式:
def word_counter(string,max) string.split(/s+/) .group_by{|x|x} .map{|x,y|[x,y.size]} .sort_by{|_,size| size} # Have to sort =/ .last(max) end def word_counter(string,max) # Create a Hash and a List to store values in. word_counter,max_storage = Hash.new(0),[] #Split the string an and add each word to the hash: string.split(/s+/).each{|word| word_counter[word] += 1} # Take each word and add it to the list (so that the list_index = word_count) # I also add the count,but that is not really needed word_counter.each{|key,val| max_storage[val] = [*max_storage[val]] << [key,val]} # Higher count will always be at the end,remove nils and get the last "max" elements. max_storage.compact.flatten(1).last(max) end (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |