
ruby - How can I refactor my word frequency method?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:04:11

Here is my word_frequency method:

def frequencies(text)
  words = text.split
  the_frequencies = Hash.new(0)
  words.each do |word|
    the_frequencies[word] += 1
  end
  return the_frequencies
end

def most_common_words(file_name, stop_words_file_name, number_of_word)
  # TODO: return hash of occurrences of number_of_word most frequent words
  opened_file_string = File.open(file_name.to_s).read.downcase.strip.split.join(" ").gsub(/[^a-zA-Z \'$]/, "").gsub(/'s/, "").split
  opened_stop_file_string = File.open(stop_words_file_name.to_s).read.downcase.strip.split.join(" ").gsub(/[^a-zA-Z \']/, "").gsub(/'s/, "").split
  # drop every word that appears in the stop-words file
  filtered_array = opened_file_string.reject { |n| opened_stop_file_string.include? n }
  the_frequencies = Hash.new(0)
  filtered_array.each do |word|
    the_frequencies[word] += 1
  end
  store = the_frequencies.sort_by { |_key, value| value }.reverse[0..number_of_word - 1].to_h
  store
end

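It works well, but I think I could do better. Rubocop says my lines are too long, and I agree, but this is the best I could come up with. Can someone explain how I could improve it?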

Best answer

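Simply breaking the large pieces into smaller methods already helps. most_common_words still looks a bit tricky; if you explain what you are trying to do, there may be more that can be done there.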

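You could also reuse frequencies. And given the pattern in the method parameters (the same file names are passed around everywhere), an OOP approach would fit better here.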

# Read a file, lowercase it, and collapse runs of whitespace to single spaces.
def join_file(file_name)
  File.read(file_name).downcase.strip.split.join(' ')
end

# Count how often each whitespace-separated word occurs in text.
def frequencies(text)
  text.split.each_with_object(Hash.new(0)) { |word, hash| hash[word] += 1 }
end

def opened_file_string(file_name)
  join_file(file_name).gsub(/[^a-zA-Z \'$]/, '').gsub(/'s/, '').split
end

# Memoized per file name so a different stop-words file is not ignored.
def opened_stop_file_string(file_name)
  @opened_stop_file_strings ||= {}
  @opened_stop_file_strings[file_name] ||=
    join_file(file_name).gsub(/[^a-zA-Z \']/, '').gsub(/'s/, '').split
end

def in_stop_file_string?(file_name, word)
  opened_stop_file_string(file_name).include?(word)
end

def filtered_array(file_name, stop_words_file_name)
  opened_file_string(file_name).reject do |word|
    in_stop_file_string?(stop_words_file_name, word)
  end
end

# filtered_array returns an Array, so join it back into a string
# before handing it to frequencies, which expects text.
def frequencies_in_filtered_array(file_name, stop_words_file_name)
  words = filtered_array(file_name, stop_words_file_name)
  frequencies(words.join(' ')).sort_by { |_, value| value }
end

def most_common_words(file_name, stop_words_file_name, number_of_word)
  frequencies_in_filtered_array(file_name.to_s, stop_words_file_name.to_s)
    .reverse[0...number_of_word].to_h
end
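
The answer mentions that an OOP approach would fit better but does not show one. Below is a minimal sketch of what that might look like, not the answerer's own code. The class name WordFrequency and its method names are hypothetical. It also assumes the same normalization can be applied to both the text and the stop words (the original code additionally keeps `$` in the main text), and that ties between equally frequent words may come back in any order.

```ruby
# Hypothetical class: the name and method names are illustrative only.
class WordFrequency
  # Convenience constructor that reads both inputs from disk.
  def self.from_files(file_name, stop_words_file_name)
    new(File.read(file_name.to_s), File.read(stop_words_file_name.to_s))
  end

  def initialize(text, stop_words_text = '')
    @words = normalize(text)
    @stop_words = normalize(stop_words_text)
  end

  # Hash of word => number of occurrences, stop words excluded.
  def frequencies
    @frequencies ||=
      filtered_words.each_with_object(Hash.new(0)) do |word, counts|
        counts[word] += 1
      end
  end

  # Hash of the `count` most frequent words, most frequent first.
  def most_common(count)
    frequencies.sort_by { |_word, occurrences| -occurrences }
               .first(count)
               .to_h
  end

  private

  # Lowercase, keep only letters, whitespace and apostrophes,
  # drop possessive 's, then split into words.
  def normalize(text)
    text.downcase.gsub(/[^a-z\s']/, '').gsub(/'s/, '').split
  end

  # Array#- removes every occurrence of each stop word
  # while keeping duplicates of the remaining words.
  def filtered_words
    @words - @stop_words
  end
end
```

With this shape the file names are passed once, to `WordFrequency.from_files(file_name, stop_words_file_name)`, and `most_common(number_of_word)` takes only the count. Each method stays short enough that the line-length complaint from Rubocop should no longer apply.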

Regarding "ruby - How can I refactor my word frequency method?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53035177/
