
python - How do I pass an object to a function that is optimized with hyperopt?


I am new to the hyperopt package. Right now, I want to optimize an LDA model that I implemented with gensim. The LDA model is tuned to maximize the silhouette score on the training data.

Now, my question is: how do I pass the training data (a numpy.ndarray) to the objective function that hyperopt calls? I looked at the tutorial and some example codes. They set the training data as a global variable, roughly as in the sketch below. In my case, however, it is hard to make the training data a global variable the way they do.
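For reference, here is a minimal sketch of that global-variable pattern; the training_data array and the constant loss are placeholders of mine, not code taken from the tutorial:

import numpy
from hyperopt import fmin, hp, tpe

# Training data lives at module level, so the objective can read it directly.
training_data = numpy.random.rand(100, 5)  # placeholder data

def objective(arg_dict):
    # ... fit a model on the global training_data using arg_dict ...
    return 0.0  # placeholder loss

best = fmin(objective,
            {'alpha': hp.loguniform('alpha', numpy.log(0.1), numpy.log(1))},
            algo=tpe.suggest,
            max_evals=10)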

I wrote the code below to optimize the LDA with hyperopt. I am looking for a way to pass the training data into the gensim_objective_function function, because gensim_lda_optimaze will be placed inside a system that calls the gensim_lda_optimaze function.

How can I achieve this?

import numpy
from numpy import ndarray
from scipy.sparse import csr_matrix
from hyperopt import fmin, hp, tpe, Trials


# I want to pass training data to this function!
# gensim_lda_tuning_training_corpus, gensim_lda_tuning_num_topic, gensim_lda_tuning_word2id are what I want to pass
def gensim_objective_function(arg_dict):
    from .gensim_lda import evaluate_clustering
    from .gensim_lda import call_lda_single
    from .gensim_lda import get_topics_ids

    alpha = arg_dict['alpha']
    eta = arg_dict['eta']
    iteration = arg_dict['iteration']
    gamma_threshold = arg_dict['gamma_threshold']
    minimum_probability = arg_dict['minimum_probability']
    passes = arg_dict['passes']
    # train LDA model
    lda_model, gensim_corpus = call_lda_single(matrix=gensim_lda_tuning_training_corpus,
                                               num_topics=gensim_lda_tuning_num_topic,
                                               word2id_dict=gensim_lda_tuning_word2id,
                                               alpha=alpha, eta=eta,
                                               iteration=iteration,
                                               gamma_threshold=gamma_threshold,
                                               minimum_probability=minimum_probability,
                                               passes=passes)
    topic_ids = get_topics_ids(trained_lda_model=lda_model, gensim_corpus=gensim_corpus)
    labels = [t[0] for t in topic_ids]
    # get silhouette score with the extracted labels
    evaluation_score = evaluate_clustering(feature_matrix=gensim_lda_tuning_training_corpus,
                                           labels=numpy.array(labels))

    # hyperopt minimizes, so negate the silhouette score
    return -1 * evaluation_score


def gensim_lda_optimaze(feature_matrix, num_topics, word2id_dict):
    assert isinstance(feature_matrix, (ndarray, csr_matrix))
    assert isinstance(num_topics, int)
    assert isinstance(word2id_dict, dict)

    parameter_space = {
        'alpha': hp.loguniform('alpha', numpy.log(0.1), numpy.log(1)),
        'eta': hp.loguniform('eta', numpy.log(0.1), numpy.log(1)),
        'iteration': 100,
        'gamma_threshold': 0.001,
        'minimum_probability': 0.01,
        'passes': 10
    }
    trials = Trials()

    best = fmin(
        gensim_objective_function,
        parameter_space,
        algo=tpe.suggest,
        max_evals=100,
        trials=trials
    )

    return best

Best Answer

You can always use partial in Python:

from functools import partial

def foo(params, data):
    return params, data

goo = partial(foo, data=[1, 2, 3])

print(goo('ala'))

which prints

('ala', [1, 2, 3])

In other words, you create a proxy function that already has the data baked in as a preset argument, and you ask hyperopt to optimize this new function, in which the data is fixed.

So in your case, you change gensim_objective_function into something that accepts all of the arguments:

def RAW_gensim_objective_function(arg_dict, gensim_lda_tuning_training_corpus,
                                  gensim_lda_tuning_num_topic,
                                  gensim_lda_tuning_word2id):
    # same body as gensim_objective_function above, but using the
    # passed-in arguments instead of globals
    ...

and you create the actual function to optimize by binding the data in a different part of your code:

gensim_objective_function = partial(RAW_gensim_objective_function,
                                    gensim_lda_tuning_training_corpus=YOUR_CORPUS,
                                    gensim_lda_tuning_num_topic=YOUR_NUM_TOPICS,
                                    gensim_lda_tuning_word2id=YOUR_IDs)
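Tying this back to the question's own gensim_lda_optimaze, here is a minimal sketch (my adaptation, not the asker's final code) of how the bound objective could be built right inside that function and handed to fmin, assuming RAW_gensim_objective_function is the reworked objective shown above:

from functools import partial

import numpy
from hyperopt import fmin, hp, tpe, Trials

def gensim_lda_optimaze(feature_matrix, num_topics, word2id_dict):
    # same search space as in the question
    parameter_space = {
        'alpha': hp.loguniform('alpha', numpy.log(0.1), numpy.log(1)),
        'eta': hp.loguniform('eta', numpy.log(0.1), numpy.log(1)),
        'iteration': 100,
        'gamma_threshold': 0.001,
        'minimum_probability': 0.01,
        'passes': 10
    }
    # bind the data that used to be global, then hand the bound function to hyperopt
    objective = partial(RAW_gensim_objective_function,
                        gensim_lda_tuning_training_corpus=feature_matrix,
                        gensim_lda_tuning_num_topic=num_topics,
                        gensim_lda_tuning_word2id=word2id_dict)
    best = fmin(
        objective,
        parameter_space,
        algo=tpe.suggest,
        max_evals=100,
        trials=Trials()
    )
    return best

hyperopt calls objective(arg_dict) with only the sampled parameters, and partial forwards that dict as the first positional argument together with the data already bound as keyword arguments.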

On "python - How do I pass an object to a function that is optimized with hyperopt?", the corresponding question on Stack Overflow is: https://stackoverflow.com/questions/34325915/
