python - pyLDAvis : Validation error on trying to visualize topics with BTM

Reposted. Author: 行者123. Updated: 2023-12-04 17:38:10

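I am trying to generate topics with BTM. Visualizing the topics raises a validation error: I can print the topics after the model is trained, but the call to pyLDAvis fails.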

import numpy as np
import pyLDAvis
from biterm.btm import oBTM
from biterm.utility import vec_to_biterms, topic_summuary
from sklearn.feature_extraction.text import CountVectorizer

def btm_model():
    num_topics = 10
    with open('./textfiles/Ori-Apr2, 2019.txt') as f:
        texts = f.read().splitlines()
    # vectorize texts
    vec = CountVectorizer(stop_words='english')
    X = vec.fit_transform(texts).toarray()
    # get vocabulary (scikit-learn >= 1.0: use get_feature_names_out())
    vocab = np.array(vec.get_feature_names())
    # get biterms
    biterms = vec_to_biterms(X)
    # create btm
    btm = oBTM(num_topics=num_topics, V=vocab)
    print("\n\n Train Online BTM ..")
    # note: range(0, 1) runs once, so only the first 100 documents' biterms are fitted
    for i in range(0, 1):
        biterms_chunk = biterms[i:i + 100]
        btm.fit(biterms_chunk, iterations=10)

    print("\n\n Topic coherence ..")
    res, C_z_sum = topic_summuary(btm.phi_wz.T, X, vocab, 10)

    # document-topic distributions, one row per document
    topics = btm.transform(biterms)
    print("\n\n Visualize Topics ..")
    vis = pyLDAvis.prepare(btm.phi_wz.T, topics, np.count_nonzero(X, axis=1), vocab, np.sum(X, axis=0))
    pyLDAvis.save_html(vis, './textfiles/online_btm.html')

When the code reaches the pyLDAvis call, I get the following error:

Traceback (most recent call last):
  File "main_mining.py", line 293, in <module>
    btm_model(num_topics)
  File "main_mining.py", line 187, in btm_model
    vis = pyLDAvis.prepare(btm.phi_wz.T, topics, np.count_nonzero(X, axis=1), vocab, np.sum(X, axis=0))
  File "C:\Python Install Location\lib\site-packages\pyLDAvis\_prepare.py", line 375, in prepare
    _input_validate(topic_term_dists, doc_topic_dists, doc_lengths, vocab, term_frequency)
  File "C:\Python Install Location\lib\site-packages\pyLDAvis\_prepare.py", line 65, in _input_validate
    raise ValidationError('\n' + '\n'.join([' * ' + s for s in res]))
pyLDAvis._prepare.ValidationError:
 * Not all rows (distributions) in doc_topic_dists sum to 1.

Best answer

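In my case, this happened because some of my sentences contained only a few tokens. I removed every sentence with fewer than three tokens, and after that the visualization worked fine.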
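A minimal sketch of that fix, in case it helps. The helper names drop_short_texts and bad_rows are hypothetical and not part of biterm or pyLDAvis. The token pattern is assumed to mirror CountVectorizer's default (lowercased words of two or more characters), and the threshold of three tokens comes from the answer above. bad_rows is only a diagnostic for spotting which rows of the document-topic matrix would fail the "sum to 1" check; it works on plain lists and should also work on the rows of a NumPy array.

```python
import math
import re

# Assumption: same shape as CountVectorizer's default token pattern
# (words of two or more characters).
TOKEN_RE = re.compile(r"\b\w\w+\b")


def drop_short_texts(texts, min_tokens=3, stop_words=frozenset()):
    """Keep only texts with at least `min_tokens` tokens after stop-word removal."""
    kept = []
    for text in texts:
        tokens = [t for t in TOKEN_RE.findall(text.lower()) if t not in stop_words]
        if len(tokens) >= min_tokens:
            kept.append(text)
    return kept


def bad_rows(doc_topic_dists, tol=1e-6):
    """Return the indices of rows whose sum is not 1 (NaN rows are included)."""
    bad = []
    for i, row in enumerate(doc_topic_dists):
        total = float(sum(row))
        # math.isclose returns False for NaN, so NaN rows are reported too.
        if not math.isclose(total, 1.0, abs_tol=tol):
            bad.append(i)
    return bad


texts = ["ok", "topic models need enough words", "hi there"]
print(drop_short_texts(texts))   # ['topic models need enough words']

rows = [[0.5, 0.5], [0.0, 0.0], [float("nan"), 0.2]]
print(bad_rows(rows))            # [1, 2]
```

In the question's code, the filter would be applied to texts before vec.fit_transform(texts), passing the same stop-word list the vectorizer uses, so that X, the biterms, and the document lengths all describe the same set of documents. Calling bad_rows(topics) just before pyLDAvis.prepare shows whether any rows are still invalid.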

Regarding python - pyLDAvis : Validation error on trying to visualize topics with BTM, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55712807/
