
python - Is there a Python package for training log-linear models?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:20:10

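Does anyone know of an existing Python package for training log-linear models? I have a dataset with 2000 variables and 1000 records, and I want to use a log-linear model to estimate frequencies.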

Best answer

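If you are using an older version of SciPy (0.10 or earlier), you can use scipy.maxentropy (in NLP, MaxEnt = maximum entropy modeling = log-linear models). The module was removed from SciPy in the 0.11.0 release, and the SciPy team subsequently advised using sklearn.linear_model.LogisticRegression as a replacement (note that both log-linear models and logistic regression are examples of generalized linear models, in which a linear predictor is related to the mean of the response through a link function).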
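As a rough illustration of the suggested replacement, here is a minimal sketch that fits sklearn.linear_model.LogisticRegression to data of the size mentioned in the question. It assumes NumPy and scikit-learn are installed and that the task can be framed as binary classification. The data is synthetic, and the choice of C=0.1 and of ten informative variables is an assumption made only for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for the dataset in the question:
# 1000 records, 2000 variables (not the asker's real data).
n_records, n_variables = 1000, 2000
X = rng.normal(size=(n_records, n_variables))

# Assumption for the demo: only the first 10 variables carry signal.
true_weights = np.zeros(n_variables)
true_weights[:10] = 1.0
y = (X @ true_weights + rng.normal(size=n_records) > 0).astype(int)

# With more variables than records, regularization matters.
# C is the inverse regularization strength (smaller C = stronger penalty).
clf = LogisticRegression(C=0.1, max_iter=1000)
clf.fit(X, y)

# Estimated class probabilities for the first five records.
probs = clf.predict_proba(X[:5])
print(probs)
```

Each row of probs holds the estimated probabilities of the two classes for one record. In practice C would be tuned, for example by cross-validation, rather than fixed in advance.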

Example using SciPy's maximum entropy module (removed in SciPy 0.11.0; the code below is written for Python 2):

#!/usr/bin/env python

""" Example use of the maximum entropy module:

Machine translation example -- English to French -- from the paper 'A
maximum entropy approach to natural language processing' by Berger et
al., 1996.

Consider the translation of the English word 'in' into French. We
notice in a corpus of parallel texts the following facts:

(1) p(dans) + p(en) + p(a) + p(au cours de) + p(pendant) = 1
(2) p(dans) + p(en) = 3/10
(3) p(dans) + p(a) = 1/2

This code finds the probability distribution with maximal entropy
subject to these constraints.
"""

__author__ = 'Ed Schofield'
__version__= '2.1'

from scipy import maxentropy

a_grave = u'\u00e0'

samplespace = ['dans', 'en', a_grave, 'au cours de', 'pendant']

def f0(x):
    return x in samplespace

def f1(x):
    return x=='dans' or x=='en'

def f2(x):
    return x=='dans' or x==a_grave

f = [f0, f1, f2]

model = maxentropy.model(f, samplespace)

# Now set the desired feature expectations
K = [1.0, 0.3, 0.5]

model.verbose = True

# Fit the model
model.fit(K)

# Output the distribution
print "\nFitted model parameters are:\n" + str(model.params)
print "\nFitted distribution is:"
p = model.probdist()
for j in range(len(model.samplespace)):
    x = model.samplespace[j]
    print ("\tx = %-15s" %(x + ":",) + " p(x) = "+str(p[j])).encode('utf-8')


# Now show how well the constraints are satisfied:
print
print "Desired constraints:"
print "\tp['dans'] + p['en'] = 0.3"
print ("\tp['dans'] + p['" + a_grave + "'] = 0.5").encode('utf-8')
print
print "Actual expectations under the fitted model:"
print "\tp['dans'] + p['en'] =", p[0] + p[1]
print ("\tp['dans'] + p['" + a_grave + "'] = " + str(p[0]+p[2])).encode('utf-8')
# (Or substitute "x.encode('latin-1')" if you have a primitive terminal.)
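Because scipy.maxentropy is not available in current SciPy releases, the same toy problem can also be solved without any third-party package. The following is a minimal Python 3 sketch, not the algorithm used by the removed module: it runs plain gradient descent on the dual objective log Z(theta) - theta . K. The helper names distribution and fit_maxent, the step size, and the iteration count are assumptions chosen for this example. The constant feature f0 is dropped because normalization already enforces constraint (1).

```python
import math

samplespace = ['dans', 'en', '\u00e0', 'au cours de', 'pendant']

# Indicator features for constraints (2) and (3) of the example above.
features = [
    lambda x: 1.0 if x in ('dans', 'en') else 0.0,
    lambda x: 1.0 if x in ('dans', '\u00e0') else 0.0,
]
targets = [0.3, 0.5]  # desired feature expectations


def distribution(theta):
    """Return p(x) proportional to exp(sum_i theta_i * f_i(x))."""
    scores = [sum(t * f(x) for t, f in zip(theta, features))
              for x in samplespace]
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]


def fit_maxent(step=1.0, iterations=2000):
    """Gradient descent on log Z(theta) - theta . targets.

    The gradient is E_p[f_i] - targets[i], so at the optimum the
    model expectations match the desired ones.
    """
    theta = [0.0] * len(features)
    for _ in range(iterations):
        p = distribution(theta)
        for i, f in enumerate(features):
            expected = sum(pj * f(x) for pj, x in zip(p, samplespace))
            theta[i] -= step * (expected - targets[i])
    return theta


theta = fit_maxent()
p = distribution(theta)
for x, px in zip(samplespace, p):
    print("x = %-12s p(x) = %.4f" % (x, px))
print("p['dans'] + p['en'] =", p[0] + p[1])
print("p['dans'] + p['\u00e0'] =", p[0] + p[2])
```

The last two printed sums should agree with the targets 0.3 and 0.5 up to numerical tolerance. For larger problems a quasi-Newton optimizer would normally replace the fixed-step loop.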

Another option: http://homepages.inf.ed.ac.uk/lzhang10/maxent.html

Regarding "python - Is there a Python package for training log-linear models?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/16231177/
