
nltk - Error running nltk.gaac.demo()

Reposted. Author: 行者123. Last updated: 2023-12-02 09:34:42

When I run nltk.gaac.demo() I get the error below. Am I missing something? Can you help?

I am using nltk 3.0.1.

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit   (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.gaac.demo()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python34\lib\site-packages\nltk\cluster\gaac.py", line 150, in demo
clusters = clusterer.cluster(vectors, True)
File "C:\Python34\lib\site-packages\nltk\cluster\gaac.py", line 41, in cluster
return VectorSpaceClusterer.cluster(self, vectors, assign_clusters, trace)
File "C:\Python34\lib\site-packages\nltk\cluster\util.py", line 57, in cluster
self.cluster_vectorspace(vectors, trace)
File "C:\Python34\lib\site-packages\nltk\cluster\gaac.py", line 79, in cluster_vectorspace
self.update_clusters(self._num_clusters)
File "C:\Python34\lib\site-packages\nltk\cluster\gaac.py", line 99, in update_clusters
clusters = self._dendrogram.groups(num_clusters)
File "C:\Python34\lib\site-packages\nltk\cluster\util.py", line 213, in groups
return root.groups(n)
File "C:\Python34\lib\site-packages\nltk\cluster\util.py", line 161, in groups
queue.sort()
TypeError: unorderable types: _DendrogramNode() < _DendrogramNode()

Best answer

This looks like an nltk compatibility problem between Python 2.x and 3.x. I explain it below; you can hack around it with the fix in the last section.

Explanation

On my machine, under Python 2.7, nltk.gaac.demo() produces:

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.gaac.demo()
None [array([ 0.70710678, 0.70710678]), array([ 0.4472136 , 0.89442719]), arra
y([ 0.89442719, 0.4472136 ]), array([ 1., 0.]), array([ 0.5547002 , 0.8320502
9]), array([ 0.9486833 , 0.31622777])]
Clusterer: <GroupAverageAgglomerative Clusterer n=4>
Clustered: [array([3, 3]), array([1, 2]), array([4, 2]), array([4, 0]), array([2
, 3]), array([3, 1])]
As: [0, 2, 3, 1, 2, 3]

+---------+---------+---------+
| | | |
| | +-----------------------------+
| | | | |
| +-----------------------------+ |
| | | | | |
[ 3. 3.] [ 1. 2.] [ 4. 2.] [ 4. 0.] [ 2. 3.] [ 3. 1.]
classify([3 3]): 0

Under Python 3.3, by contrast, I see exactly the behaviour the OP reports for Python 3.4.1.
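The failure can be reproduced without nltk. The following is a minimal sketch, assuming the list being sorted holds (priority, node) tuples and that two entries tie on priority. Node and try_sort are hypothetical names for illustration, not part of nltk.

```python
class Node(object):
    """Hypothetical stand-in for a tree node that defines no ordering."""

    def __init__(self, value):
        self._value = value


def try_sort(queue):
    """Sort a list of (priority, node) tuples; return the error text or None."""
    try:
        queue.sort()
    except TypeError as exc:
        return str(exc)
    return None


# Distinct priorities: tuple comparison is decided by the first element.
ok = try_sort([(2, Node("a")), (1, Node("b"))])

# Tied priorities: Python 3 falls through to Node < Node, which is undefined.
err = try_sort([(0, Node("a")), (0, Node("b"))])

print(ok)
print(err)
```

Under Python 2 the second call would not raise, because objects without comparison methods still had a default ordering there.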

I have filed a bug report with the nltk developers.

This blog on migrating from Python 2 to Python 3 explains:

Unorderable types, __cmp__ and cmp: Under Python 2 the most common way of making types sortable is to implement a __cmp__() method that in turn uses the builtin cmp() function

...

Since having both __cmp__() and rich comparison methods violates the principle of there being only one obvious way of doing something, Python 3 ignores the __cmp__() method. In addition to this, the cmp() function is gone! This typically results in your converted code raising a TypeError: unorderable types error. So you need to replace the __cmp__() method with rich comparison methods instead. To support sorting you only need to implement __lt__(), the method used for the “less than” operator, <.
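The advice in the quote can be sketched on a toy class. This is a minimal illustration, not nltk's code: Node is a hypothetical class, and ordering by _value is an assumption made for the example.

```python
class Node(object):
    """Hypothetical node type, made sortable by defining __lt__ only."""

    def __init__(self, value):
        self._value = value

    def __lt__(self, other):
        # list.sort() and sorted() only ever ask "is a < b?".
        return self._value < other._value


nodes = [Node(3), Node(1), Node(2)]
nodes.sort()
sorted_values = [n._value for n in nodes]
print(sorted_values)  # [1, 2, 3]
```

If the other operators (<=, >, >=) are also needed, functools.total_ordering can derive them from __lt__ and __eq__.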

Solution

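To get going, add an __lt__() function to the _DendrogramNode class:

  • Open C:\Python34\Lib\site-packages\nltk\cluster\util.py in the editor of your choice
  • Find the line class _DendrogramNode(object) (line 129 in my installation)
  • Add a less-than function, so that your code looks like this: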

    class _DendrogramNode(object):
        """ Tree node of a dendrogram. """

        def __lt__(self, comparator):
            # Lets queue.sort() compare two nodes under Python 3
            return self._value.any() < comparator._value.any()

  • Final step (to account for the new division rules in Python 3):

  • Find the line return '%s%s%s' % (lhalf*left, center, right*rhalf) (line 247 in my copy, after adding the lines above)

  • Replace it with return '%s%s%s' % (int(lhalf)*left, center, right*int(rhalf))
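The int() calls matter because / between two integers returns a float under Python 3, and a string cannot be repeated a float number of times. The following is a minimal sketch; the names width, bar, fixed and floor_div and their values are illustrative assumptions, not taken from util.py.

```python
width = 9          # illustrative value
lhalf = width / 2  # 4.5 under Python 3 (integer division gave 4 under Python 2)

try:
    bar = lhalf * "-"
except TypeError:
    bar = None     # str * float is not allowed

fixed = int(lhalf) * "-"         # truncate explicitly, as in the replacement line
floor_div = (width // 2) * "-"   # alternative: integer division at the source
print(repr(bar), fixed, floor_div)
```

Using // where the half-width is computed would avoid the float in the first place, but wrapping the value in int() keeps the patch to a single line.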

You then get the output you want:

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64 bit (AM
D64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.gaac.demo()
None [array([ 0.70710678, 0.70710678]), array([ 0.4472136 , 0.89442719]), arra
y([ 0.89442719, 0.4472136 ]), array([ 1., 0.]), array([ 0.5547002 , 0.8320502
9]), array([ 0.9486833 , 0.31622777])]
Clusterer: <GroupAverageAgglomerative Clusterer n=4>
Clustered: [array([3, 3]), array([1, 2]), array([4, 2]), array([4, 0]), array([2
, 3]), array([3, 1])]
As: [0, 2, 3, 1, 2, 3]

+---------+---------+---------+
| | | |
| | +-----------------------------+
| | | | |
| +-----------------------------+ |
| | | | | |
[ 3. 3.] [ 1. 2.] [ 4. 2.] [ 4. 0.] [ 2. 3.] [ 3. 1.]
classify([3 3]): 0

My hacked version of util.py is available as a github gist.

Regarding nltk - Error running nltk.gaac.demo(), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28323575/
