python - 'tuple' object has no attribute 'isdigit'

Reposted by: 太空宇宙; updated: 2023-11-04 06:54:55

I need to do some text processing with the NLTK module, but I get this error: AttributeError: 'tuple' object has no attribute 'isdigit'

Does anyone know how to handle this error?

Traceback (most recent call last):
  File "preprocessing-edit.py", line 36, in <module>
    postoks = nltk.tag.pos_tag(tok)
NameError: name 'tok' is not defined

PS C:\Users\moham\Desktop\Presentation> python preprocessing-edit.py
Traceback (most recent call last):
  File "preprocessing-edit.py", line 37, in <module>
    postoks = nltk.tag.pos_tag(tok)
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\__init__.py", line 111, in pos_tag
    return _pos_tag(tokens, tagset, tagger)
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\__init__.py", line 82, in _pos_tag
    tagged_tokens = tagger.tag(tokens)
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\perceptron.py", line 153, in tag
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\perceptron.py", line 153, in <listcomp>
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\perceptron.py", line 228, in normalize
    elif word.isdigit() and len(word) == 4:
AttributeError: 'tuple' object has no attribute 'isdigit'
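
The last frame is the informative one: the tagger calls `word.isdigit()` on every token, so the error means at least one element of `tok` is a tuple rather than a string. A minimal diagnostic sketch follows; the helper name `non_string_tokens` and the sample list are hypothetical and are not part of the original script:

```python
def non_string_tokens(tokens):
    """Return (index, token) pairs for every token that is not a str."""
    return [(i, t) for i, t in enumerate(tokens) if not isinstance(t, str)]


# Hypothetical sample: one well-formed token followed by two tuples
sample = ["program", ("", "", ""), ("U", ".A", "")]
bad = non_string_tokens(sample)
print(bad)
```

Running the same check on `tok` just before the `pos_tag` call should show whether the tokenizer is producing tuples.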

import nltk

with open ("SHORT-LIST.txt", "r",encoding='utf8') as myfile:
    text =  (myfile.read().replace('\n', ''))

#text = "program managment is complicated issue for human workers"

# Used when tokenizing words
sentence_re = r'''(?x)      # set flag to allow verbose regexps
      ([A-Z])(\.[A-Z])+\.?  # abbreviations, e.g. U.S.A.
    | \w+(-\w+)*            # words with optional internal hyphens
    | \$?\d+(\.\d+)?%?      # currency and percentages, e.g. $12.40, 82%
    | \.\.\.                # ellipsis
    | [][.,;"'?():-_`]      # these are separate tokens
'''

lemmatizer = nltk.WordNetLemmatizer()
stemmer = nltk.stem.porter.PorterStemmer()


grammar = r"""
    NBAR:
        {<NN.*|JJ>*<NN.*>}  # Nouns and Adjectives, terminated with Nouns

    NP:
        {<NBAR>}
        {<NBAR><IN><NBAR>}  # Above, connected with in/of/etc...
"""
chunker = nltk.RegexpParser(grammar)

tok = nltk.regexp_tokenize(text, sentence_re)

postoks = nltk.tag.pos_tag(tok)

#print (postoks)

tree = chunker.parse(postoks)

from nltk.corpus import stopwords
stopwords = stopwords.words('english')


def leaves(tree):
    """Finds NP (nounphrase) leaf nodes of a chunk tree."""
    for subtree in tree.subtrees(filter = lambda t: t.label()=='NP'):
        yield subtree.leaves()

def normalise(word):
    """Normalises words to lowercase and stems and lemmatizes it."""
    word = word.lower()
    word = stemmer.stem(word)  # stem() is the standard stemmer method; stem_word() is gone from later NLTK releases
    word = lemmatizer.lemmatize(word)
    return word

def acceptable_word(word):
    """Checks conditions for acceptable word: length, stopword."""
    accepted = bool(2 <= len(word) <= 40
        and word.lower() not in stopwords)
    return accepted


def get_terms(tree):
    for leaf in leaves(tree):
        term = [ normalise(w) for w,t in leaf if acceptable_word(w) ]
        yield term

terms = get_terms(tree)


# The with block closes the file on exit, so no explicit close() is needed
with open("results.txt", "w+") as logfile:
    for term in terms:
        for word in term:
            logfile.write("%s\n" % word)

Best answer

Another simple approach is to change this part:

tok = nltk.regexp_tokenize(text, sentence_re)
postoks = nltk.tag.pos_tag(tok)

and replace it with NLTK's standard word tokenizer:

toks = nltk.word_tokenize(text)
postoks = nltk.tag.pos_tag(toks)

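In theory, performance and results should not differ much, although word_tokenize applies its own tokenization rules, so some tokens (abbreviations or hyphenated words, for example) may be split differently than with the custom regex.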
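
For anyone who wants to keep the custom regex, the likely cause is the capturing groups in `sentence_re`. The traceback suggests that `nltk.regexp_tokenize` hands the pattern to `re.findall`-style matching, and `re.findall` returns tuples of group contents whenever a pattern contains more than one capturing group. The sketch below uses only the standard `re` module and a shortened copy of the question's pattern; the variable names `capturing` and `non_capturing` are made up for illustration:

```python
import re

# Two alternatives from the question's pattern, with capturing groups (...)
capturing = r'''(?x)
      ([A-Z])(\.[A-Z])+\.?   # abbreviations, e.g. U.S.A.
    | \w+(-\w+)*             # words with optional internal hyphens
'''

# The same alternatives with every group made non-capturing via (?:...)
non_capturing = r'''(?x)
      (?:[A-Z])(?:\.[A-Z])+\.?   # abbreviations, e.g. U.S.A.
    | \w+(?:-\w+)*               # words with optional internal hyphens
'''

text = "U.S.A. is well-known"

tuples = re.findall(capturing, text)
strings = re.findall(non_capturing, text)

print(tuples)   # [('U', '.A', ''), ('', '', ''), ('', '', '-known')]
print(strings)  # ['U.S.A.', 'is', 'well-known']
```

Rewriting every group in the full `sentence_re` as `(?:...)` in the same way should make `nltk.regexp_tokenize` return plain strings that `pos_tag` accepts. Separately, the final character class in the question contains `:-_`, which Python reads as a range from `:` to `_`; moving the hyphen to the end of the class (`_`-]`) makes it a literal hyphen.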

A similar question about "python - 'tuple' object has no attribute 'isdigit'" can be found on Stack Overflow: https://stackoverflow.com/questions/34097264/
