
python - BucketIterator throws 'Field' object has no attribute 'vocab'

Reposted. Author: 行者123. Updated: 2023-11-28 17:59:45

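This is not a new question, but none of the references I found (first, second) offered a solution that worked for me. I am new to PyTorch and I am hitting AttributeError: 'Field' object has no attribute 'vocab' when using torchtext.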

Following the book Deep Learning with PyTorch, I wrote the same example that the book explains.

Here is the snippet:

from torchtext import data
from torchtext import datasets
from torchtext.vocab import GloVe

TEXT = data.Field(lower=True, batch_first=True, fix_length=20)
LABEL = data.Field(sequential=False)
train, test = datasets.IMDB.splits(TEXT, LABEL)

print("train.fields:", train.fields)
print()
print(vars(train[0])) # prints the object



TEXT.build_vocab(train, vectors=GloVe(name="6B", dim=300),
                 max_size=10000, min_freq=10)

# VOCABULARY
# print(TEXT.vocab.freqs) # freq
# print(TEXT.vocab.vectors) # vectors
# print(TEXT.vocab.stoi) # Index

train_iter, test_iter = data.BucketIterator.splits(
    (train, test), batch_size=128, device=-1, shuffle=True, repeat=False)  # -1 for cpu, None for gpu

# Not working (FROM BOOK)
# batch = next(iter(train_iter))

# print(batch.text)
# print()
# print(batch.label)

# This also not working (FROM Second solution)
for i in train_iter:
    print(i.text)
    print(i.label)

Here is the stack trace:

AttributeError                            Traceback (most recent call last)
<ipython-input-33-433ec3a2ca3c> in <module>()
7
8
----> 9 for i in train_iter:
10 print (i.text)
11 print (i.label)

/anaconda3/lib/python3.6/site-packages/torchtext/data/iterator.py in __iter__(self)
155 else:
156 minibatch.sort(key=self.sort_key, reverse=True)
--> 157 yield Batch(minibatch, self.dataset, self.device)
158 if not self.repeat:
159 return

/anaconda3/lib/python3.6/site-packages/torchtext/data/batch.py in __init__(self, data, dataset, device)
32 if field is not None:
33 batch = [getattr(x, name) for x in data]
---> 34 setattr(self, name, field.process(batch, device=device))
35
36 @classmethod

/anaconda3/lib/python3.6/site-packages/torchtext/data/field.py in process(self, batch, device)
199 """
200 padded = self.pad(batch)
--> 201 tensor = self.numericalize(padded, device=device)
202 return tensor
203

/anaconda3/lib/python3.6/site-packages/torchtext/data/field.py in numericalize(self, arr, device)
300 arr = [[self.vocab.stoi[x] for x in ex] for ex in arr]
301 else:
--> 302 arr = [self.vocab.stoi[x] for x in arr]
303
304 if self.postprocessing is not None:

/anaconda3/lib/python3.6/site-packages/torchtext/data/field.py in <listcomp>(.0)
300 arr = [[self.vocab.stoi[x] for x in ex] for ex in arr]
301 else:
--> 302 arr = [self.vocab.stoi[x] for x in arr]
303
304 if self.postprocessing is not None:

AttributeError: 'Field' object has no attribute 'vocab'
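
Reading the trace from the bottom up, the failing expression is self.vocab.stoi inside numericalize, so some field reached batching without a vocab attribute. One way to find out which field that is can be sketched as below. The sketch assumes the dataset exposes a name-to-field mapping as fields (as printed through train.fields in the snippet) and that each field carries a use_vocab flag, as the legacy torchtext Field does. The helper name fields_missing_vocab is hypothetical, and the stand-in objects exist only so the example runs without torchtext or the IMDB download.

```python
from types import SimpleNamespace


def fields_missing_vocab(dataset):
    """Return the names of fields that need a vocab but do not have one yet."""
    missing = []
    for name, field in dataset.fields.items():
        if field is None:
            continue  # a None field means the column is skipped when batching
        if getattr(field, "use_vocab", True) and not hasattr(field, "vocab"):
            missing.append(name)
    return missing


# Stand-ins (assumption) for the question's state after TEXT.build_vocab(train, ...):
# the text field has a vocab, the label field does not.
text_field = SimpleNamespace(use_vocab=True, vocab=object())
label_field = SimpleNamespace(use_vocab=True)
fake_train = SimpleNamespace(fields={"text": text_field, "label": label_field})

missing = fields_missing_vocab(fake_train)
print(missing)
```

With the real objects, the same check would presumably be called as fields_missing_vocab(train) just before the iterators are created.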

If I don't use BucketIterator, what else can I use to get similar output?

Best answer

You have not built the vocabulary for the LABEL field.

After TEXT.build_vocab(train, ...), run LABEL.build_vocab(train), and the rest will run.
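
Why a non-sequential label field needs a vocabulary at all: in the traceback, numericalize maps every value through self.vocab.stoi, and the IMDB labels are strings, so LABEL goes through the same lookup as TEXT. The sketch below is a simplified stand-in, not torchtext's implementation. MiniField and its methods are hypothetical names that only mimic the behaviour visible in the traceback, and the three labels are made-up sample data.

```python
from collections import Counter
from types import SimpleNamespace


class MiniField:
    """Hypothetical, simplified stand-in for a torchtext Field."""

    def build_vocab(self, values):
        # Give each distinct value an integer index, most frequent first.
        counts = Counter(values)
        stoi = {tok: i for i, (tok, _) in enumerate(counts.most_common())}
        self.vocab = SimpleNamespace(stoi=stoi)

    def numericalize(self, values):
        # Mirrors the failing line: self.vocab only exists once
        # build_vocab has been called on this particular field.
        return [self.vocab.stoi[v] for v in values]


labels = ["pos", "neg", "pos"]  # made-up sample labels
LABEL = MiniField()

# Before build_vocab: the same kind of AttributeError as in the question.
try:
    LABEL.numericalize(labels)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# After build_vocab: the string labels become integer ids.
LABEL.build_vocab(labels)
label_ids = LABEL.numericalize(labels)
print(label_ids)
```

Each field keeps its own vocabulary, which is why building one for TEXT does nothing for LABEL.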

Regarding python - BucketIterator throws 'Field' object has no attribute 'vocab', we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56251267/
