python - Why is my glob.glob loop not iterating through all the text files in a folder?

Reposted by: 太空宇宙 | Updated: 2023-11-03 12:04:31

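I am trying to use Python 3 to read data from a folder of text documents. Specifically, the data is a modified version of the LingSpam spam dataset. I expected the code I wrote to return all 1893 text document names, but it returns only the first 420 file names. I don't understand why it stops short of the full set of files. Any ideas?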

import glob
import os

if not os.path.exists('train'):  # download data
  from urllib.request import urlretrieve
  import tarfile
  urlretrieve('http://cs.iit.edu/~culotta/cs429/lingspam.tgz', 'lingspam.tgz')
  tar = tarfile.open('lingspam.tgz')
  tar.extractall()
  tar.close()
abc = []  # collected file names
for f in glob.glob("train/*.txt"):
  print(f)
  abc.append(f)
print(len(abc))

I have tried changing the glob argument, but still no luck.
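One way to narrow down a mismatch like this is to compare the number of `.txt` files on disk with the number stored in the archive itself. The sketch below is only an illustration: the helper names `count_txt_on_disk` and `count_txt_in_archive` are hypothetical, and it assumes the archive keeps its files under a top-level folder such as `train`.

```python
import glob
import os
import posixpath
import tarfile


def count_txt_on_disk(folder):
    """Count the .txt files that sit directly inside `folder` on disk."""
    return len(glob.glob(os.path.join(folder, "*.txt")))


def count_txt_in_archive(archive_path, folder):
    """Count the .txt members stored directly under `folder` in the archive.

    Tar member names use forward slashes, so posixpath is used to split them.
    """
    with tarfile.open(archive_path) as tar:
        return sum(
            1
            for member in tar.getmembers()
            if member.isfile()
            and member.name.endswith(".txt")
            and posixpath.dirname(posixpath.normpath(member.name)) == folder
        )
```

If `count_txt_in_archive('lingspam.tgz', 'train')` and `count_txt_on_disk('train')` disagree, the folder on disk probably did not come from a complete extraction of that archive.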

Edit: Apparently my code works for everyone except me. Here is my output

Best answer

Got it to work! The problem was

if not os.path.exists('train'):  # download data

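To check my output, I had already downloaded the files onto my computer by hand. Since this line checks whether the folder exists, and it did, the download and extraction step was skipped and the loop only saw the files that were already there. I deleted the files from my machine and it works fine now, though I suspect that running

  from urllib.request import urlretrieve
  import tarfile
  urlretrieve('http://cs.iit.edu/~culotta/cs429/lingspam.tgz', 'lingspam.tgz')
  tar = tarfile.open('lingspam.tgz')
  tar.extractall()
  tar.close()

without the if statement would give the same result.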
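A more defensive guard could check the number of files rather than only whether the folder exists. The following is a minimal sketch under stated assumptions, not the original poster's code: `ensure_extracted` and its parameters are hypothetical names, the archive is assumed to be downloaded already, and the expected count (1893 in the question) is supplied by the caller.

```python
import glob
import os
import shutil
import tarfile


def ensure_extracted(archive_path, dest, folder_name, expected_count):
    """Return the sorted .txt paths in dest/folder_name.

    If the folder is missing or holds the wrong number of .txt files,
    it is removed and the archive is extracted again into `dest`.
    """
    folder = os.path.join(dest, folder_name)
    pattern = os.path.join(folder, "*.txt")
    if len(glob.glob(pattern)) != expected_count:
        # A stale or partial folder would otherwise hide the problem.
        shutil.rmtree(folder, ignore_errors=True)
        with tarfile.open(archive_path) as tar:
            tar.extractall(path=dest)
    return sorted(glob.glob(pattern))
```

For the setup in the question, a call might look like `ensure_extracted('lingspam.tgz', '.', 'train', 1893)`. Because a mismatch deletes the folder before extracting, this should only be pointed at a folder that is safe to remove.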

A similar question to "python - Why is my glob.glob loop not iterating through all the text files in a folder?" can be found on Stack Overflow: https://stackoverflow.com/questions/36318619/
