python - Unwanted Python feedparser instantiation relics


Reposted by: 太空宇宙. Updated: 2023-11-04 06:31:29

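Question: How do I terminate an instance, or otherwise make sure that I am creating a fresh instance of the Python Universal Feed Parser (feedparser)?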

Info:

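I am working on a program that downloads and catalogs a large number of blogs. It works well except for one unfortunate bug. My code takes a list of blog URLs and runs them through a for loop. On each pass it picks one URL and hands it to a separate class, which manages downloading the data, extracting it, and saving it to a file.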

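The first URL works fine: the whole blog is downloaded and saved to a file. The second blog it downloads, however, also contains all of the data from the first blog, and I have no idea why.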

Code snippet:

class BlogHarvester:
  def __init__(self,folder):
    f = open(folder,'r')
    stop = folder[len(folder)-1]
    while stop != '/':
        folder = folder[0:len(folder)-1]
        stop = folder[len(folder)-1]
    blogs = []
    for line in f:
        blogs.append(line)

    for herf in blogs:
        blog = BlogParser(herf)
        sPath = ""
        uid = newguid()##returns random hash.
        sPath = uid
        sPath = sPath + " - " + blog.posts[0].author[1:5] + ".blog"
        print sPath
        blog.storeAsFile(sPath)

class BlogParser:
  def __init__(self, blogherf='null', path='null', posts = []):
    self.blogherf = blogherf

    self.blog = feedparser.parse(blogherf)
    self.path = path
    self.posts = posts
    if blogherf != 'null':
        self.makeList()
    elif path != 'null':
        self.loadFromFile()

class BlogPeices:
  def __init__(self,title,author,post,date,publisher,rights,comments):
    self.author = author
    self.title = title
    self.post = post
    self.date = date
    self.publisher = publisher
    self.rights = rights
    self.comments = comments
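
One likely cause, judging only from the snippet above, is the `posts = []` default in `BlogParser.__init__`. Python evaluates a default argument once, when the function is defined, so every `BlogParser` created without an explicit `posts` argument shares the same list. The following is a minimal sketch of that behavior and of the usual `None` default workaround. It does not use feedparser at all: the class names, the example URLs, and the body of `makeList()` are hypothetical stand-ins, on the assumption that the real `makeList()` (not shown in the question) appends parsed entries to `self.posts`.

```python
class SharedPostsParser(object):
    """Mirrors the BlogParser signature: the [] default is created only once."""

    def __init__(self, blogherf='null', posts=[]):
        self.blogherf = blogherf
        # Every instance built without `posts` receives the SAME list object.
        self.posts = posts
        self.makeList()

    def makeList(self):
        # Hypothetical stand-in for the real makeList(); assumed to append
        # each parsed entry to self.posts.
        self.posts.append('post from ' + self.blogherf)


class FreshPostsParser(object):
    """Same class, but with None as the default so each instance gets its own list."""

    def __init__(self, blogherf='null', posts=None):
        self.blogherf = blogherf
        # Build a new list per instance unless the caller supplies one.
        self.posts = [] if posts is None else posts
        self.makeList()

    def makeList(self):
        self.posts.append('post from ' + self.blogherf)


shared_first = SharedPostsParser('http://example.com/a')
shared_second = SharedPostsParser('http://example.com/b')
fresh_first = FreshPostsParser('http://example.com/a')
fresh_second = FreshPostsParser('http://example.com/b')

print(len(shared_second.posts))  # 2: the first blog's post is still in the list
print(len(fresh_second.posts))   # 1: only the second blog's post
```

If this is indeed what happens, no feedparser instance needs to be terminated: each `feedparser.parse(...)` call in the snippet already stores its result in a separate `self.blog` attribute, and only the `posts` list carries over between `BlogParser` objects.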