
python - How do I reinstall lxml?

Reposted. Author: 太空狗. Updated: 2023-10-29 16:55:18

Python version and platform

  • Python 2.7.5
  • Mac OS X 10.7.5
  • BeautifulSoup 4.2.1

I am working through the BeautifulSoup tutorial, but when I try to parse an XML page with the lxml library I get the following error:

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested:
lxml,xml. Do you need to install a parser library?

I am sure I have installed lxml by every available method: easy_install, pip, port, and so on. To check whether lxml is installed, I added this line to my code:

import lxml

Python gets past this line without complaint and then shows the same error message as before, at the same line.

So I am fairly sure lxml is installed, just not correctly. I decided to uninstall lxml and then reinstall it the "right" way. But when I type

easy_install -m lxml

I get the following output instead:

Searching for lxml
Best match: lxml 3.2.1
Processing lxml-3.2.1-py2.7-macosx-10.6-intel.egg

Using /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml-3.2.1-py2.7-macosx-10.6-intel.egg

Because this distribution was installed --multi-version, before you can
import modules from this package in an application, you will need to
'import pkg_resources' and then use a 'require()' call similar to one of
these examples, in order to select the desired version:

pkg_resources.require("lxml") # latest installed version
pkg_resources.require("lxml==3.2.1") # this exact version
pkg_resources.require("lxml>=3.2.1") # this version or higher

Processing dependencies for lxml
Finished processing dependencies for lxml

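So I don't know how to proceed with the uninstall. I have looked through many posts about this problem on Google but still can't find anything useful.

Here is my source code: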

import sys

import mechanize
from bs4 import BeautifulSoup
import lxml  # imported only to check that the package can be found


class count:
    def __init__(self, protein):
        self.proteinCode = protein
        self.br = mechanize.Browser()

    def first_search(self):
        # Test 0: parse the page, asking bs4 for a builder with both features
        soup = BeautifulSoup(self.br.open("http://www.ncbi.nlm.nih.gov/protein/21225921?report=genbank&log$=prottop&blast_rank=1&RID=YGJHMSET015"), ['lxml', 'xml'])
        return


if __name__ == '__main__':
    proteinCode = sys.argv[1]
    gogogo = count(proteinCode)

Questions

  1. How do I uninstall lxml?
  2. How do I install lxml "correctly", and how can I tell that it is installed correctly?

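Best answer

I am using BeautifulSoup 4.3.2 on OS X 10.6.8, and I also ran into an incorrectly installed lxml. Here is what I found.

First, check this related question: Removed MacPorts, now Python is broken

Next, to see which BeautifulSoup 4 builders are installed, try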

>>> import bs4
>>> bs4.builder.builder_registry.builders

If your preferred builder is not in that list, it is not installed, and you will get the error shown above ("Couldn't find a tree builder...").
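The raw list of classes can be hard to read, so here is a rough sketch that prints each registered builder next to the features it advertises. The helper name `describe_builders` is my own invention, and it reads the `NAME` and `features` class attributes through `getattr` on the assumption that a given bs4 version might not define them on every builder.

```python
def describe_builders(builders):
    """Return (name, sorted features) pairs for a list of tree builder classes."""
    rows = []
    for builder in builders:
        # NAME and features are assumed to be class attributes of bs4 builders;
        # the defaults keep this working if either one is missing.
        name = getattr(builder, "NAME", builder.__name__)
        features = sorted(str(f) for f in getattr(builder, "features", []))
        rows.append((name, features))
    return rows


if __name__ == "__main__":
    try:
        import bs4.builder
    except ImportError as exc:
        print("bs4 is not importable: %s" % exc)
    else:
        registered = bs4.builder.builder_registry.builders
        for name, features in describe_builders(registered):
            print("%s: %s" % (name, ", ".join(features)))
```

If no line mentions both `lxml` and `xml`, a request for `['lxml', 'xml']` has nothing to match.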

Also, being able to import lxml does not mean everything is fine.

Try

>>> import lxml
>>> import lxml.etree
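A slightly fuller version of the same check, as a sketch: it reports whether the compiled `lxml.etree` module loads, which lxml version it is, and which directory it was imported from. That last detail helps when several installs (easy_install, pip, port) are competing. The function name `lxml_status` and the shape of the returned dictionary are my own choices, not part of lxml.

```python
def lxml_status():
    """Report whether lxml's compiled etree module loads, and from where."""
    status = {"ok": False, "error": None, "version": None, "path": None}
    try:
        import lxml
        from lxml import etree  # the C extension; a broken install tends to fail here
    except ImportError as exc:
        status["error"] = str(exc)
        return status
    status["ok"] = True
    # LXML_VERSION is a tuple of integers exposed by lxml.etree.
    status["version"] = ".".join(str(part) for part in etree.LXML_VERSION)
    status["path"] = getattr(lxml, "__file__", None)
    return status


if __name__ == "__main__":
    print(lxml_status())
```

If `ok` is false, the `error` text is the ImportError message that a bare `import lxml` would not have surfaced.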

To see what is going on, go to the bs4 installation and unpack the egg (tar -xvzf). Look at the bs4.builder package. Inside it you should see files such as _lxml.py and _html5lib.py. So you can also try

>>> import bs4.builder._htmlparser
>>> import bs4.builder._lxml
>>> import bs4.builder._html5lib
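These imports can also be run in one go with the full traceback kept for each failure. The following is a minimal sketch; `try_imports` is a hypothetical helper name, and the module list assumes the underscore-prefixed file names that `builder/__init__.py` itself imports.

```python
import importlib
import traceback


def try_imports(module_names):
    """Import each module by name; map name -> None on success, traceback text on failure."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:
            # Keep the whole traceback: the underlying cause (for example a
            # missing shared library) is usually on its last lines.
            results[name] = traceback.format_exc()
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    names = [
        "bs4.builder._htmlparser",
        "bs4.builder._lxml",
        "bs4.builder._html5lib",
    ]
    for name, error in try_imports(names).items():
        print("%s: %s" % (name, "ok" if error is None else "FAILED"))
        if error:
            print(error)
```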

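If something is wrong, you will see why a particular module cannot be loaded. Note how the end of builder/__init__.py loads all of these modules and silently ignores any that fail to import: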

# Builders are registered in reverse order of priority, so that custom
# builder registrations will take precedence. In general, we want lxml
# to take precedence over html5lib, because it's faster. And we only
# want to use HTMLParser as a last result.
from . import _htmlparser
register_treebuilders_from(_htmlparser)
try:
    from . import _html5lib
    register_treebuilders_from(_html5lib)
except ImportError:
    # They don't have html5lib installed.
    pass
try:
    from . import _lxml
    register_treebuilders_from(_lxml)
except ImportError:
    # They don't have lxml installed.
    pass
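To make the consequence of that pattern concrete, here is a toy model of it. This is not bs4's real registry: `MiniRegistry`, `build_registry`, and the feature lists are simplified stand-ins I made up for illustration. It shows only that a builder whose import fails is silently left out, so the failure surfaces later as a lookup that finds nothing.

```python
class MiniRegistry:
    """Toy stand-in for a builder registry (not bs4's implementation)."""

    def __init__(self):
        self.builders = []  # most recently registered first

    def register(self, name, features):
        # Later registrations take precedence, so insert at the front.
        self.builders.insert(0, (name, set(features)))

    def lookup(self, *features):
        """Return the first builder offering every requested feature, or None."""
        wanted = set(features)
        for name, offered in self.builders:
            if wanted <= offered:
                return name
        return None


def build_registry(importable):
    """Register builders in reverse priority order, skipping failed 'imports'."""
    registry = MiniRegistry()
    registry.register("html.parser", ["html"])
    optional = [
        ("html5lib", ["html5lib", "html"]),
        ("lxml", ["lxml", "html", "xml"]),
    ]
    for name, features in optional:
        try:
            if name not in importable:
                raise ImportError(name)  # simulates a broken or missing install
            registry.register(name, features)
        except ImportError:
            pass  # swallowed, exactly like the try/except blocks above
    return registry


if __name__ == "__main__":
    print(build_registry({"lxml"}).lookup("lxml", "xml"))
    print(build_registry(set()).lookup("lxml", "xml"))
```

In the second call nothing was registered for `lxml`, so the lookup comes back empty. That empty lookup is the situation the "Couldn't find a tree builder" error reports.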

Regarding python - How do I reinstall lxml?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17766725/
