
Python scraping through xml prints empty brackets

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:04:14

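I am trying to extract a few characters from a website using lxml: I build a tree from the page and then query it with an xpath. I used Google Chrome to get the correct xpath, but the script prints empty brackets.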

#imports
from lxml import html
import requests

#get magicseaweed Scripps report
msScrippsPage = requests.get("""http://magicseaweed.com/Scripps-Pier-
La-Jolla-Surf-Report/296/.html""")

#make tree from site
msScrippsTree = html.fromstring(msScrippsPage.content)

#get wave size
msScrippsWave = msScrippsTree.xpath("""/html/body/div[2]/div[5]/div/div[1]/div[2]/div[2]/div/div[2]/div[1]/div/div[1]/div/div/div/div/div[1]/div/div[2]/ul[1]/li[1]/text()""")

print 'ms SCripps: ', msScrippsWave

The output in the terminal is 'msScripps: [ ]'

Best answer

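You should not have a line break in your URL. A triple-quoted string keeps the line break as part of the URL, so the request does not fetch the page you expect. When the URL is on a single line, your xpath works.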

msScrippsPage = requests.get("""http://magicseaweed.com/Scripps-Pier-La-Jolla-Surf-Report/296/.html""")
# after building the tree and running the xpath from the question:
print msScrippsWave
[' 0.4-0.6', ' ']
########################################
url = """http://magicseaweed.com/Scripps-Pier-
La-Jolla-Surf-Report/296/.html"""
print repr(url)
'http://magicseaweed.com/Scripps-Pier-\nLa-Jolla-Surf-Report/296/.html'
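
If the URL is too long for one source line, one option is to rely on Python's concatenation of adjacent string literals instead of a triple-quoted string. The sketch below is illustrative only: the names `broken_url` and `fixed_url` are made up for this example, it makes no network request, and it uses the Python 3 `print()` function, whereas the snippets in this post are Python 2.

```python
# A triple-quoted literal keeps the line break as part of the string.
broken_url = """http://magicseaweed.com/Scripps-Pier-
La-Jolla-Surf-Report/296/.html"""

# Adjacent string literals inside parentheses are joined with nothing
# in between, so the URL can span several source lines safely.
fixed_url = (
    "http://magicseaweed.com/Scripps-Pier-"
    "La-Jolla-Surf-Report/296/.html"
)

print(repr(broken_url))    # the repr shows the embedded \n
print("\n" in broken_url)  # True
print("\n" in fixed_url)   # False
```

Passing `fixed_url` to `requests.get` is then equivalent to writing the URL on a single line.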

Edit: added a complete example

from lxml import html
import requests

msScrippsPage = requests.get("""http://magicseaweed.com/Scripps-Pier-La-Jolla-Surf-Report/296/.html""")
msScrippsTree = html.fromstring(msScrippsPage.content)
msScrippsWave = msScrippsTree.xpath("""/html/body/div[2]/div[5]/div/div[1]/div[2]/div[2]/div/div[2]/div[1]/div/div[1]/div/div/div/div/div[1]/div/div[2]/ul[1]/li[1]/text()""")
print 'ms SCripps: ', msScrippsWave
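
As a defensive measure, one could also check a URL for stray whitespace before handing it to `requests.get`, so this kind of mistake fails loudly instead of silently returning an empty list. This is only a sketch: `require_clean_url` is a hypothetical helper written for this example, not part of requests or lxml, and the code uses Python 3 syntax.

```python
def require_clean_url(url):
    """Return url unchanged, or raise ValueError if it contains whitespace."""
    for position, char in enumerate(url):
        if char.isspace():
            raise ValueError(
                "URL contains whitespace %r at index %d" % (char, position)
            )
    return url


good = "http://magicseaweed.com/Scripps-Pier-La-Jolla-Surf-Report/296/.html"
bad = """http://magicseaweed.com/Scripps-Pier-
La-Jolla-Surf-Report/296/.html"""

print(require_clean_url(good) == good)  # True

try:
    require_clean_url(bad)
except ValueError as error:
    # The message names the offending character and where it sits.
    print(error)
```

In the complete example above, the call would become `requests.get(require_clean_url(url))` for whatever `url` string is being fetched.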

Regarding "Python scraping through xml prints empty brackets", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39987672/
