
python - How to extract data from a web page using Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:26:58

Can someone point out what I am doing wrong?

Enter Item name:Rockfish
Traceback (most recent call last):
  File "C:\Users\partn_000\Desktop\sarvesh\Python Source Code\working\jellyneoscraper.py", line 45, in <module>
    search(br, ITEMNAME)
  File "C:\Users\partn_000\Desktop\sarvesh\Python Source Code\working\jellyneoscraper.py", line 33, in search
    increment = increment[0]
IndexError: list index out of range

Here is the code I wrote:

#Library Imports
import mechanize
import cookielib
import re
import sys
import time
import os.path
from operator import itemgetter
import ctypes
ctypes.windll.kernel32.SetConsoleTitleA("test")


def init_browser():
    br = mechanize.Browser()
    br.set_handle_equiv(True)
    br.set_handle_redirect(True)
    br.set_handle_referer(True)
    br.set_handle_robots(False)
    br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
    br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36')]
    cj = cookielib.LWPCookieJar()
    br.set_cookiejar(cj)

    return br


def search(br, ITEMNAME):
    datapage = br.open('http://items.jellyneo.net/index.php?go=show_items&name=' +ITEMNAME +'&name_type=exact&desc=&cat=0&specialcat=0&status=0&rarity=0&sortby=name&numitems=20')
    f = open('search.html', 'w')
    f.write(datapage.read())
    f.close()
    value = re.findall('style="font-weight:bold;">(.+) NP</a></td>"',datapage.read()) #(.+) is replaced in place of required value
    value = value[0].replace(",","")
    value = int(value)
    print value
#http://items.jellyneo.net/index.php?go=show_items&name=Rockfish&name_type=exact&desc=&cat=0&specialcat=0&status=0&rarity=0&sortby=name&numitems=20


#('style="font-weight:bold;"> (.+) NP</a>"',search.read())


ITEMNAME = raw_input('Enter Item name:eg. Rockfish')

br = init_browser()
search(br, ITEMNAME)

Best answer

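In your search function, you read the entire response and save it to a file, then call datapage.read() a second time to run the regular expression. By that point the read position is already at the end of the response, so the second read returns an empty string, re.findall finds no matches, and indexing the resulting empty list raises the IndexError. Call datapage.seek(0) before reading again, like this: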

datapage = br.open('http://items.jellyneo.net/index.php?go=show_items&name=' +ITEMNAME +'&name_type=exact&desc=&cat=0&specialcat=0&status=0&rarity=0&sortby=name&numitems=20')
f = open('search.html', 'w')
f.write(datapage.read())
f.close()
datapage.seek(0)
value = re.findall('style="font-weight:bold;">(.+) NP</a></td>"',datapage.read()) #(.+) is replaced in place of required value
value = value[0].replace(",","")
value = int(value)
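The read-position behaviour described above can be reproduced without any network access. The following is only a minimal sketch, written for Python 3 and assuming that io.StringIO is a fair stand-in for the mechanize response object; SAMPLE_HTML, PRICE_PATTERN and extract_price are hypothetical names invented for illustration, and the sample markup is a guess at the page structure rather than the real JellyNeo HTML. It also shows an alternative to seek(0): read the body once into a variable and reuse it, with a guard so that a missing match does not raise IndexError.

```python
import io
import re

# Hypothetical markup standing in for the real page (an assumption).
SAMPLE_HTML = '<td><a href="#" style="font-weight:bold;">1,250 NP</a></td>'

# Hypothetical pattern: non-greedy capture of the text before " NP".
PRICE_PATTERN = r'style="font-weight:bold;">(.+?) NP</a></td>'


def extract_price(html):
    """Return the first price found as an int, or None if nothing matches."""
    matches = re.findall(PRICE_PATTERN, html)
    if not matches:
        # Guard against the empty list that caused the IndexError.
        return None
    return int(matches[0].replace(",", ""))


# io.StringIO plays the role of the response returned by br.open(...).
stream = io.StringIO(SAMPLE_HTML)

first_read = stream.read()    # consumes the whole stream
second_read = stream.read()   # position is at the end, so this is empty
stream.seek(0)                # rewind to the start
third_read = stream.read()    # full content is available again

# Alternative to seek(0): keep the text from the first read and reuse it
# for both the file write and the regex search.
price = extract_price(first_read)
missing = extract_price(second_read)
```

Reading once and reusing the string avoids depending on whether the response object supports seek at all, which is why it may be the simpler choice when the page is small enough to hold in memory.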

Regarding "python - How to extract data from a web page using Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/32426451/
