
Python BeautifulSoup: mixed-up columns when matching items in a table

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:00:16

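I am trying to use bs4 to select data from a table and store it in a CSV file, but the columns come out mixed up. I suspect the HTML-related logic in my if conditions is wrong.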

def grab_daily_data(self):
    url_template='http://boxofficemojo.com/movies/?page=daily&view=chart&id=%s.htm'
    #url=http://www.boxofficemojo.com/movies/?page=daily&view=chart&id=hungergames3.htm #Testing
    for val in self.mov_id:
        print 'parsing through: %s'%val
        url=url_template%val
        response = requests.get(url)
        soup = BeautifulSoup(response.content)

        alltables=soup.findAll("table", {"border":"0", "width":"95%"})
        in_mainbody=False
        i=0;counter=0;test_arr=[]; change=[]
        for table in alltables:
            rows=table.findAll('tr')
            for tr in rows:
                cols=tr.findAll('td')
                for td in cols:
                    test=td.text

                    if i>=17:
                        if counter%10==0:
                            print test
                            self.day_num.append(test)
                        counter+=1


                    i+=1

My problem is that the selected column shifts left by 1, and it shifts again every 7 rows.

Sample output: instead of printing:

1 
2
3
4
5
6
7
8
9
10...

it prints:

Fri
Sat
Sun
Mon
Tue
Wed
Thu

8
9
10
11
12
13
14

Best answer

The problem is that you are not reaching the appropriate table.
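A small simulation may help show why counting cells across the whole page drifts. This is only a sketch in Python 3 under stated assumptions: the rows are made-up stand-ins shaped like the printed rows further down (ten cells per data row, with a one-cell spacer row between weeks), the helper names are hypothetical, and the initial 17-cell offset from the question is left out for brevity.

```python
# Hypothetical stand-in for the scraped cells: two weeks of 10-cell rows
# separated by a one-cell spacer row.
def make_rows():
    days = ["Fri", "Sat", "Sun", "Mon", "Tue", "Wed", "Thu"]
    rows = []
    day_num = 1
    for week in range(2):
        if week > 0:
            rows.append([""])  # spacer row with a single empty cell
        for day in days:
            # The middle eight cells are placeholders; only the first
            # (day name) and last (day number) cells matter here.
            rows.append([day] + ["x"] * 8 + [str(day_num)])
            day_num += 1
    return rows


def every_tenth_cell(rows):
    """Mimic the question's approach: flatten all cells, keep every 10th."""
    flat = [cell for row in rows for cell in row]
    return flat[::10]


def last_cell_per_row(rows):
    """Index inside each row instead, skipping rows that are not 10 wide."""
    return [row[-1] for row in rows if len(row) == 10]


rows = make_rows()
drifted = every_tenth_cell(rows)
fixed = last_cell_per_row(rows)
print(drifted)
print(fixed)
```

In this simulation the flat counter returns the day names for the first week, then the empty spacer cell, then 8 through 14, which is the same pattern as the output in the question. Indexing within each row is unaffected by the spacer.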

Rely on the chart element instead: get its next table sibling and find all the rows inside it:

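from bs4 import BeautifulSoup
import requests

url = 'http://www.boxofficemojo.com/movies/?page=daily&view=chart&id=hungergames3.htm'

response = requests.get(url)
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(response.content, 'html.parser')

# The daily table is the first <table> sibling after the chart container
table = soup.find('div', id='chart_container').find_next_sibling('table')

# Skip the first row, then print the cell texts of every remaining row
for tr in table.find_all('tr')[1:]:
    print([td.text for td in tr('td')])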

Prints (under Python 2, hence the u'' prefixes):

[u'Fri', u'Nov. 21, 2014', u'1', u'$55,139,942', u'-', u'-', u'4,151', u'$13,284', u'$55,139,942', u'1']
[u'Sat', u'Nov. 22, 2014', u'1', u'$40,905,873', u'-25.8%', u'-', u'4,151', u'$9,854', u'$96,045,815', u'2']
[u'Sun', u'Nov. 23, 2014', u'1', u'$25,851,819', u'-36.8%', u'-', u'4,151', u'$6,228', u'$121,897,634', u'3']
[u'Mon', u'Nov. 24, 2014', u'1', u'$8,978,318', u'-65.3%', u'-', u'4,151', u'$2,163', u'$130,875,952', u'4']
[u'Tue', u'Nov. 25, 2014', u'1', u'$12,131,853', u'+35.1%', u'-', u'4,151', u'$2,923', u'$143,007,805', u'5']
[u'Wed', u'Nov. 26, 2014', u'1', u'$14,620,517', u'+20.5%', u'-', u'4,151', u'$3,522', u'$157,628,322', u'6']
[u'Thu', u'Nov. 27, 2014', u'1', u'$11,079,983', u'-24.2%', u'-', u'4,151', u'$2,669', u'$168,708,305', u'7']
[u'']
[u'Fri', u'Nov. 28, 2014', u'1', u'$24,199,442', u'+118.4%', u'-56.1%', u'4,151', u'$5,830', u'$192,907,747', u'8']
[u'Sat', u'Nov. 29, 2014', u'1', u'$21,992,225', u'-9.1%', u'-46.2%', u'4,151', u'$5,298', u'$214,899,972', u'9']
[u'Sun', u'Nov. 30, 2014', u'1', u'$10,780,932', u'-51.0%', u'-58.3%', u'4,151', u'$2,597', u'$225,680,904', u'10']
[u'Mon', u'Dec. 1, 2014', u'1', u'$2,635,435', u'-75.6%', u'-70.6%', u'4,151', u'$635', u'$228,316,339', u'11']
[u'Tue', u'Dec. 2, 2014', u'1', u'$3,160,145', u'+19.9%', u'-74.0%', u'4,151', u'$761', u'$231,476,484', u'12']
[u'Wed', u'Dec. 3, 2014', u'1', u'$2,332,453', u'-26.2%', u'-84.0%', u'4,151', u'$562', u'$233,808,937', u'13']
[u'Thu', u'Dec. 4, 2014', u'1', u'$2,317,894', u'-0.6%', u'-79.1%', u'4,151', u'$558', u'$236,126,831', u'14']
...
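Since the question mentions storing the result in a CSV file, here is one possible way to go from rows to CSV text. It is a sketch in Python 3, not part of the original answer. SAMPLE_HTML is a made-up, trimmed miniature of the page (four columns instead of ten), and extract_rows and rows_to_csv are hypothetical helper names. It also assumes the first row is a header and that spacer rows have fewer cells than the header.

```python
import csv
import io

from bs4 import BeautifulSoup

# Made-up miniature of the page layout: a chart container followed by the
# daily table, with a header row, data rows, and a one-cell spacer row.
SAMPLE_HTML = """
<div id="chart_container"></div>
<table>
  <tr><td>Day</td><td>Date</td><td>Gross</td><td>Day #</td></tr>
  <tr><td>Fri</td><td>Nov. 21, 2014</td><td>$55,139,942</td><td>1</td></tr>
  <tr><td>Sat</td><td>Nov. 22, 2014</td><td>$40,905,873</td><td>2</td></tr>
  <tr><td></td></tr>
  <tr><td>Fri</td><td>Nov. 28, 2014</td><td>$24,199,442</td><td>8</td></tr>
</table>
"""


def extract_rows(html):
    """Return the data rows of the table that follows the chart container."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("div", id="chart_container").find_next_sibling("table")
    header, *body = table.find_all("tr")
    width = len(header.find_all("td"))
    rows = []
    for tr in body:
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Keep only full-width rows; this drops the one-cell spacer rows.
        if len(cells) == width:
            rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; fields containing commas get quoted."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_HTML)
day_numbers = [row[-1] for row in rows]  # last column holds the day number
csv_text = rows_to_csv(rows)
print(day_numbers)
print(csv_text)
```

Because each value is picked by its position inside its own row, a spacer row cannot shift the columns. To write a real file, the same csv.writer call can be pointed at a file opened with newline="" instead of the in-memory buffer.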

This question and answer come from a similar question found on Stack Overflow: https://stackoverflow.com/questions/27793543/
