
python - Can't get PythonAnywhere to scrape the web for me

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:48:51

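I've been experimenting on PythonAnywhere, trying to get some Python working on a web server. I originally switched over from Arvixe because they run 2.4, and because the name PythonAnywhere was so enticing.

My app consists of two files: phones.py and SearchPhone.py. Together they are supposed to search craigslist for phone prices.

I tested it locally in 2.7 and it worked fine, generating an html page (celly.html) with a table and all the prices. When I uploaded it, it still generates the html just fine, but refuses to add anything to my price list ([intprices]).

My suspicions: (a) since it works fine locally, PythonAnywhere isn't allowing it to talk to craigslist; or (b) because I'm doing this like a caveman instead of using a microframework, PythonAnywhere is rejecting me; or (c) I'm blind to my own mistake and am missing something obvious.

My python scripts live in /home/tseymour/mysite, and the html is generated at /home/tseymour/mysite/static/celly.html. The file is served at http://tseymour.pythonanywhere.com/static/celly.html.

You'll notice that all my cells are filled with "N/A", which means an IndexError is being raised inside the "try:" in SearchPhone.py. That means my list is not being populated!

But why?! I'm sure it's because I'm a PythonAnywhere n00b.

Please advise.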

SearchPhone.py

from BeautifulSoup import BeautifulSoup
import urllib
import re

def SearchPhone(phone):

    y = "http://losangeles.craigslist.org/search/moa?query=" + phone + "+-%22buy%22+-%22fix%22+-%22unlock%22+-%22broken%22+-%22cracked%22+-%22parts%22&srchType=T&minAsk=&maxAsk="

    site = urllib.urlopen(y)
    html = site.read()
    site.close()
    soup = BeautifulSoup(html)

    prices = soup.findAll("span", {"class":"itempp"})
    prices = [str(j).strip('<span class="itempp"> $</span>') for j in prices]

    for k in prices[:]:
        if k == '': #left price blank
            prices.remove(k)
        elif int(k) <= 75: #$75 or less: probably a service (or not true)
            prices.remove(k)
        elif int(k) >= 999: #probably not true
            prices.remove(k)

    #Find Average Price
    intprices = []
    newprices = prices[:]
    total = 0
    for k in newprices:
        total += int(k)
        intprices.append(int(k))

    intprices = sorted(intprices)

    try:
        #drop the lowest and highest price; raises IndexError if too few remain
        del intprices[0]
        del intprices[-1]

        avg = total/len(newprices)
        low = intprices[0]
        high = intprices[-1]

        if len(intprices) % 2 == 1:
            median = intprices[(len(intprices)+1)/2-1]
        else:
            lower = intprices[len(intprices)/2-1]
            upper = intprices[len(intprices)/2]
            median = (float(lower + upper)) / 2

        namestr = str(phone)
        medstr = "Median: $" + str(median)
        avgstr = "Average: $" + str(avg)
        lowstr = "Low: $" + str(intprices[0])
        highstr = "High: $" + str(intprices[-1])
        samplestr = "# of samples: " + str(len(intprices))
        linestr = "-------------------------------"

    except IndexError:
        namestr = str(phone)
        medstr = "N/A"
        avgstr = "N/A"
        lowstr = "N/A"
        highstr = "N/A"
        samplestr = "N/A"
        linestr = "-------------------------------"

    return (namestr, medstr, avgstr, lowstr, highstr, samplestr, linestr)
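
The "N/A" rows follow directly from the try block above: deleting the lowest and highest entries and then indexing the list raises IndexError whenever fewer than three prices survive the filters, which includes the case where nothing was scraped at all. Below is a minimal sketch that makes that condition explicit instead of relying on the exception. Note that summarize is a hypothetical helper, not part of the original script, and it leaves out the average.

```python
def summarize(prices):
    """Trim the lowest and highest price, then report median, low and high.

    Hypothetical helper (not in the original script). Returns None when
    fewer than three prices are available, which is the situation the
    original code reports as "N/A".
    """
    ordered = sorted(prices)
    trimmed = ordered[1:-1]  # drop one lowest and one highest entry
    if not trimmed:
        return None

    mid = len(trimmed) // 2
    if len(trimmed) % 2 == 1:
        median = trimmed[mid]
    else:
        median = (trimmed[mid - 1] + trimmed[mid]) / 2.0

    return {
        "median": median,
        "low": trimmed[0],
        "high": trimmed[-1],
        "samples": len(trimmed),
    }
```

Calling summarize([]) returns None, so an empty scrape is detected up front and can be logged separately from a phone that simply has few listings.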

phones.py

from SearchPhone import SearchPhone

phones = ["Iphone 4", "Iphone 5","Galaxy s3", "Galaxy s2", "LG Lucid", "LG Esteem", "HTC One S", "Droid 4",
"Droid RAZR MAXX", "HTC EVO", "Galaxy Nexus", "LG Optimus 2", "LG Ignite",
"Galaxy Note", "HTC Amaze", "HTC Rezound", "HTC Vivid", "HTC Rhyme", "Motorola Photon",
"Motorola Milestone", "myTouch slide", "HTC Status", "Droid 3", "HTC Evo 3d", "HTC Wildfire",
"LG Optimus 3d", "HTC ThunderBolt", "Incredible 2", "Kyocera Echo", "Galaxy S 4g",
"HTC Inspire", "LG Optimus 2x", "Samsung Gem", "HTC Evo Shift", "Nexus S", "LG Axis", "Droid 2",
"G2", "Droid x", "Droid Incredible"
]

f = open('/home/tseymour/mysite/static/celly.html','w')


f.write("""<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Celly Blue Book</title>
</head>

<body>
</body>
</html>
""")

#table
f.write('<table width="100%" border="1">')
for x in phones:
    print "SEarchphone0"
    y = SearchPhone(x)
    print "SEarchphone"
    f.write( "\t<tr>")
    f.write( "\t\t<td>" + str(y[0]) + "</td>")
    f.write( "\t\t<td>" + str(y[1]) + "</td>")
    f.write( "\t\t<td>" + str(y[2]) + "</td>")
    f.write( "\t\t<td>" + str(y[3]) + "</td>")
    f.write( "\t\t<td>" + str(y[4]) + "</td>")
    f.write( "\t</tr>")

f.write('</table>')

f.close()

Also, I did upload BeautifulSoup, just in case.

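Best answer

PythonAnywhere developer here. You don't say whether you're using a free or a paid PythonAnywhere account, but if it's a free one then I think you're running into our whitelist. For free accounts, we only allow access to a specific set of sites, because people were using us to do bad things.

We put sites on the whitelist, so that free accounts can use them, if they have an official, publicly accessible API, which unfortunately Craigslist doesn't. Quite the opposite, in fact.

If you sign up for a paid account then you can probably do what you want, but if the article I just linked to is correct, you may want to make sure you have a good lawyer...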
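
If the whitelist is the cause, the script is probably parsing an error page rather than search results, which would explain why it fails without any traceback: as far as I know, Python 2's urllib.urlopen hands back the body of an error response instead of raising. Here is a minimal sketch of making that failure visible. It is written for Python 3's urllib.request, which is an assumption, since the question uses 2.7. The names fetch_or_explain and opener are hypothetical and introduced only for illustration, and how a blocked request actually surfaces on a free account (HTTP error status or connection failure) is also an assumption.

```python
import urllib.error
import urllib.request


def fetch_or_explain(url, opener=urllib.request.urlopen):
    """Return (body, None) on success or (None, reason) on failure.

    Hypothetical helper: `opener` defaults to urllib.request.urlopen and
    can be replaced by a stub so the logic is testable offline.
    """
    try:
        response = opener(url)
        try:
            return response.read(), None
        finally:
            response.close()
    except urllib.error.HTTPError as err:
        # HTTPError is a subclass of URLError, so it must be caught first.
        return None, "HTTP %d %s" % (err.code, err.reason)
    except urllib.error.URLError as err:
        return None, "connection failed: %s" % (err.reason,)
```

With a helper like this, the scraper could print the reason and skip parsing when the second element is not None, so a blocked request is reported as such instead of ending up as a table full of "N/A".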

Regarding "python - Can't get PythonAnywhere to scrape the web for me", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/13700415/
