python mechanize - retrieve a file from an aspnetForm submitControl that triggers a file download

Reposted. Author: 太空狗. Updated: 2023-10-30 01:23:30

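How can I use python mechanize to retrieve a file from an aspnetForm submitControl that triggers an Excel file download, when I don't know the file URL or the file name?

URL of the site with the Excel file: http://www.ncysaclassic.com/TTSchedules.aspx?tid=NCFL&year=2012&stid=NCFL&syear=2012&div=U11M01

I am trying to download the file served by the "Print Excel" "button".

So far I have: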

import mechanize

br = mechanize.Browser()
r = br.open('http://www.ncysaclassic.com/TTSchedules.aspx?tid=NCFL&year=2012&stid=NCFL&syear=2012&div=U11M01')
html = r.read()

# Show the html title
print br.title()

# Show the available forms
for f in br.forms():
    print f

br.select_form('aspnetForm')
print '\n\nSubmitting...\n'
br.submit("ctl00$ContentPlaceHolder1$btnExtractSched")

print 'Response...\n'
print br.response().info()
print br.response().read

print 'still alive...\n'

for prop, value in vars(br.response()).iteritems():
    print 'Property:', prop, ', Value: ', value

print 'myfile...\n'

myfile = br.response().read

I get this output:

Submitting...

Response...

Content-Type: application/vnd.ms-excel
Last-Modified: Thu, 27 Sep 2012 20:19:10 GMT
Accept-Ranges: bytes
ETag: W/"6e27615aed9ccd1:0"
Server: Microsoft-IIS/7.5
X-Powered-By: ASP.NET
Date: Thu, 27 Sep 2012 20:19:09 GMT
Connection: close
Content-Length: 691200

<bound method response_seek_wrapper.read of <response_seek_wrapper at 0x2db5248L whose wrapped object = <closeable_response at 0x2e811c8L whose fp = <socket._fileobject object at 0x0000000002D79930>>>>
still alive...

Property: _headers , Value: Content-Type: application/vnd.ms-excel
Last-Modified: Thu, 27 Sep 2012 20:19:10 GMT
Accept-Ranges: bytes
ETag: W/"6e27615aed9ccd1:0"
Server: Microsoft-IIS/7.5
X-Powered-By: ASP.NET
Date: Thu, 27 Sep 2012 20:19:09 GMT
Connection: close
Content-Length: 691200

Property: _seek_wrapper__read_complete_state , Value: [False]
Property: _seek_wrapper__have_readline , Value: True
Property: _seek_wrapper__is_closed_state , Value: [False]
Property: _seek_wrapper__pos , Value: 0
Property: wrapped , Value: <closeable_response at 0x2e811c8L whose fp = <socket._fileobject object at 0x0000000002D79930>>
Property: _seek_wrapper__cache , Value: <cStringIO.StringO object at 0x0000000002E8B0D8>

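It seems I am very close... note the Content-Type: application/vnd.ms-excel

I just don't know what to do next. Where is my file, and how do I get a handle on it and save it locally for later access?

Update:

I used dir() to get a list of the methods/attributes of response(), then tried a few of them...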

print '\ndir(br.response())\n'
for each in dir(br.response()):
    print each

print '\nresponse info...\n'
print br.response().info()

print '\nresponse geturl\n'
print br.response().geturl()

I get this output...

dir(br.response())

__copy__
__doc__
__getattr__
__init__
__iter__
__module__
__repr__
__setattr__
_headers
_seek_wrapper__cache
_seek_wrapper__have_readline
_seek_wrapper__is_closed_state
_seek_wrapper__pos
_seek_wrapper__read_complete_state
close
get_data
geturl
info
invariant
next
read
readline
readlines
seek
set_data
tell
wrapped
xreadlines

response info...

Date: Thu, 27 Sep 2012 20:55:02 GMT
ETag: W/"fa759b5df29ccd1:0"
Server: Microsoft-IIS/7.5
Connection: Close
Content-Type: application/vnd.ms-excel
X-Powered-By: ASP.NET
Accept-Ranges: bytes
Last-Modified: Thu, 27 Sep 2012 20:55:03 GMT
Content-Length: 691200


response geturl

http://www.ncysaclassic.com/photos/pdftemp/ScheduleExcel165502.xls

I think the file is already in my br.response. I just don't know how to extract it! Please help.

Best answer

# fill out the form, then submit it
response = br.submit()
# open in binary mode ('wb'): an Excel file is binary data
fileobj = open('filename', 'wb')
fileobj.write(response.read())
fileobj.close()
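
The question's output already contains what is needed. `geturl()` returns the final URL of the generated file, and `read()` returns its bytes once it is actually called. `br.response().read` without parentheses only prints the bound method. Below is a minimal sketch of a helper that takes the file name from that URL and saves the body in binary mode. The names `save_response`, `directory` and `default_name` are hypothetical and are not part of mechanize. The helper assumes only that the response object offers `read()` and `geturl()`, both of which appear in the `dir()` listing above. It is written to run on Python 2 (as in the question) and Python 3.

```python
import os
import posixpath
import shutil

try:
    from urllib.parse import urlsplit   # Python 3
except ImportError:
    from urlparse import urlsplit       # Python 2, as used in the question


def save_response(response, directory='.', default_name='download.xls'):
    """Save a file-like HTTP response to disk and return the local path.

    Hypothetical helper: `response` only needs read() and geturl().
    """
    # Use the last path segment of the final URL as the file name, e.g.
    # '.../pdftemp/ScheduleExcel165502.xls' -> 'ScheduleExcel165502.xls'.
    # Fall back to default_name if the URL path ends with a slash.
    url_path = urlsplit(response.geturl()).path
    name = posixpath.basename(url_path) or default_name
    path = os.path.join(directory, name)

    # 'wb' matters: an .xls file is binary, and text mode on Windows
    # would translate newline bytes and corrupt it.
    with open(path, 'wb') as out:
        # Copies in chunks by calling response.read(n) repeatedly.
        shutil.copyfileobj(response, out)
    return path
```

With the browser from the question, this would presumably be called right after the submit, for example `saved = save_response(br.response())`. Comparing `os.path.getsize(saved)` with the `Content-Length` header (691200 in the output above) is a quick sanity check that the whole file arrived.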

Regarding "python mechanize - retrieve a file from an aspnetForm submitControl that triggers a file download", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12629470/
