
javascript - Python: getting a download link from a JavaScript button

Reposted. Author: 行者123. Updated: 2023-11-28 23:03:51

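I'm trying to get my script to download subtitles from www.subscene.com. The problem is that the download button on the page is implemented in JavaScript, and for some reason I can't download the subtitle even after extracting the URL.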

I think this is the code for the download button:

<a id="s_lc_bcr_downloadLink" class="downloadLink rating0" href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;s$lc$bcr$downloadLink&quot;, &quot;&quot;, true, &quot;&quot;, &quot;/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx&quot;, false, true))">Download English Subtitle</a><a id="s_lc_bcr_previewLink" href="javascript:togglePreview(482407, 'zip');">(See preview)</a>

So I extract the URL and tell my script to download it:

urllib.urlretrieve('http://subscene.com/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx','c:\\sub.zip')

(with "http://subscene.com" prepended)

But for some reason it doesn't download the right file. What should I do?

Edit:

Thanks a lot! Unfortunately I couldn't get it to work :( It reported the following:

from selenium import webdriver

browser = webdriver.Firefox()
browser.execute_script('WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))')

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    browser.execute_script('WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))')
  File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\webdriver.py", line 385, in execute_script
    {'script': script, 'args':converted_args})['value']
  File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\webdriver.py", line 153, in execute
    self.error_handler.check_response(response)
  File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\errorhandler.py", line 126, in check_response
    raise exception_class(message, screen, stacktrace)
WebDriverException: Message: ''

Best answer

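As John said, that href is not a file but JavaScript code. So instead of fetching a file with urllib.urlretrieve, you can execute the JavaScript, which in turn downloads the file. This can be done with the selenium module: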

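from selenium import webdriver

browser = webdriver.Firefox()
# Load the subtitle page first so that its JavaScript functions are defined
browser.get('http://subscene.com/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407.aspx')
# Run the same call the download link's href would trigger
browser.execute_script('WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))')
# Python 2: wait for Enter so the browser stays open while the download runs
raw_input()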
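
As background on why running the script can work where the plain urlretrieve GET did not: in ASP.NET WebForms, WebForm_DoPostBackWithOptions with an action URL points the page's form at that URL and submits it as a POST, with the hidden fields __EVENTTARGET and __EVENTARGUMENT filled in. The sketch below (Python 3, standard library only) shows roughly what such a request body contains. The sample HTML and its __VIEWSTATE value are made up, build_postback_body is a hypothetical helper, and nothing in this thread confirms that subscene.com would accept a request built this way.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type") == "hidden" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_postback_body(page_html, event_target, event_argument=""):
    """Hypothetical helper: URL-encode the form fields a postback would submit."""
    collector = HiddenInputCollector()
    collector.feed(page_html)
    fields = dict(collector.fields)
    # These two fields tell the server which control triggered the postback.
    fields["__EVENTTARGET"] = event_target
    fields["__EVENTARGUMENT"] = event_argument
    return urlencode(fields)


# Made-up page fragment, for illustration only.
sample_html = """
<form method="post" action="subtitle-482407.aspx">
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" value="abc123" />
</form>
"""

body = build_postback_body(sample_html, "s$lc$bcr$downloadLink")
print(body)
```

The body would then be sent as a POST to the action URL (the fifth argument of WebForm_PostBackOptions). Driving a real browser with selenium, as in the answer, avoids having to reproduce this by hand.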

I got the JavaScript snippet passed to execute_script using Firebug.
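
The same snippet can probably also be recovered from the page source without Firebug. Below is a minimal sketch (Python 3, standard library only) that reads the href of the link with id s_lc_bcr_downloadLink, the id shown in the question, and strips the javascript: prefix. LinkHrefFinder and extract_link_script are hypothetical names, and the sketch assumes the anchor markup looks like the one quoted in the question.

```python
from html.parser import HTMLParser


class LinkHrefFinder(HTMLParser):
    """Remember the href of the first <a> element with a given id."""

    def __init__(self, link_id):
        super().__init__()
        self.link_id = link_id
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("id") == self.link_id and self.href is None:
            # HTMLParser has already turned &quot; into plain double quotes.
            self.href = attrs.get("href")


def extract_link_script(page_html, link_id):
    """Hypothetical helper: return the JavaScript in a javascript: href, or None."""
    finder = LinkHrefFinder(link_id)
    finder.feed(page_html)
    prefix = "javascript:"
    if finder.href is None or not finder.href.startswith(prefix):
        return None
    return finder.href[len(prefix):]


# The anchor quoted in the question.
anchor_html = (
    '<a id="s_lc_bcr_downloadLink" class="downloadLink rating0" '
    'href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions('
    '&quot;s$lc$bcr$downloadLink&quot;, &quot;&quot;, true, &quot;&quot;, '
    '&quot;/english/How-I-Met-Your-Mother-Seventh-Season/'
    'subtitle-482407-dlpath-90698/zip.zipx&quot;, false, true))">'
    'Download English Subtitle</a>'
)

script = extract_link_script(anchor_html, "s_lc_bcr_downloadLink")
print(script)
```

The resulting string is what would be handed to browser.execute_script after the page has been loaded with browser.get.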

Regarding "javascript - Python: getting a download link from a JavaScript button", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/8288202/
