
python - Downloading *.mp4 files with Python

Reposted. Author: 行者123. Updated: 2023-11-28 21:56:54

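I am trying to download and save lecture videos from a website. The files download without errors, but they will not play in my media player. This is the code I am using: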

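from bs4 import BeautifulSoup
import re
import urllib2

# Parse the saved page source and collect the target of every link.
snippet = open('Python/SNA Page Source Revised.txt', 'r')
soup = BeautifulSoup(snippet)

# href=True skips anchors without an href, which would otherwise yield None.
links = [link.get('href') for link in soup.find_all('a', href=True)]

videos = []

# Keep only the links that mention mp4.
for link in links:
    match = re.search('.*mp4.*', link)
    if match:
        videos.append(link)

vidNum = 1

# Fetch each link and write the response body to a numbered file.
for video in videos:
    f = urllib2.urlopen(video)
    with open('Data Analysis/Social Network Analysis/Video ' + str(vidNum) + '.mp4', 'wb') as code:
        code.write(f.read())
    vidNum += 1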

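Everything seems to work, but when I try to play one of the videos I get this error: "Python (v2.7) requires a plugin to be installed to play media files of the following type: text/html decoder". Also, if I download a video manually from the website, the file is about 22.8 MB, but when I use my script the file is only 7.8 kB.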
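The "text/html" in the error message and the 7.8 kB size both suggest that the saved file is a web page rather than a video. A minimal sketch for checking this on a saved file follows. The function names `classify_bytes` and `describe_download` are hypothetical, and the `ftyp` test is only a heuristic based on the way MP4 files commonly begin, so treat the result as a hint rather than a verdict.

```python
def classify_bytes(head):
    """Guess 'html', 'mp4', or 'unknown' from the first bytes of a file."""
    stripped = head.lstrip().lower()
    # An HTML page normally starts with a doctype or an <html> tag.
    if stripped.startswith(b"<!doctype html") or stripped.startswith(b"<html"):
        return "html"
    # Heuristic: MP4 files commonly carry the marker 'ftyp' at bytes 4..8.
    if len(head) >= 8 and head[4:8] == b"ftyp":
        return "mp4"
    return "unknown"


def describe_download(path):
    """Read the start of a saved file and classify it."""
    with open(path, "rb") as f:
        return classify_bytes(f.read(512))
```

Running `describe_download` on one of the 7.8 kB files would be expected to report `html` if the server answered with a page instead of the video.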

Is there something wrong with the way I am downloading the files? Any help would be appreciated.

Also: I am running Python v2.7 on Ubuntu 12.04 LTS.

****EDIT****

Here is the code I am using now, based on the replies I received:

import requests

r = requests.get('https://class.coursera.org/sna-003/lecture/download.mp4?lecture_id=2', auth=('myUsername', 'myPassword'))

with open('Data Analysis/TestFile.mp4', 'wb') as fd:
    fd.write(r.content)

Here is the output of r.content:

<!DOCTYPE html>
<html itemtype="http://schema.org" xmlns:fb="http://ogp.me/ns/fb#"><head><meta content="IE=Edge,chrome=IE7" http-equiv="X-UA-Compatible"/><meta content="!" name="fragment"/><meta content="NOODP" name="robots"/><meta charset="utf-8"/><meta content="Coursera" property="og:title"/><meta content="website" property="og:type"/><meta content="http://s3.amazonaws.com/coursera/media/Coursera_Computer_Narrow.png" property="og:image"/><meta content="https://www.coursera.org/" property="og:url"/><meta content="Coursera" property="og:site_name"/><meta content="en_US" property="og:locale"/><meta content="Take free online classes from 80+ top universities and organizations. Coursera is a social entrepreneurship company partnering with Stanford University, Yale University, Princeton University and others around the world to offer courses online for anyone to take, for free. We believe in connecting people to a great education so that anyone around the world can learn without limits." property="og:description"/><meta content="727836538,4807654" property="fb:admins"/><meta content="274998519252278" property="fb:app_id"/><meta content="Take free online classes from 80+ top universities and organizations. Coursera is a social entrepreneurship company partnering with Stanford University, Yale University, Princeton University and others around the world to offer courses online for anyone to take, for free. We believe in connecting people to a great education so that anyone around the world can learn without limits." name="description"/><meta content="http://s3.amazonaws.com/coursera/media/Coursera_Computer_Narrow.png" name="image"/><meta content="app-id=736535961" name="apple-itunes-app"/><script>window.onerror = function(message, url, lineNum) {

// First check the URL and line number of the error
url = url || window.location.href;
// 99% of the time, errors without line numbers arent due to our code,
// they are due to third party plugins and browser extensions
if (lineNum === undefined || lineNum == null) return;

// Now figure out the actual error message
// If it's an event, as triggered in several browsers
if (message.target &amp;&amp; message.type) {
message = message.type;
}
if (!message.indexOf) {
message = 'Non-string, non-event error: ' + (typeof message);
}

var errorDescrip = {
message: message,
script: url,
line: lineNum,
url: document.URL
}

var err = {
key: 'page.error.javascript',
value: errorDescrip
}

window._204 = window._204 || [];
window._204.push(err);

window._gaq = window._gaq || [];
window._gaq.push(err);
}</script><title>Coursera.org</title><link href="https://d1rlkby5e91r2j.cloudfront.net/e47434615f57601f9b9ccaf255a589e8550d328d/css/home.css" rel="stylesheet" type="text/css"/><link href="https://d1rlkby5e91r2j.cloudfront.net/e47434615f57601f9b9ccaf255a589e8550d328d/pages/auth/css/auth.css" rel="stylesheet" type="text/css"/><script data-baseurl="https://d1rlkby5e91r2j.cloudfront.net/e47434615f57601f9b9ccaf255a589e8550d328d/" id="_mobile">(function(el) {
// Override certian behaviour if the page is for our mobile app.
// TODO(priya) Remove this conditional behaviour once I want to push this behaviour
// for regular authentication pages on mobile/smaller screens as well.
// Currently I'm keeping existing behaviour same and only adding mobile specific
// layouts ot /mobilesignup page (which is what isMobileApp = true signifies).
if ("false" == "true") {
var head = document.getElementsByTagName('head')[0];
// Add viewport meta tag
var viewport = document.querySelector('meta[name=viewport]');
var viewportContent = 'width=device-width, initial-scale=1.0, user-scalable=no';
if (!viewport) {
viewport = document.createElement('meta');
viewport.setAttribute('name', 'viewport');
head.appendChild(viewport);
}
viewport.setAttribute('content', viewportContent);

// Add responsive css
var link = document.createElement('link');
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = el.getAttribute("data-baseurl") + "pages/auth/css/auth_responsive.css";
head.appendChild(link);
}
})(document.getElementById("_mobile"));
</script></head><body><div id="fb-root"></div><div id="origami"><div style="position:absolute;top:0px;left:0px;width:100%;height:100%;background:#f5f5f5;padding-top:5%;"><div id="coursera-loading-nojs" style="text-align:center; margin-bottom:10px;display:none;">Please use a <a href="/browsers">modern browser </a> with JavaScript enabled to use Coursera.</div><div><span id="coursera-loading-js" style="display: none; padding-left:45%">loading   <img src="https://d2wvvaown1ul17.cloudfront.net/site-static/images/icons/loading.gif"/></span></div><noscript><div style="text-align:center; margin-bottom:10px;">Please use a <a href="/browsers">modern browser </a> with JavaScript enabled to use Coursera.</div></noscript></div></div><!--[if gte IE 8]&gt;&lt;script&gt;document.getElementById("coursera-loading-js").style.display = 'block';&lt;/script&gt;&lt;![endif]-->
<!--[if lte IE 7]&gt;&lt;script&gt;document.getElementById("coursera-loading-nojs").style.display = 'block';
window._204 = window._204 || [];
window._gaq = window._gaq || [];

window._gaq.push(
['_setAccount', 'UA-28377374-1'],
['_setDomainName', window.location.hostname],
['_setAllowLinker', true],
['_trackPageview', window.location.pathname]);

window._204.push(
['client', 'home'],
{key:"pageview", value:window.location.pathname});
&lt;/script&gt;&lt;script src="https://eventing.coursera.org/204.min.js"&gt;&lt;/script&gt;&lt;script src="https://ssl.google-analytics.com/ga.js"&gt;&lt;/script&gt;&lt;![endif]-->
<!--[if !IE]&gt; --><script>document.getElementById("coursera-loading-js").style.display = 'block';</script><!-- &lt;![endif]--><script src="https://d1rlkby5e91r2j.cloudfront.net/e47434615f57601f9b9ccaf255a589e8550d328d/js/core/require.js" type="text/javascript"></script><script data-baseurl="https://d1rlkby5e91r2j.cloudfront.net/e47434615f57601f9b9ccaf255a589e8550d328d/" data-debug="0" data-locale="" data-timestamp="1386838999742" data-version="e47434615f57601f9b9ccaf255a589e8550d328d" id="_require" type="text/javascript">if(document.getElementById("coursera-loading-js").style.display == 'block') {
(function(el) {
// prevent throw
require.onError = function(err) {
window._204 = window._204 || [];
window._204.push({key: 'requireErr', value: err});
};

define("pages/auth/authConfig",
function() {
return {"coursera_url": "https://www.coursera.org/",
"environment": "production"};
}
);

require.config({
enforceDefine: false,
waitSeconds: 14,
baseUrl: el.getAttribute("data-baseurl"),
urlArgs: el.getAttribute("data-debug") == "1" ? "v=" + el.getAttribute("data-timestamp") : "",
shim: {
"underscore": {
exports: '_'
},
"backbone": {
deps: ['underscore', 'jquery'],
exports: 'Backbone'
}
},
paths: {
"jquery": "js/core/jquery",
"underscore": "js/core/underscore",
"backbone": "js/core/backbone",
"i18n": "js/core/i18n._t"
},
callback: function() {
require(["pages/auth/routes"]); // bootup coursera
},
config: {
i18n: {
locale: (window.localStorage ? localStorage.getItem("locale") : '') || el.getAttribute("data-locale")
}
}
});
})(document.getElementById("_require"));
}</script><script type="text/javascript">define("pages/home/models/user.json", [], function(){
return null;
});
</script></body></html>

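I find this odd, because it looks like the page source of the website, yet when I check r.url I get a real URL that loads in a browser and prompts me to save or view the video. Even when I pass in the new URL I got from it, which I think contains my cookie information, I still get the same content. I don't understand where I am going wrong.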
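One possible explanation, offered as an assumption rather than a confirmed diagnosis: the returned HTML loads pages/auth/css/auth.css and pages/auth/routes, which looks like a sign-in page, and the `auth=(user, password)` argument in requests sends HTTP Basic credentials, which a site with a form-based, cookie-backed login would likely ignore. The sketch below only shows roughly what a Basic auth header contains; `basic_auth_header` is a hypothetical helper, not part of requests.

```python
import base64


def basic_auth_header(username, password):
    """Build an Authorization header value for HTTP Basic authentication."""
    # Basic auth is just "username:password", base64-encoded; no cookie is involved.
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


header = basic_auth_header("myUsername", "myPassword")
```

If the server does not check this header, the request is treated as anonymous, which would be consistent with getting the same page back each time.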

Best answer

First, download and install the requests package.

Then use this code:

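import requests

def downloadfile(name, url):
    name = name + ".mp4"
    # Pass the url variable (not the literal string 'url'); stream=True
    # fetches the body in chunks instead of loading it all into memory.
    r = requests.get(url, stream=True)
    print("****Connected****")
    with open(name, 'wb') as f:
        print("Downloading.....")
        for chunk in r.iter_content(chunk_size=255):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
    print("Done")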
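The chunked approach in the answer can be made a little more defensive by refusing to save a response that the server labels as HTML, which is the failure described in the question. This is a sketch under assumptions: `save_stream` is a hypothetical helper, it expects an object shaped like a requests response (a `headers` mapping and an `iter_content` method), and the 64 KiB chunk size is an arbitrary choice.

```python
def save_stream(response, dest_path, chunk_size=64 * 1024):
    """Write a streamed response to dest_path and return the bytes written.

    Raises ValueError before touching the file if the server sent HTML.
    """
    content_type = response.headers.get("Content-Type", "")
    if content_type.lower().startswith("text/html"):
        raise ValueError("expected a video but got %s" % content_type)
    written = 0
    with open(dest_path, "wb") as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                out.write(chunk)
                written += len(chunk)
    return written


# Assumed usage with the requests package (not executed here):
#   import requests
#   with requests.get(url, stream=True) as r:
#       r.raise_for_status()
#       save_stream(r, "lecture.mp4")
```

Checking the header first means a login page is never written to disk under an .mp4 name, so the problem shows up as an exception instead of an unplayable file.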

Regarding python - Downloading *.mp4 files with Python, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/20723538/
