
python - How to get only the mp3 links using BeautifulSoup and Python

Reposted. Author: 行者123. Updated: 2023-11-28 21:18:59

Here is my code:

from bs4 import BeautifulSoup
import urllib.request
import re

url = urllib.request.urlopen("http://www.djmaza.info/Abhi-Toh-Party-Khubsoorat-Full-Song-MP3-2014-Singles.html")
content = url.read()
soup = BeautifulSoup(content)
for a in soup.findAll('a',href=True):
    if re.findall('http',a['href']):
        print ("URL:", a['href'])

Output of this code:

URL: http://twitter.com/mp3khan
URL: http://www.facebook.com/pages/MP3KhanCom-Music-Updates/233163530138863
URL: https://plus.google.com/114136514767143493258/posts
URL: http://www.djhungama.com
URL: http://www.djhungama.com
URL: http://songs.djmazadownload.com/music/Singles/Abhi Toh Party (Khoobsurat) -190Kbps [DJMaza.Info].mp3
URL: http://songs.djmazadownload.com/music/Singles/Abhi Toh Party (Khoobsurat) -190Kbps [DJMaza.Info].mp3
URL: http://songs.djmazadownload.com/music/Singles/Abhi Toh Party (Khoobsurat) -320Kbps [DJMaza.Info].mp3
URL: http://songs.djmazadownload.com/music/Singles/Abhi Toh Party (Khoobsurat) -320Kbps [DJMaza.Info].mp3
URL: http://www.htmlcommentbox.com
URL: http://www.djmaza.com
URL: http://www.djhungama.com

I only need the .mp3 links.

So how should I rewrite the code?

Thanks

Best answer

Change your findAll call to match the href against a regular expression, for example:

for a in soup.findAll('a',href=re.compile(r'http.*\.mp3')):
    print ("URL:", a['href'])
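
To try the filter without hitting the network, it can be run against an inline HTML snippet. This is only a minimal sketch: the markup and the example.com URLs are made up for illustration, and the pattern is anchored with `^` and `$`, which is an addition to the pattern in the answer, so that only absolute hrefs ending in .mp3 match.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page.
html = """
<a href="http://twitter.com/example">Twitter</a>
<a href="http://example.com/music/Song -190Kbps.mp3">190 Kbps</a>
<a href="http://example.com/music/Song -320Kbps.mp3">320 Kbps</a>
<a href="/relative/track.mp3">relative link</a>
<a name="no-href">anchor without href</a>
"""

soup = BeautifulSoup(html, "html.parser")

# A compiled pattern used as an attribute filter is applied with search(),
# so "^" and "$" are needed to pin both ends of the href value.
mp3_pattern = re.compile(r"^https?://.*\.mp3$")

# Tags without an href, or with a non-matching href, are skipped.
mp3_links = [a["href"] for a in soup.find_all("a", href=mp3_pattern)]
for link in mp3_links:
    print("URL:", link)
```

Without the anchors, an href that merely contains ".mp3" somewhere after "http" would also match, which may or may not be what you want.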

Update in response to a comment:

I need to store those links in an array for downloading. How can I do that?

You can use a list comprehension to build a list instead:

links = [a['href'] for a in soup.find_all('a',href=re.compile(r'http.*\.mp3'))]
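
Since the output in the question lists each .mp3 link twice and the URLs contain spaces and brackets, the list probably needs a little cleanup before downloading. The sketch below is one possible way to do that, not part of the original answer: `encode_url` and `local_name` are hypothetical helper names, the sample values are copied from the question's output, and it assumes the hrefs are not already percent-encoded.

```python
import posixpath
from urllib.parse import quote, urlsplit, urlunsplit

# Sample values copied from the question's output (each link appears twice).
links = [
    "http://songs.djmazadownload.com/music/Singles/Abhi Toh Party (Khoobsurat) -190Kbps [DJMaza.Info].mp3",
    "http://songs.djmazadownload.com/music/Singles/Abhi Toh Party (Khoobsurat) -190Kbps [DJMaza.Info].mp3",
    "http://songs.djmazadownload.com/music/Singles/Abhi Toh Party (Khoobsurat) -320Kbps [DJMaza.Info].mp3",
    "http://songs.djmazadownload.com/music/Singles/Abhi Toh Party (Khoobsurat) -320Kbps [DJMaza.Info].mp3",
]

# dict.fromkeys keeps the first occurrence of each link, in original order.
unique_links = list(dict.fromkeys(links))


def encode_url(url):
    """Percent-encode the path so spaces and brackets are safe to request."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=quote(parts.path)))


def local_name(url):
    """Use the last path segment as the file name to save under."""
    return posixpath.basename(urlsplit(url).path)


for url in unique_links:
    print(local_name(url), "<-", encode_url(url))
    # To actually fetch the file, something like
    # urllib.request.urlretrieve(encode_url(url), local_name(url))
    # could be called here.
```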

Regarding python - How to get only the mp3 links using BeautifulSoup and Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/25564838/
