python - How to find all links in all paragraphs in Beautiful Soup

Reposted. Author: 行者123. Updated: 2023-11-28 22:49:49

Suppose you enter this into the Python interpreter:

from urllib import request
from bs4 import BeautifulSoup
soup = BeautifulSoup(request.urlopen("http://en.wikipedia.org/wiki/Python_(programming_language)").read())
a = soup.find_all('p')
b = a.find_all('href')

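I expected b to be a list of all the links in the paragraphs, but instead it raises an AttributeError, saying that a is a "ResultSet" and has no attribute "find_all". How can I find all the links inside paragraphs with BeautifulSoup?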

Best answer

soup.find_all('p') returns a list (a ResultSet); you have to loop over it to find the links in each resulting paragraph tag.
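
As a minimal sketch of that loop, the example below parses a small hypothetical HTML string (not the Wikipedia page from the question) with the built-in "html.parser" backend, which is an assumption made here so the example runs offline. Note that href is an attribute rather than a tag name, so the inner search looks for <a> tags and filters on href=True.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used here instead of fetching a live page.
html = """
<div><a href="/outside">not in a paragraph</a></div>
<p>First <a href="/wiki/A">A</a> and <a href="/wiki/B">B</a>.</p>
<p>Second <a name="anchor">no href</a> and <a href="#note-1">note</a>.</p>
"""

soup = BeautifulSoup(html, "html.parser")

links = []
for paragraph in soup.find_all("p"):
    # find_all exists on each individual Tag, just not on the ResultSet.
    for anchor in paragraph.find_all("a", href=True):
        links.append(anchor["href"])

print(links)  # ['/wiki/A', '/wiki/B', '#note-1']
```

The link outside any paragraph and the <a> tag without an href are both skipped.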

However, it is easier to use a CSS selector, which searches for all the links in a single operation:

all_links = soup.select('p a[href]')

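This finds all <a> tags inside <p> tags and restricts the search to only those that have an href attribute. You can extract the links with a list comprehension: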

all_links = [tag['href'] for tag in soup.select('p a[href]')]

This produces a list containing only the links.
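
To see what the 'p a[href]' selector does and does not match, here is a small offline sketch. The HTML string is hypothetical, and the explicit "html.parser" argument is an assumption made so the example does not depend on an optional parser being installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup covering the cases the selector distinguishes.
html = """
<p>Intro <a href="/wiki/A">A</a> and <b><a href="/wiki/B">nested B</a></b>.</p>
<p><a id="top">anchor without href</a></p>
<ul><li><a href="/wiki/C">outside any paragraph</a></li></ul>
"""

soup = BeautifulSoup(html, "html.parser")

# 'p a[href]' matches <a> tags at any depth inside a <p>,
# but only those that carry an href attribute.
selected = [tag["href"] for tag in soup.select("p a[href]")]

print(selected)  # ['/wiki/A', '/wiki/B']
```

The space in the selector is a descendant combinator, so the link nested inside <b> still matches, while the link in the list item and the anchor without an href do not.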

Demo:

>>> from urllib import request
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(request.urlopen("http://en.wikipedia.org/wiki/Python_(programming_language)").read())
>>> [tag['href'] for tag in soup.select('p a[href]')]
['/wiki/General-purpose_programming_language', '/wiki/High-level_programming_language', '#cite_note-AutoNT-34-15', '#cite_note-16', '#cite_note-17', '/wiki/Readability', '/wiki/Lines_of_code', '/wiki/C_(programming_language)', '#cite_note-Summerfield-18', '#cite_note-19', '#cite_note-AutoNT-7-20', '/wiki/Programming_paradigm', '/wiki/Object-oriented_programming', '/wiki/Imperative_programming', '/wiki/Functional_programming', '/wiki/Procedural_programming', '/wiki/Dynamic_type', '/wiki/Memory_management', '/wiki/Standard_library', '#cite_note-About-21', '/wiki/Dynamic_language', '/wiki/Scripting_language', '/wiki/Py2exe', '/w/index.php?title=Pyinstaller&action=edit&redlink=1', '#cite_note-22', '/wiki/CPython', '/wiki/Reference_implementation', '/wiki/Free_and_open_source_software', '/wiki/Python_Software_Foundation', '#cite_note-venners-interview-pt-1-23', '#cite_note-timeline-of-python-24', '/wiki/Guido_van_Rossum', '/wiki/Centrum_Wiskunde_%26_Informatica', '/wiki/Netherlands', '/wiki/ABC_(programming_language)', '/wiki/SETL', '#cite_note-AutoNT-12-25', '/wiki/Exception_handling', '/wiki/Amoeba_(operating_system)', '#cite_note-faq-created-5', '/wiki/Benevolent_Dictator_for_Life', '/wiki/Garbage_collection_(computer_science)', '/wiki/Unicode', '#cite_note-newin-2.0-26', '#cite_note-3.0-release-27', '/wiki/Backporting', '#cite_note-pep-3000-28', '/wiki/Multi-paradigm_programming_language', '/wiki/Object-oriented_programming', '/wiki/Structured_programming', '/wiki/Functional_programming', '/wiki/Aspect-oriented_programming', '/wiki/Metaprogramming', '#cite_note-AutoNT-13-29', '/wiki/Metaobject', '#cite_note-AutoNT-14-30', '/wiki/Design_by_contract', '#cite_note-AutoNT-15-31', '#cite_note-AutoNT-16-32', '/wiki/Logic_programming', '#cite_note-AutoNT-17-33', '/wiki/Dynamic_typing', '/wiki/Reference_counting', '/wiki/Garbage_collection_(computer_science)', '/wiki/Memory_management', '/wiki/Name_resolution', '/wiki/Late_binding', '/wiki/Functional_programming', 
'/wiki/Lisp_(programming_language)', '/wiki/List_comprehension', '/wiki/Associative_array', '#cite_note-AutoNT-59-34', '/wiki/Haskell_(programming_language)', '/wiki/Standard_ML', '#cite_note-AutoNT-18-35', '/wiki/Aphorism', '#cite_note-PEP20-36', '/wiki/ABC_(programming_language)', '#cite_note-venners-interview-pt-1-23', '/wiki/Perl', '/wiki/Alex_Martelli', '#cite_note-AutoNT-19-37', '/wiki/There_is_more_than_one_way_to_do_it', '#cite_note-PEP20-36', '/wiki/Premature_optimization', '#cite_note-AutoNT-20-38', '/wiki/PyPy', '/wiki/Just-in-time_compilation', '/wiki/Cython', '/wiki/Monty_Python', '#cite_note-39', '/wiki/Foobar', '#cite_note-40', '#cite_note-41', '/wiki/Neologism', '#cite_note-AutoNT-27-42', '#cite_note-AutoNT-25-43', '/wiki/C_(programming_language)', '/wiki/Pascal_(programming_language)', '#cite_note-AutoNT-52-44', '/wiki/Whitespace_character', '/wiki/Curly_bracket_programming_language', '/wiki/Block_(programming)', '/wiki/Off-side_rule', '#cite_note-AutoNT-53-45', '#cite_note-46', '#cite_note-AutoNT-54-47', '/wiki/Tail_call', '/wiki/First-class_continuations', '#cite_note-AutoNT-55-49', '#cite_note-AutoNT-56-50', '/wiki/Coroutine', '/wiki/Generator_(computer_science)', '#cite_note-AutoNT-57-51', '/wiki/Lazy_evaluation', '/wiki/Iterator', '#cite_note-AutoNT-58-52', '/wiki/C_(programming_language)', '/wiki/Java_(programming_language)', '/wiki/Common_Lisp', '/wiki/Scheme_(programming_language)', '/wiki/Ruby_(programming_language)', '/wiki/Lambda_expressions', '/wiki/Method_(programming)', '/wiki/Function_(programming)', '/wiki/Syntactic_sugar', '/wiki/This_(computer_programming)', '/wiki/Instance_data', '/wiki/C%2B%2B', '/wiki/Java_(programming_language)', '/wiki/Objective-C', '/wiki/Ruby_(programming_language)', '#cite_note-AutoNT-61-54', '/wiki/Duck_typing', '/wiki/Compile_time', '/wiki/Dynamic_programming_language', '/wiki/Strongly_typed_programming_language', '/wiki/Class_(computer_science)', '/wiki/Object-oriented_programming', 
'/wiki/Object_(computer_science)', '/wiki/Metaclass', '/wiki/Metaprogramming', '/wiki/Reflection_(computer_science)', '#cite_note-classy-55', '#cite_note-pep0238-57', '/wiki/Half-open_interval', '#cite_note-AutoNT-62-58', '/wiki/Rounding', '#cite_note-AutoNT-63-59', '/wiki/Round_to_even', '#cite_note-AutoNT-64-60', '#cite_note-AutoNT-65-61', '/wiki/Wikipedia:Citing_sources', '/wiki/Standard_library', '#cite_note-AutoNT-86-62', '#cite_note-About-21', '/wiki/MIME', '/wiki/Hypertext_Transfer_Protocol', '/wiki/Graphical_user_interface', '/wiki/Relational_database', '#cite_note-AutoNT-88-63', '/wiki/Regular_expression', '/wiki/Unit_testing', '/wiki/Web_Server_Gateway_Interface', '#cite_note-AutoNT-89-64', '/wiki/Python_Package_Index', '/wiki/Command_line_interpreter', '/wiki/Command-line_interface', '/wiki/IDLE_(Python)', '/wiki/IPython', '/wiki/Python_IDE', '/wiki/Web_browser', '/wiki/Sage_(mathematics_software)', '/wiki/PythonAnywhere', '/wiki/CPython', '/wiki/C_(programming_language)', '/wiki/C89_(C_version)', '#cite_note-AutoNT-66-65', '/wiki/Bytecode', '#cite_note-AutoNT-67-66', '/wiki/Virtual_machine', '#cite_note-AutoNT-68-67', '/wiki/Microsoft_Windows', '/wiki/Unix-like', '#cite_note-AutoNT-69-68', '/wiki/PyPy', '#cite_note-AutoNT-70-69', '/wiki/Just-in-time_compilation', '#cite_note-AutoNT-71-70', '/wiki/Multi-core_processor', '/wiki/Software_transactional_memory', '#cite_note-AutoNT-72-71', '/wiki/Stackless_Python', '/wiki/Microthread', '#cite_note-AutoNT-73-72', '/wiki/Nokia', '/wiki/Series_60', '/wiki/PyS60', '/wiki/Symbian', '/wiki/N900', '/wiki/GTK', '/wiki/Wikipedia:Citation_needed', '/wiki/Object_language', '#cite_note-PepCite000-74', '/wiki/BDFL', '#cite_note-PepCite000-74', '/wiki/Roundup_(issue_tracker)', '/wiki/Bug_tracker', '#cite_note-AutoNT-21-75', '/wiki/Self-hosted', '/wiki/Mercurial', '#cite_note-py_dev_guide-76', '/wiki/Beta_release', '/wiki/Unit_test', '/wiki/BuildBot', '/wiki/Continuous_integration', '#cite_note-AutoNT-23-79', 
'/wiki/Python_Package_Index', '/wiki/Academic_conference', '/wiki/PyCon', '/wiki/Pyladies', '/wiki/Monty_Python%27s_Flying_Circus', '#cite_note-AutoNT-24-80', '#cite_note-tutorial-chapter1-81', '/wiki/Metasyntactic_variable', '/wiki/Spam_(Monty_Python)', '/wiki/Foobar', '#cite_note-tutorial-chapter1-81', '#cite_note-AutoNT-26-82', '/wiki/Pygame', '/wiki/Language_binding', '/wiki/Simple_DirectMedia_Layer', '/wiki/PyS60', '/wiki/Symbian', '/wiki/S60_(software_platform)', '/wiki/PyQt', '/wiki/PyGTK', '/wiki/Qt_(framework)', '/wiki/GTK', '/wiki/PyPy', '/wiki/TIOBE_Programming_Community_Index', '#cite_note-AutoNT-34-15', '/wiki/Syntax_(programming_languages)', '/wiki/C_(programming_language)', '/wiki/C_syntax', '#cite_note-AutoNT-28-83', '/wiki/Google', '#cite_note-quotes-about-python-84', '/wiki/Yahoo!', '#cite_note-AutoNT-29-85', '/wiki/CERN', '#cite_note-AutoNT-30-86', '/wiki/NASA', '#cite_note-AutoNT-31-87', '/wiki/Industrial_Light_%26_Magic', '#cite_note-AutoNT-32-88', '/wiki/ITA_Software', '#cite_note-AutoNT-33-89', '/wiki/Scripting_language', '/wiki/Web_application', '/wiki/Mod_wsgi', '/wiki/Apache_web_server', '#cite_note-AutoNT-35-90', '/wiki/Web_Server_Gateway_Interface', '/wiki/Web_application_framework', '/wiki/Django_(web_framework)', '/wiki/Pylons_project', '/wiki/Pyramid_(web_framework)', '/wiki/TurboGears', '/wiki/Web2py', '/wiki/Tornado_(web_server)', '/wiki/Flask_(programming)', '/wiki/Zope', '/wiki/Pyjamas_(software)', '/wiki/IronPython', '/wiki/SQLAlchemy', '/wiki/Data_mapper_pattern', '/wiki/Twisted_(software)', '/wiki/Dropbox_(service)', '/wiki/NumPy', '/wiki/SciPy', '/wiki/Matplotlib', '/wiki/BioPython', '/wiki/Astropy', '/wiki/Sage_(mathematics_software)', '/wiki/Mathematical_software', '/wiki/Mathematics', '/wiki/Algebra', '/wiki/Combinatorics', '/wiki/Numerical_mathematics', '/wiki/Number_theory', '/wiki/Calculus', '/wiki/Finite_element_method', '/wiki/Abaqus', '/wiki/3ds_Max', '/wiki/Blender_(software)', '/wiki/Cinema_4D', '/wiki/Lightwave', 
'/wiki/Houdini_(software)', '/wiki/Maya_(software)', '/wiki/Modo_(software)', '/wiki/MotionBuilder', '/wiki/Softimage_XSI', '/wiki/Nuke_(software)', '/wiki/GIMP', '#cite_note-91', '/wiki/Inkscape', '/wiki/Scribus', '/wiki/Paint_Shop_Pro', '#cite_note-AutoNT-38-92', '/wiki/GNU_Debugger', '/wiki/Prettyprint', '/wiki/Esri', '/wiki/ArcGIS', '#cite_note-AutoNT-39-93', '#cite_note-AutoNT-40-94', '#cite_note-AutoNT-41-95', '/wiki/Programming_language', '/wiki/Google_App_Engine', '/wiki/Java_(software_platform)', '/wiki/Go_(programming_language)', '#cite_note-AutoNT-42-96', '/wiki/Artificial_intelligence', '#cite_note-AutoNT-43-97', '#cite_note-AutoNT-44-98', '#cite_note-AutoNT-45-99', '#cite_note-AutoNT-46-100', '/wiki/Natural_language_processing', '#cite_note-AutoNT-47-101', '/wiki/Linux_distribution', '/wiki/AmigaOS_4', '/wiki/FreeBSD', '/wiki/NetBSD', '/wiki/OpenBSD', '/wiki/OS_X', '/wiki/Ubuntu_(operating_system)', '/wiki/Ubiquity_(software)', '/wiki/Red_Hat_Linux', '/wiki/Fedora_(operating_system)', '/wiki/Anaconda_(installer)', '/wiki/Gentoo_Linux', '/wiki/Package_management_system', '/wiki/Portage_(software)', '/wiki/Pardus_(operating_system)', '#cite_note-AutoNT-48-102', '/wiki/Information_security', '#cite_note-AutoNT-49-103', '#cite_note-AutoNT-50-104', '/wiki/Sugar_(GUI)', '/wiki/One_Laptop_per_Child', '/wiki/Sugar_Labs', '#cite_note-AutoNT-51-105', '/wiki/Raspberry_Pi', '/wiki/Single-board_computer', '/wiki/LibreOffice', '#cite_note-106', '/wiki/Tcl', '#cite_note-AutoNT-99-115', '/wiki/Erlang_(programming_language)', '#cite_note-AutoNT-100-116', '/wiki/TIOBE_index', '#cite_note-AutoNT-101-117']

This question, python - How to find all links in all paragraphs in Beautiful Soup, corresponds to a similar question on Stack Overflow: https://stackoverflow.com/questions/23373471/
