
python - Using a unique CSS selector with BeautifulSoup

Reposted. Author: 行者123. Updated: 2023-12-01 03:41:45

From this page, I need to get the status of "Anbindung an das Telefonnetz".

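I identified two ways to determine it:

  1. Check whether the status contains the sentence "Das System arbeitet einwandfrei";
  2. Check whether the background color is green.

I chose the first option.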

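I am using Python/BeautifulSoup to scrape the page. The problem is that there is no unique id, class, or anything else that identifies this element.
So I decided to use the CSS selector for this specific element, which is: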

div.system-item:nth-child(2) > div:nth-child(1) > p:nth-child(3)

I use it like this:

print(page.select("div.system-item:nth-child(2) > div:nth-child(1) > p:nth-child(3)"))

However, the only result I get is an empty list ([]).

What can I try in order to get this specific element?

Edit
As some of you recommended, here is the (incomplete) HTML source of the page.
For practical purposes, though, I suggest you look at the page yourself.

<!doctype html>
<head>
<meta charset="utf-8">

<title>Aktueller Status | Placetel</title>

<meta http-equiv="X-UA-Compatible" content="IE=Edge">
<meta name="msvalidate.01" content="756F6E40DD887A659CE83E5A92FFBB62">
<meta name="viewport" content="width=device-width, initial-scale=1.0">

<meta name="generator" content="Kirby 2.3.2">

<meta name="description" content="Placetel Systemstatus: Erfahren Sie mehr &uuml;ber den aktuellen Status der Placetel Telefonanlage.">
<meta name="keywords" content="">

<meta name="robots" content="index,follow,noodp,noydir">

<link rel="canonical" href="https://www.placetel.de/status">
<link rel="publisher" href="https://plus.google.com/b/111027512373770716962/111027512373770716962/posts">

<link rel="shortcut icon" href="/favicon.ico">
<link rel="apple-touch-icon" href="/apple-touch-icon.png">
<meta name="msapplication-TileColor" content="#0e70b9">
<meta name="msapplication-TileImage" content="/ms-tile-icon.png">
<meta name="theme-color" content="#0e70b9">

<script src="//use.typekit.net/rnw8lad.js"></script>
<script>try { Typekit.load({ async: true }); } catch (e) {}</script>

<link rel="stylesheet" href="https://www.placetel.de/assets/dist/css/main.css"> <script src="https://www.placetel.de/assets/dist/js/modernizr.js"></script>
<link rel="dns-prefetch" href="//app.marketizator.com"/>
<script>
var _mktz = _mktz || [];
_mktz.cc_domain = 'placetel.de';
</script>
<script type="text/javascript" src="//d2tgfbvjf3q6hn.cloudfront.net/js/o17fe41.js"></script>
</head>
<body id="🚀" class="page page-template-page-sections page-uid-status">

<script>
var gaProperty = 'UA-17631409-3';
var disableStr = 'ga-disable-' + gaProperty;
if (document.cookie.indexOf(disableStr + '=true') > -1) {
window[disableStr] = true;
}
function gaOptout() {
document.cookie = disableStr + '=true; expires=Thu, 31 Dec 2099 23:59:59 UTC; path=/';
window[disableStr] = true;
}
</script>

<!-- Google Tag Manager -->
<noscript><iframe src="//www.googletagmanager.com/ns.html?id=GTM-KDNGCC"
height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript>
<script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'//www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-KDNGCC');</script>
<!-- End Google Tag Manager -->
<header class="header header-condensed" id="header">
<div class="container-fluid">

<nav class="navigation navigation-top">
<ul>
<li class=" ">
<a title="Unternehmen" href="https://www.placetel.de/unternehmen">

<span>Unternehmen</span>
</a>
</li>
<li class=" ">
<a title="Partner werden" href="https://www.placetel.de/partner">

<span>Partner werden</span>
</a>
</li>
<li class=" ">
<a title="Support" href="https://www.placetel.de/support">

<span>Support</span>
</a>
</li>
<li class=" ">
<a title="Suche" href="javascript:modal('search')">

<span>Suche</span>
</a>
</li>
<li class="navigation-top-support">
<a href="https://www.placetel.de/support">
<svg class="svg-phone"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-phone"></use></svg> <span>0221 29 191 999</span>
</a>
</li>
<li class="navigation-top-login">
<a href="https://app.placetel.de/account/login">
<span>Login</span>
</a>
</li>
</ul>
</nav> </div>

<div class="container-fluid">
<a class="site-logo" href="https://www.placetel.de">
<svg class="svg-placetel-logo"><title>Placetel</title> <use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-placetel-logo"></use></svg> </a>

<nav class="navigation navigation-main" id="navigation-main">
<ul>

<li class="has-sub-navigation">
<a title="Telefonanlage" href="https://www.placetel.de/telefonanlage"
class="">
<span>Telefonanlage</span>

<svg class="svg-arrow"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-arrow"></use></svg> </a>

<nav class="sub-navigation">
<ul>
<li class="">
<a href="https://www.placetel.de/telefonanlage">
Vorteile </a>
</li>
<li class="">
<a href="https://www.placetel.de/telefonanlage/preise">
Preise </a>
</li>
<li class="">
<a href="https://www.placetel.de/telefonanlage/funktionen">
Funktionen </a>
</li>
<li class="">
<a href="https://www.placetel.de/telefonanlage/unified-communication">
Unified Communication </a>
</li>
<li class="">
<a href="https://www.placetel.de/telefonanlage/funktionsweise">
Wie funktioniert es? </a>
</li>
<li class="">
<a href="https://www.placetel.de/telefonanlage/isdn-abschaltung">
ISDN-Abschaltung </a>
</li>
<li class="">
<a href="https://www.placetel.de/telefonanlage/faq">
FAQ </a>
</li>
</ul>
</nav>
</li>

<li class="">
<a title="Trunking" href="https://www.placetel.de/sip-trunking"
class="">
<span>Trunking</span>

</a>

</li>

<li class="">
<a title="Mobilfunk" href="https://www.placetel.de/mobilfunk"
class="">
<span>Mobilfunk</span>

</a>

</li>

<li class="navigation-main-shop">
<a title="Endger&auml;te-Shop" href="/shop/"
class="">
<span>Endger&auml;te-Shop</span>

</a>

</li>

<li class="visible-xs-block visible-sm-block">
<a title="Support" href="https://www.placetel.de/support"
class="">
<span>Support</span>

</a>

</li>

<li class="visible-xs-block visible-sm-block">
<a title="Partner" href="https://www.placetel.de/partner"
class="">
<span>Partner</span>

</a>

</li>

<li class="has-sub-navigation visible-xs-block visible-sm-block">
<a title="Unternehmen" href="https://www.placetel.de/unternehmen"
class="">
<span>Unternehmen</span>

<svg class="svg-arrow"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-arrow"></use></svg> </a>

<nav class="sub-navigation">
<ul>
<li class="">
<a href="https://www.placetel.de/unternehmen">
&Uuml;ber uns </a>
</li>
<li class="">
<a href="https://www.placetel.de/unternehmen/technologie">
Technologie </a>
</li>
<li class="">
<a href="https://www.placetel.de/unternehmen/jobs">
Jobs </a>
</li>
<li class="">
<a href="https://www.placetel.de/unternehmen/events">
Events </a>
</li>
<li class="">
<a href="https://www.placetel.de/unternehmen/presse">
Presse </a>
</li>
<li class="">
<a href="https://www.placetel.de/unternehmen/kontakt">
Kontakt </a>
</li>
</ul>
</nav>
</li>

<li class="navigation-main-register">
<a title="Kostenlos testen!" href="javascript:modal('register')"
class="btn">
<span>Kostenlos testen!</span>

</a>

</li>
</ul>
</nav>
<a class="site-navigation-toggle" id="hotdog">
<i>
<span></span>
</i> Menü
</a>
</div>
</header>


<section class="section section-full section-full-section-einleitung-text section-full-normal">
<div class="container-fluid typography typography-dark">
<h2 class="section-full-title">Der Placetel System Status</h2>

<h3 class="section-full-subtitle">Jeden Tag einen Grund zur Freude.</h3>

<p>Wir bei Placetel haben ein Lieblingswort: „läuft“. Der Grund: Ihre Placetel Telefonanlage funktioniert nämlich immer. Darüber freuen wir uns natürlich riesig. Da aber erst eine geteilte Freude eine richtige Freude ist, haben wir Ihnen diese Statusseite eingerichtet. Diese Seite informiert Sie jeden Tag über den einwandfreien Status Ihrer Anlage.<br />
Und falls etwas mal nicht so perfekt funktionieren sollte wie gewohnt, können Sie uns den Fehler gern melden.</p>
</div>

<style>
.section-full-section-einleitung-text {
background-color: ;
}
</style>

</section>

<section class="section section-system">
<a class="btn btn-primary btn-transparent btn-with-icon" href="javascript:location.reload();">
<svg class="svg-refresh"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-refresh"></use></svg> Status aktualisieren
</a>

<div class="system flex-grid typography typography-light">
<div class="system-item system-item-green flex-grid-item">
<div class="system-item-inner">
<h6>
System </h6>

<i>
<svg class="svg-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-included"></use></svg> <svg class="svg-dots"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-dots"></use></svg> <svg class="svg-not-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-not-included"></use></svg> </i>

<p>
Das System arbeitet einwandfrei<br>
11:10 Uhr
</p>

</div>
</div>

<div class="system-item system-item-green flex-grid-item">
<div class="system-item-inner">
<h6>
Anbindung an das Telefonnetz </h6>

<i>
<svg class="svg-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-included"></use></svg> <svg class="svg-dots"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-dots"></use></svg> <svg class="svg-not-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-not-included"></use></svg> </i>

<p>
Das System arbeitet einwandfrei<br>
11:10 Uhr
</p>

</div>
</div>

<div class="system-item system-item-green flex-grid-item">
<div class="system-item-inner">
<h6>
Faxsystem </h6>

<i>
<svg class="svg-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-included"></use></svg> <svg class="svg-dots"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-dots"></use></svg> <svg class="svg-not-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-not-included"></use></svg> </i>

<p>
Das System arbeitet einwandfrei<br>
11:10 Uhr
</p>

</div>
</div>

<div class="system-item system-item-green flex-grid-item">
<div class="system-item-inner">
<h6>
Konferenzsystem </h6>

<i>
<svg class="svg-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-included"></use></svg> <svg class="svg-dots"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-dots"></use></svg> <svg class="svg-not-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-not-included"></use></svg> </i>

<p>
Das System arbeitet einwandfrei<br>
11:10 Uhr
</p>

</div>
</div>

<div class="system-item system-item-green flex-grid-item">
<div class="system-item-inner">
<h6>
Features und Optionen </h6>

<i>
<svg class="svg-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-included"></use></svg> <svg class="svg-dots"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-dots"></use></svg> <svg class="svg-not-included"><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.placetel.de/assets/dist/sprites/svg/sprite.1471515912.svg#svg-not-included"></use></svg> </i>

<p>
Das System arbeitet einwandfrei<br>
11:10 Uhr
</p>

</div>
</div>
</div>
</section>

</body>
</html>

Best answer

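As far as I know, nth-child (the pseudo-class your selector relies on) is still not implemented in BeautifulSoup4. Also, if you look into the site's CSS (namely the _system.scss file), you will find that there are 3 possible states: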

  1. system-item-green
  2. system-item-yellow
  3. system-item-red
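A side note on the selector itself: the limitation described above applies to older BeautifulSoup4 releases. From version 4.7.0 onward, select() is handled by the Soup Sieve package, which does support :nth-child. The sketch below runs the selector from the question against a trimmed-down fragment of the status markup. The fragment is a hypothetical reduction of the page source shown above, and the sketch assumes a bs4 version of 4.7.0 or newer is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down fragment modelled on the status page markup.
HTML = """
<div class="system">
  <div class="system-item system-item-green">
    <div class="system-item-inner">
      <h6>System</h6>
      <i></i>
      <p>Das System arbeitet einwandfrei<br>11:10 Uhr</p>
    </div>
  </div>
  <div class="system-item system-item-green">
    <div class="system-item-inner">
      <h6>Anbindung an das Telefonnetz</h6>
      <i></i>
      <p>Das System arbeitet einwandfrei<br>11:10 Uhr</p>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# The selector from the question, unchanged.
SELECTOR = "div.system-item:nth-child(2) > div:nth-child(1) > p:nth-child(3)"
matches = soup.select(SELECTOR)

# Join the text nodes around the <br> with a single space.
status_text = matches[0].get_text(" ", strip=True) if matches else None
print(status_text)
```

If this still returns an empty list against the live page, the markup served to the script probably differs from what the browser shows, and relying on the class names is the more robust route.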

So you may want to change your code slightly, like this:

import requests
from bs4 import BeautifulSoup as BS

url = 'https://www.placetel.de/status'
# Send a browser-like User-Agent with the request.
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/31.0'
}
source = requests.get(url, headers=headers)
soup = BS(source.text, 'html.parser')

# The second div.system-item is "Anbindung an das Telefonnetz";
# its class list carries the status color.
status = soup.select("div.system-item")[1].attrs['class']

if 'system-item-green' in status:
    print("It works!")
elif 'system-item-yellow' in status:
    print("Something's slightly wrong")
elif 'system-item-red' in status:
    print("Does not work")
else:
    print("Has someone changed page's markup?")
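Indexing with [1] breaks as soon as the blocks on the page are reordered. As a possible variation (not part of the original answer), the block can be looked up by its h6 heading instead, returning both the color class and the status sentence that the question wanted to check. The helper name item_status and the sample fragment are hypothetical; the class names are the ones used above.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on one block of the status page.
SAMPLE = """
<div class="system">
  <div class="system-item system-item-green flex-grid-item">
    <div class="system-item-inner">
      <h6>
      Anbindung an das Telefonnetz </h6>
      <p>
      Das System arbeitet einwandfrei<br>
      11:10 Uhr
      </p>
    </div>
  </div>
</div>
"""


def item_status(soup, title):
    """Return (color, text) for the block whose <h6> equals title, else None."""
    for item in soup.select("div.system-item"):
        heading = item.find("h6")
        if heading is None or heading.get_text(strip=True) != title:
            continue
        classes = item.get("class", [])
        # Pick the first known color whose class is present on the block.
        color = next(
            (c for c in ("green", "yellow", "red")
             if "system-item-" + c in classes),
            None,
        )
        paragraph = item.find("p")
        text = paragraph.get_text(" ", strip=True) if paragraph else ""
        return color, text
    return None


sample_soup = BeautifulSoup(SAMPLE, "html.parser")
result = item_status(sample_soup, "Anbindung an das Telefonnetz")
print(result)
```

A return value of None means the heading was not found, which plays the same role as the final else branch above: the page markup has probably changed.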

Regarding python - Using a unique CSS selector with BeautifulSoup, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39508252/
