
python - My script fails to get the name of a food shop from a webpage

Reposted. Author: 行者123. Updated: 2023-11-28 18:06:14

I've written a Python script to scrape just the name of a food shop from a webpage. However, when I run the script, I get the following error.

name = soup.select_one("h1.listing-name").text
AttributeError: 'NoneType' object has no attribute 'text'

Address to that site

What I've tried so far:

from bs4 import BeautifulSoup
import requests

url = "https://www.yellowpages.com.au/sa/gawler/mega-health-gawler-14366108-listing.html"

with requests.Session() as s:
    s.headers["User-Agent"] = "Mozilla/5.0"
    response = s.get(url)
    soup = BeautifulSoup(response.text,"lxml")
    name = soup.select_one("h1.listing-name").text
    print(name)

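The content I want is not generated dynamically. Also, the selector I've used in the script is flawless. How can I print the name of that shop from the site?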

Best answer

I modified your script to see what it actually gets back from the server:

from bs4 import BeautifulSoup
import requests

url = "https://www.yellowpages.com.au/sa/gawler/mega-health-gawler-14366108-listing.html"

with requests.Session() as s:
    # Send a full browser-like User-Agent string.
    s.headers["User-Agent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"
    response = s.get(url)
    soup = BeautifulSoup(response.text, "lxml")
    if soup is not None:
        selected = soup.select_one("h1.listing-name")
        if selected is not None:
            # The selector matched, so .text is safe to read.
            name = selected.text
            print(name)
        else:
            # No match: dump the page to see what the server actually sent.
            print("Oh No!\n{}".format(soup))
    else:
        print("Ooops!\n{}".format(response))

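Then I ran it. The result was the CAPTCHA page below. You need to figure out how to get past the CAPTCHA; otherwise your script never sees the content and therefore cannot scrape it.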

    Oh No!
<!DOCTYPE html>
<html class="no-js" lang="en">
<head>
<meta content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no" name="viewport"/>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<title>Yellow Pages® | Data Protection</title>
<link href="/favicon.ico?v=2" rel="shortcut icon"/>
<!--[if (lt IE 9)&!(IEMobile)]><script src="/assets/ie/respond.sensis-9575467dfbc008e5b0d486dc4f481624.js" type="text/javascript" ></script><![endif]-->
<!--[if (lt IE 10)&!(IEMobile)]><script src="/assets/ie/custom-event-ie9.js" type="text/javascript"
></script><![endif]-->
<!--[if (lt IE 10)&!(IEMobile)]><link rel="stylesheet" href="/assets/ie/gradient-hacks-ie89-12453d23f1fec3d9d46e56cc6e023576.css"/><![endif]-->
<script async="" defer="" src="https://www.google.com/recaptcha/api.js?"></script>
<meta content="NOINDEX, NOFOLLOW" name="ROBOTS"/>
</head>
<body id="" style="border-width: 0;
background-color: #EDEDED;
font-size: 85%;
line-height: 1.3;
margin: 0;
font-family: Helvetica, sans-serif;">
<div style="padding: 10px 15px;
height: 70px;
min-height: 45px;
background-color: #ffce00;
background-image: linear-gradient(to right, #ffce00, #fedb55, #ffce00);
box-shadow: inset 0px -5px 7px -5px rgba(0, 0, 0, 0.35);">
<div style="position: relative;
max-width: 1240px;
margin: 0 auto;">
<a href="/">
<img alt="Yellow Pages" src="data:image/png;base64,[image data omitted]" style="width: 70px;"/>
</a>
</div>
</div>
<div style="padding-top: 10px;">
<div style="margin-left: auto;
margin-right: auto;
max-width: 600px;
vertical-align: top;">
<div style="background-color: #FFFFFF;
border-radius: 8px;
padding: 1px 10px;">
<h1 style="font-weight: normal;">We have detected unusual traffic activity originating from your IP address.</h1>
<div style="border-bottom: 1px #E7E7E7 solid;
margin-top: 20px;
margin-bottom: 20px;
height: 1px;
width: 100%;">
</div>
<div style="margin-left: auto;
margin-right: auto;
font-size: 20px;
max-width: 460px;
text-align: center;">
We value the quality of content provided to our customers, and to maintain this, we would like to ensure real humans are accessing our information.</div>
<div style="margin-left: auto;
margin-right: auto;
margin-top: 30px;
max-width: 305px;">
<form action="/dataprotection" method="post" name="captcha" style="margin: 0; padding: 0; word-wrap: break-word; display: block;">
<div class="g-recaptcha" data-sitekey="6LeukxwTAAAAANIgmFm7-cOKIY4avRNHiDB9xAD8"></div>
<noscript>
<div style="width: 302px; height: 352px;">
<div style="width: 302px; height: 352px; position: relative;">
<div style="width: 302px; height: 352px; position: absolute;">
<iframe frameborder="0" scrolling="no" src="https://www.google.com/recaptcha/api/fallback?k=6LeukxwTAAAAANIgmFm7-cOKIY4avRNHiDB9xAD8" style="width: 302px; height:352px; border-style: none;">
</iframe>
</div>
<div style="width: 250px; height: 80px; position: absolute; border-style: none;
bottom: 21px; left: 25px; margin: 0px; padding: 0px; right: 25px;">
<textarea class="g-recaptcha-response" id="g-recaptcha-response" name="g-recaptcha-response" style="width: 250px; height: 80px; border: 1px solid #c1c1c1;
margin: 0px; padding: 0px; resize: none;" value="">
</textarea>
</div>
</div>
</div>
</noscript>
<input name="path" type="hidden" value="/sa/gawler/mega-health-gawler-14366108-listing.html"/>
<div style="margin-left: auto;
margin-right: auto;
text-align: center;
padding: 15px 0;
max-width: 260px;
margin-top: 30px;">
<button class="submit" style="width: 100%;
color: black;
padding: 10px 25px;
border-radius: 25px;
cursor: pointer;
border: none;
position: relative;
background-color: #ffce00;
display: inline-block;
text-align: center;
box-sizing: border-box;">Submit</button>
</div>
</form>
</div>
<div style="border-bottom: 1px #E7E7E7 solid;
margin-top: 20px;
margin-bottom: 20px;
height: 1px;
width: 100%;"></div>
<p style="font-weight: bold;">Why did this happen?</p>
<p style="margin-top: 20px;">This page appears when online data protection services detect requests coming from your computer network which appear to be in violation of our website's terms of use.</p>
</div>
</div>
</div>
</body>
</html>

We have detected unusual traffic activity originating from your IP address. We value the quality of content provided to our customers, and to maintain this, we would like to ensure real humans are accessing our information.
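Given the block page above, it can help to recognise it explicitly instead of letting the selector fail with an AttributeError. The following is only a minimal sketch, not a way around the CAPTCHA: the function names `is_block_page` and `listing_name` are hypothetical, the markers it checks (the `g-recaptcha` div and the `/dataprotection` form) are simply the ones visible in the dump above, and `html.parser` is used instead of `lxml` purely to avoid the extra dependency.

```python
from typing import Optional

from bs4 import BeautifulSoup


def is_block_page(soup: BeautifulSoup) -> bool:
    """Heuristic: True if the page looks like the reCAPTCHA page shown above."""
    # The block page embeds a reCAPTCHA widget and posts to /dataprotection.
    has_widget = soup.select_one("div.g-recaptcha") is not None
    has_form = soup.select_one('form[action="/dataprotection"]') is not None
    return has_widget or has_form


def listing_name(html: str) -> Optional[str]:
    """Return the listing name, or None if the page is blocked or has no match."""
    # Assumption: html.parser is good enough here; the original script used lxml.
    soup = BeautifulSoup(html, "html.parser")
    if is_block_page(soup):
        return None
    selected = soup.select_one("h1.listing-name")
    return selected.get_text(strip=True) if selected is not None else None
```

With the script above, this could be called as `listing_name(response.text)`; a `None` result then means "blocked or selector did not match" and can be handled without a crash.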

I think the ethical thing to do is to work with the site's administrators, or at least ask for permission.
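One small, programmatic step in that direction is to read the site's robots.txt before fetching anything. The sketch below uses the standard library's `urllib.robotparser`; the helper name `allowed_by_robots`, the user agent `my-scraper`, and the robots.txt content are all hypothetical and are not taken from yellowpages.com.au. Note that robots.txt is not the same thing as permission under a site's terms of use, so this complements asking rather than replacing it.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, NOT the real file of any particular site.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

# The first URL falls under the Disallow rule, the second does not.
print(allowed_by_robots(EXAMPLE_ROBOTS, "my-scraper", "https://example.com/private/page.html"))
print(allowed_by_robots(EXAMPLE_ROBOTS, "my-scraper", "https://example.com/public/page.html"))
```

For a live site, `RobotFileParser.set_url(...)` followed by `read()` would fetch the real file instead of parsing a string.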

Regarding "python - My script fails to get the name of a food shop from a webpage", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53269763/
