gpt4 book ai didi

selenium - 拒绝加载脚本,因为它违反了以下内容安全策略指令 : script-src error with ChromeDriver Chrome and Selenium

转载 作者:行者123 更新时间:2023-12-04 15:37:41 33 4
gpt4 key购买 nike

我正在尝试从这些链接“https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421”和“https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154”中抓取电话号码

如果元素存在,它会抓取电话号码,否则电话号码为 None

蜘蛛代码:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

options = webdriver.ChromeOptions()
options.add_argument('headless')
options.add_argument('window-size=1200x600')

driver = webdriver.Chrome(chrome_options=options)

driver.get('https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421')

WebDriverWait(driver, 10).until(
EC.presence_of_element_located((By.XPATH, "//p[@data-a-target='carousel-broadcaster-displayname']"))
)
try:
next1 = driver.find_element_by_xpath('//*[@class="c-btn--light c-btn--center"]')
next1.click()

next2 = driver.find_element_by_xpath('//*[@class="u-title-font icon-ic_call_filled u-valign--middle"]')
next2.click()
phone_number = driver.find_element_by_class_name('c-vn__number').get_attribute('innerHTML')
except NoSuchElementException:
phone_number = None

print(phone_number)

输出

DevTools listening on ws://127.0.0.1:60482/devtools/browser/9f226a40-2d1a-4108-9fde-f005b49e60b3
[1206/102937.475:INFO:CONSOLE(0)] "[Report Only] Refused to load the script
'https://www.googletagmanager.com/gtag/js?id=AW-942004674' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-3RJz12sDPuoV27qS7dcBXLRZawmPobLo' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". 'strict-dynamic' is present, so host-based whitelisting is disabled. Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.

", source: https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 (0)
[1206/125829.645:INFO:CONSOLE(33)] "[Report Only] Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-eNRfqc27QHPklLLhavu92zuUGDeEoSZL' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". Note that 'unsafe-inline' is ignored if either a hash or nonce value is present in the source list.
", source: https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 (33)
[1206/125829.829:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)
[1206/125830.508:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)

最佳答案

这个错误信息...

[1206/102937.475:INFO:CONSOLE(0)] "[Report Only] Refused to load the script 
'https://www.googletagmanager.com/gtag/js?id=AW-942004674' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-3RJz12sDPuoV27qS7dcBXLRZawmPobLo' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". 'strict-dynamic' is present, so host-based whitelisting is disabled. Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.
.
[1206/125830.508:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)

...暗示 ChromeDriver 无法启动/生成新的浏览上下文,即 Chrome 浏览器 session 。


内容安全策略 (CSP)

为了缓解跨站点脚本问题,Chrome 的扩展系统实现了 Content Security Policy (CSP) 的概念。它引入了一些严格的政策,默认情况下使扩展程序更安全,并使我们能够创建和执行规则来管理您的扩展程序和应用程序可以加载和执行的内容类型。 CSP 作为您的扩展程序加载或执行的资源的阻止/白名单机制。为您的扩展程序定义合理的策略使您能够考虑您的扩展程序需要的资源并与浏览器协商以确保这些是您的扩展程序可以访问的唯一资源。这些策略提供的安全性甚至高于 host permissions您的延期请求充当额外的保护层。此类策略是通过 HTTP header 或元元素定义的。在 Chrome 的扩展系统中,扩展的策略是通过扩展的 manifest.json 定义的。文件如下:

{
"content_security_policy": "[POLICY STRING GOES HERE]"
}

放宽内容安全政策

在 Chrome 45 之前,没有机制可以放宽对执行内联 JavaScript 的限制。特别是,设置包含“不安全内联”的脚本策略将无效。但是,从 Chrome 46 开始,可以通过在策略中指定源代码的 base64 编码哈希来允许内联脚本。此散列必须以使用的散列算法(sha256、sha384 或 sha512)为前缀。这可以通过设置将 http://* 添加到 style-src 和/或 script-src 来实现> 如下:

script-src 'self' http://xxxx 'unsafe-inline' 'unsafe-eval'; 

和/或

style-src 'self' http://xxxx 'unsafe-inline' 'unsafe-eval';

这个用例

但是我能够访问网页 https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154简单如下:

  • 代码块:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    options = webdriver.ChromeOptions()
    options.add_argument('window-size=1200x600')
    options.add_argument('--headless')
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get("https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154")
    print(driver.page_source)
    driver.quit()
  • 控制台输出:

    <html><head><title>Dr. Rajeev Puri - ENT/ Otorhinolaryngologist - Book Appointment Online, View Fees, Feedbacks | Practo</title><meta name="description" content="Dr. Rajeev Puri is an ENT/ Otorhinolaryngologist in DLF Phase IV. Book appointments Online, View Fees, User Feedbacks for Dr. Rajeev Puri | Practo"><meta charset="utf-8"><meta http-equiv="x-ua-compatible" content="ie=edge"><script src="https://js-agent.newrelic.com/nr-spa-1026.min.js"></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="https://surveys-static.survicate.com/widget_core-3.0.4.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script type="text/javascript" async="" src="https://www.googleadservices.com/pagead/conversion_async.js" nonce=""></script><script type="text/javascript" async="" src="https://www.google-analytics.com/plugins/ua/ec.js" nonce=""></script><script type="text/javascript" async="" src="https://www.practostatic.com/pel/clevertap/a.js"></script><script async="" src="//sweep.practo.com/sp.js"></script><script type="text/javascript" src="https://www.practostatic.com/pel/pel-1.6.1.js"></script><script async="" src="https://connect.facebook.net/en_US/fbevents.js"></script><script async="" src="//www.google-analytics.com/analytics.js"></script><script async="" src="https://www.googletagmanager.com/gtm.js?id=GTM-PSMVGL5"></script><script nonce="" type="text/javascript">(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
    new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
    j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
    'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
    })(window,document,'script','dataLayer',"GTM-PSMVGL5");</script>

其他注意事项

确保:


引用

您可以在 Call to eval() blocked by CSP with Selenium IDE 中找到相关讨论

关于selenium - 拒绝加载脚本,因为它违反了以下内容安全策略指令 : script-src error with ChromeDriver Chrome and Selenium,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/59207838/

33 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com