
java - Selenium WebDriver getPageSource() misplaces attributes and values that contain escaped values

Reposted. Author: 行者123. Updated: 2023-12-01 12:30:09

While using Selenium, I just ran into an error when parsing the output of the getPageSource() method. The actual meta tag in the page source, as shown by Firefox, is:

  <meta name="news_keywords" content="devo max,independence vote,no campaign,referendum,scotland \"no\" vote,scotland independence,scotland powers,scotland referendum,scotland vote,scottish referendum" />

The result of getPageSource() with Selenium's Firefox driver is:

<meta referendum"="" vote,scottish="" referendum,scotland="" powers,scotland="" independence,scotland="" vote,scotland="" no\"="" content="devo max,independence vote,no campaign,referendum,scotland \" name="news_keywords" />

This is quite absurd, and it causes problems when the HTML output is processed further. Any suggestions, help, or workarounds?
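A likely explanation for the scrambled output: HTML does not treat a backslash as an escape character inside attribute values, so the `"` after `\` closes the content value and the remaining words get parsed as separate, empty attributes. If you control the page markup, the quote would need to be written as `&quot;` instead. The sketch below shows one way to do that in plain Java; the class name AttrEscape and the method escapeAttribute are made up for illustration and are not part of Selenium.

```java
public class AttrEscape {

    // Hypothetical helper: escapes a string so it can be placed inside a
    // double-quoted HTML attribute value. HTML has no backslash escaping;
    // character references such as &quot; are used instead.
    static String escapeAttribute(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Shortened version of the keyword list from the question.
        String keywords = "referendum,scotland \"no\" vote";
        System.out.println("<meta name=\"news_keywords\" content=\""
                + escapeAttribute(keywords) + "\" />");
        // prints: <meta name="news_keywords" content="referendum,scotland &quot;no&quot; vote" />
    }
}
```

With the value written this way, a browser keeps the whole keyword list inside the single content attribute.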

Best answer

From the documentation:

getPageSource

java.lang.String getPageSource()

Get the source of the last loaded page. If the page has been modified after loading (for example, by Javascript) there is no guarantee that the returned text is that of the modified page. Please consult the documentation of the particular driver being used to determine whether the returned text reflects the current state of the page or the text last sent by the web server. The page source returned is a representation of the underlying DOM: do not expect it to be formatted or escaped in the same way as the response sent from the web server. Think of it as an artist's impression.

Returns: The source of the current page

http://selenium.googlecode.com/git/docs/api/java/org/openqa/selenium/WebDriver.html#getPageSource%28%29
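Since getPageSource() returns a serialization of the DOM rather than the server's response, one possible workaround when the exact markup matters is to fetch the page separately over HTTP. The following is a minimal sketch using the JDK's java.net.http client (Java 11+), not anything provided by Selenium. The class name RawPageSource and its methods are hypothetical, and the request will not carry the browser's cookies or session, nor reflect changes made by JavaScript.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RawPageSource {

    // Hypothetical helper: builds a plain GET request for the given URL.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url)).GET().build();
    }

    // Returns the response body as text, as sent by the server,
    // without passing it through a browser's HTML parser.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java RawPageSource <url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

Note that the raw text will still contain the `\"` sequence exactly as the server wrote it, so an HTML parser applied to it may split the attribute the same way the browser did. Having the unparsed text at least makes it possible to handle that tag specially.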

Regarding "java - Selenium WebDriver getPageSource() misplaces attributes and values that contain escaped values", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/25988884/
