html - XPath: selecting HTML with multiple spaces and line breaks

Reposted. Author: 行者123. Updated: 2023-12-02 20:55:56

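I am trying to select a div whose class attribute contains multiple spaces and line breaks. A snippet is below. I want to select every div that has both the test-one and topit classes, like this one: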

<div class="test-one
topit
">


<div class='test-one a'>1
</div>
<div class='topit'>2
</div>
</div>

Here is what I have tried:

"//div[contains(concat(' ', normalize-space(@class), ' '), ' topranks ') and contains(concat(' ', normalize-space(@class), ' '), ' list-node ')]"

//*[contains(concat(' ', normalize-space(@class), ' '), ' atag ')]
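For reference, the test that both expressions apply to @class can be modelled in plain Python. This is only a sketch of the idiom, not part of the original question. The helper name has_class is hypothetical, and str.split() treats a few more characters as whitespace than XPath's normalize-space() does.

```python
def has_class(class_attr: str, name: str) -> bool:
    """Mimic contains(concat(' ', normalize-space(@class), ' '), ' name ')."""
    # normalize-space(): trim both ends and collapse each whitespace run
    # (spaces, tabs, line breaks) into a single space.
    # Assumption: str.split() is a close enough stand-in for that here.
    normalized = " ".join(class_attr.split())
    # Padding both strings with spaces means only whole class tokens match.
    return f" {name} " in f" {normalized} "


# The class value of the outer div in the snippet, line breaks included
raw = "test-one\ntopit\n"

print(has_class(raw, "test-one"))  # True
print(has_class(raw, "topit"))     # True
print(has_class(raw, "test"))      # False: "test" is only part of a token
```

Because the whitespace is normalized before the comparison, line breaks inside the attribute value do not by themselves stop the expression from matching.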

Sources I tried to build on:

XPath - How to select by @text that contains new line

How can I match on an attribute that contains a certain string?

Best answer

Use cssselect, which translates CSS selectors into XPath:

import cssselect
import lxml.html

# Translate the CSS selector into the equivalent XPath expression
cssselect.GenericTranslator().css_to_xpath('div.test-one.topit')
# "descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' test-one ') and (@class and contains(concat(' ', normalize-space(@class), ' '), ' topit '))]"
tree = lxml.html.parse('http://www.made-in-china.com/companysearch.do?xcase=hunt&order=0&style=b&page=1&word=bag&size=30&sizeHasChanged=0&memberLevel=blank&sgsMembershipFlag=&comProvince=nolimit&comCity=&cateCode=&comBusinessType=blank&numEmployees=&annualRevenue=&code=0&managementCertification=').getroot()

# Select every div that carries both classes, whatever whitespace separates them
tree.cssselect('div.list-node.topranks')
# [<Element div at 0x7f62e732dd18>, <Element div at 0x7f62e72d1f48>, <Element div at 0x7f62e72eb188>, <Element div at 0x7f62e72eb0e8>, <Element div at 0x7f62e72eb138>, <Element div at 0x7f62e72eb1d8>, <Element div at 0x7f62e72eb228>, <Element div at 0x7f62e72eb278>, <Element div at 0x7f62e72eb2c8>, <Element div at 0x7f62e72eb318>]

Regarding "html - XPath: selecting HTML with multiple spaces and line breaks", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/31838672/
