I am scraping this page: http://www.modeluxproperties.com/?act=list_web&m=search&purpose=sale&project=&type=32&beds=&lop=&Submit.x=37&Submit.y=20
I want to get the value of the Parking field.
The HTML looks like this:
<span class="smallredtext" style="font-size:12px;">
<img src="images/listwebpoint.png" width="6" height="6"> Status: for <b>Sale</b>
<img src="images/listwebpoint.png" width="6" height="6"> Ref No: <b>AFS503</b>
<img src="images/listwebpoint.png" width="6" height="6"> BUA: <b>1700 Sq.Ft.</b>
<img src="images/listwebpoint.png" width="6" height="6"> Bedroom: <b>2</b>
<img src="images/listwebpoint.png" width="6" height="6"> Bathroom: <b>3</b>
<img src="images/listwebpoint.png" width="6" height="6"> Parking: <b>1</b>
</span>
This is my XPath:
.//span[@class='smallredtext'][normalize-space(text())=Parking:]/following-sibling::b[1]/text()
I get this error:
raise ValueError("Invalid XPath: %s" % query)
ValueError: Invalid Xpath: //span[@class='smallredtext'][normalize-space(text())=Parking:]/following-sibling::b[1]/text()
I am using Scrapy in Python 0.27.
Find the b tag and check its preceding-sibling text node:
.//span[@class='smallredtext']/b[preceding-sibling::text()=' Parking: ']/text()
UPD (using normalize-space(), so the surrounding whitespace no longer matters):
.//span[@class='smallredtext']/b[preceding-sibling::text()[normalize-space() = 'Parking:']]/text()
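Both expressions can be checked outside a spider. The following is a minimal sketch that assumes lxml is installed (Scrapy's selectors are built on it) and uses a shortened copy of the HTML from the question; the names SNIPPET, EXACT_QUERY, NORMALIZED_QUERY, and parking_values are illustrative and not part of Scrapy. It also shows why the original expression could not work: the string literal Parking: has to be quoted, and each b element is a child of the span rather than a following sibling of it.

```python
from lxml import etree

# Shortened copy of the markup from the question (assumed to be representative).
SNIPPET = """
<span class="smallredtext" style="font-size:12px;">
<img src="images/listwebpoint.png" width="6" height="6"> Bedroom: <b>2</b>
<img src="images/listwebpoint.png" width="6" height="6"> Bathroom: <b>3</b>
<img src="images/listwebpoint.png" width="6" height="6"> Parking: <b>1</b>
</span>
"""

# Exact match: the text node must be ' Parking: ' including both spaces.
EXACT_QUERY = (
    ".//span[@class='smallredtext']"
    "/b[preceding-sibling::text()=' Parking: ']/text()"
)

# Whitespace-tolerant match: normalize-space() trims the text node first.
NORMALIZED_QUERY = (
    ".//span[@class='smallredtext']"
    "/b[preceding-sibling::text()[normalize-space() = 'Parking:']]/text()"
)


def parking_values(markup, query=NORMALIZED_QUERY):
    """Parse an HTML string and return the text of the matching b elements."""
    root = etree.HTML(markup)  # root is the <html> element built by the parser
    return root.xpath(query)


if __name__ == "__main__":
    print(parking_values(SNIPPET))
    print(parking_values(SNIPPET, EXACT_QUERY))
```

One caveat: the predicate matches every b that has a Parking: text node anywhere before it among its siblings. That is fine here because Parking is the last field in the span. If more fields could follow it, restricting the test to the nearest text node with preceding-sibling::text()[1][normalize-space() = 'Parking:'] would likely be safer.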