
python - How to get the src link with XPath

Reposted. Author: 行者123. Updated: 2023-12-01 02:32:04

This is the HTML:

<div class="c" id="M_Fp01sdJgm">
<div>
<a class="nk" href="https://weibo.cn/thebs">figre</a>
<img src="https://h5.sinaimg.cn/upload/2016/05/26/319/5338.gif" alt="V"/>
<img src="https://h5.sinaimg.cn/upload/2016/05/26/319/donate_btn_s.png" alt="M"/>
<span class="ctt">
":"resampling
<span class="kt">resampling</span>
":Cleantech entrepreneurs are splicing genes in the search for greener fuels
​</span>&nbsp;
[<a href="https://weibo.cn/mblog/picAll/Fp01sdJgm?rl=2">2 pieces of the package</a>
</div>
<div>
<a href="https://weibo.cn/mblog/pic/Fp01sdJgm?rl=1">
<img src="http://wx1.sinaimg.cn/wap180/3ed2e6e8gy1fk7hohl2i5j219s0ps4qp.jpg" alt="images" class="ib" />
</a>&nbsp;
<a href="https://weibo.cn/mblog/oripic?id=Fp01sdJgm&amp;u=3ed2e6e8gy1fk7hohl2i5j219s0ps4qp">image</a>&nbsp;
<a href="https://weibo.cn/attitude/Fp01sdJgm/add?uid=5757914684&amp;rl=1&amp;st=7b15a6">praise[28094]</a>&nbsp;
<a href="https://weibo.cn/repost/Fp01sdJgm?uid=1054009064&amp;rl=1">transmit[1164]</a>&nbsp;
<a href="https://weibo.cn/comment/Fp01sdJgm?uid=1054009064&amp;rl=1#cmtfrm" class="cc">comment[4097]</a>&nbsp;<a href="https://weibo.cn/fav/addFav/Fp01sdJgm?rl=1&amp;st=7b15a6">save</a>
"<!---->&nbsp;"
<span class="ct">10月05日 20:08&nbsp;from iPhone 7 Plus

I tried writing the following. The other fields are retrieved, but 'img' is empty:

def get_user_data(self,start_url):
    html = requests.get(url=start_url,headers=self.headers,cookies=self.cookies).content
    selector = etree.fromstring(html,etree.HTMLParser(encoding='utf-8'))
    all_user = selector.xpath('//div[contains(@class,"c") and contains(@id,"M")]')
    for i in all_user:
        user_id = i.xpath('./div[1]/a[@class="nk"]/@href')
        content = i.xpath('./div[1]/span[1]')[0]
        contents = content.xpath('string(.)')
        if i.xpath('./div[2]'):
            img = selector.xpath('./div[2]/a/img/@src') #img is None
            praise_num = i.xpath('./div[2]/a[3]/text()')
            transmit_num = i.xpath('./div[2]/a[4]/text()')
        else:
            img = ''
            praise_num = i.xpath('./div[2]/a[3]/text()')
            transmit_num = i.xpath('./div[2]/a[4]/text()')

How should I write the 'img' expression? And can I then combine the fields with zip()? I need to save them to MySQL.

Best answer

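Try this (your image is under div[1]). Note that the path is evaluated on i, the current post, rather than on selector, the document root: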

img = i.xpath('./div[1]/a/img/@src') 
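A relative path such as './div[2]/...' is resolved against the element it is called on. In the question it is called on selector, the document root, so it looks for div children of the root element and finds none. The following is a minimal sketch of a per-post extraction, not the original poster's code: the names HTML, first, and extract_rows are hypothetical, the markup is a trimmed copy of the snippet from the question, and matching the thumbnail by class="ib" and the counters by their link text is an assumption about the page. In the snippet as posted the thumbnail sits in the second inner div, so the sketch avoids depending on the div index.

```python
from lxml import etree

# Trimmed copy of the markup from the question (one post, two inner divs).
HTML = """
<div class="c" id="M_Fp01sdJgm">
  <div>
    <a class="nk" href="https://weibo.cn/thebs">figre</a>
    <span class="ctt">Cleantech entrepreneurs are splicing genes</span>
  </div>
  <div>
    <a href="https://weibo.cn/mblog/pic/Fp01sdJgm?rl=1">
      <img src="http://wx1.sinaimg.cn/wap180/3ed2e6e8gy1fk7hohl2i5j219s0ps4qp.jpg" alt="images" class="ib" />
    </a>
    <a href="https://weibo.cn/mblog/oripic?id=Fp01sdJgm">image</a>
    <a href="https://weibo.cn/attitude/Fp01sdJgm/add">praise[28094]</a>
    <a href="https://weibo.cn/repost/Fp01sdJgm">transmit[1164]</a>
  </div>
</div>
"""


def first(values, default=""):
    """Return the first XPath result, or a default when nothing matched."""
    return values[0] if values else default


def extract_rows(html_bytes):
    """Return one (user_url, text, img_src, praise, transmit) tuple per post."""
    root = etree.fromstring(html_bytes, etree.HTMLParser(encoding="utf-8"))
    rows = []
    for post in root.xpath('//div[@class="c" and starts-with(@id, "M_")]'):
        # Every path below starts with "." and is called on `post`,
        # so it is resolved relative to the current post only.
        user_url = post.xpath('./div[1]/a[@class="nk"]/@href')
        text_nodes = post.xpath('./div[1]/span[@class="ctt"]')
        text = text_nodes[0].xpath("string(.)").strip() if text_nodes else ""
        img = post.xpath('.//img[@class="ib"]/@src')
        praise = post.xpath('.//a[starts-with(text(), "praise")]/text()')
        transmit = post.xpath('.//a[starts-with(text(), "transmit")]/text()')
        rows.append((first(user_url), text, first(img), first(praise), first(transmit)))
    return rows


if __name__ == "__main__":
    for row in extract_rows(HTML.encode("utf-8")):
        print(row)
```

Because each post yields one complete tuple, the fields stay aligned without a separate zip() over parallel lists, and a post without an image simply gets an empty string. A list of tuples in this shape is also the form that a DB-API cursor's executemany() accepts, which may suit the MySQL insert mentioned in the question.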

Regarding python - How to get the src link with XPath, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46688439/
