
python - Selecting text data using Beautiful Soup

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:36:06

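I am trying to use Python's Beautiful Soup to select text data from the HTML below, but I'm having trouble. Each <p> starts with a heading inside <b>, and I want everything except that heading. For example, the first row is labeled "Assessment Type", but I only want "Capacity curve". Here is what I have so far: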

modelinginfo = soup.find("div", {"id": "genInfo"})  # this is my raw data
rows = modelinginfo.findChildren(['p'])  # this is the data displayed below
for row in rows:
    print(row)
    print('\n')
    cells = row.findChildren('p')
    for cell in cells:
        value = cell.string
        print("The value in this cell is %s" % value)


[<p><b>Assessment Type: </b>Capacity curve</p>,
<p><b>Name: </b>Borzi et al (2008) - Capacity-Xdir 4Storeys InfilledFrame NonSismicallyDesigned</p>,
<p><b>Category: </b>Structure specific - Building</p>,
<p><b>Taxonomy: </b>CR/LFINF+DNO/HEX:4 (GEM)</p>,
<p><b>Reference: </b>The influence of infill panels on vulnerability curves for RC buildings (Borzi B., Crowley H., Pinho R., 2008) - Proceedings of the 14th World Conference on Earthquake Engineering, Beijing, China</p>,
<p><b>Web Link: </b><a href="http://www.iitk.ac.in/nicee/wcee/article/14_09-01-0111.PDF" style="color:blue" target="_blank"> http://www.iitk.ac.in/nicee/wcee/article/14_09-01-0111.PDF</a></p>,
<p><b>Methodology: </b>Analytical</p>,
<p><b>General Comments: </b>Sample Data: A 4-storey building designed according to the 1992 Italian design code (DM, 1992), considering gravity loads only, and the Decreto Ministeriale 1996 (DM, 1996) when considering seismic action (the seismically designed building has been designed assuming a lateral force equal to 10% of the seismic weight, c=10%, and with a triangular distribution shape).

The Y axis in the capacity curve represent the collapse multiplier: Base shear resistance over seismic weight.</p>,
<p><b>Geographical Applicability: </b> Italy</p>]

Best answer

1.) You can iterate over the children of each p and print everything except the b tag:

# `cells` is assumed to hold the <p> tags (the list called `rows` in the question)
for cell in cells:
    for element in cell.children:
        if element.name != 'b':
            print("The value in this cell is %s" % element)
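As a self-contained illustration of this first approach, here is a minimal sketch. The `SAMPLE_HTML` string and the helper name `values_without_label` are assumptions made for the example; the two rows are copied from the question's output, and the built-in `html.parser` is used so no extra parser is required.

```python
from bs4 import BeautifulSoup, Tag

# Two rows copied from the question's output, wrapped in the
# id="genInfo" container that the question selects (assumed layout).
SAMPLE_HTML = """
<div id="genInfo">
<p><b>Assessment Type: </b>Capacity curve</p>
<p><b>Methodology: </b>Analytical</p>
</div>
"""


def values_without_label(html):
    """Return the text of each <p>, skipping its <b> label child."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"id": "genInfo"})
    values = []
    for p in container.find_all("p"):
        parts = []
        for child in p.children:
            if isinstance(child, Tag):
                if child.name == "b":
                    continue  # skip the bold label
                # Nested tags such as <a> contribute their text only.
                parts.append(child.get_text())
            else:
                # Plain text nodes are string-like objects.
                parts.append(str(child))
        values.append("".join(parts).strip())
    return values


print(values_without_label(SAMPLE_HTML))
```

Collecting the pieces into a list rather than printing each child keeps rows that mix text and nested tags (such as the "Web Link" row) together as one value.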

2.) You can use the extract() method to remove the unwanted b tag:

for cell in cells:
    if cell.b:
        # remove the "b" tag from the tree
        cell.b.extract()
    print("The value in this cell is %s" % cell)
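The second approach can be sketched end to end as follows. Formatting a tag with `%s` prints its markup, including the surrounding `<p>` element, so this sketch calls `get_text()` to obtain only the text. The `SAMPLE_HTML` string and the helper name `strip_labels` are assumptions for illustration; the rows are copied from the question's output.

```python
from bs4 import BeautifulSoup

# Two rows copied from the question's output, wrapped in the
# id="genInfo" container that the question selects (assumed layout).
SAMPLE_HTML = """
<div id="genInfo">
<p><b>Category: </b>Structure specific - Building</p>
<p><b>Geographical Applicability: </b> Italy</p>
</div>
"""


def strip_labels(html):
    """Remove the <b> label from each <p> and return the remaining text."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"id": "genInfo"})
    values = []
    for p in container.find_all("p"):
        if p.b:
            # extract() detaches the tag from the parsed tree in place.
            p.b.extract()
        values.append(p.get_text(strip=True))
    return values


print(strip_labels(SAMPLE_HTML))
```

Note that `extract()` modifies the parsed tree, so the labels are no longer available from the same `soup` afterwards; if they are still needed, read them before extracting or use the non-mutating iteration from the first approach.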

Regarding "python - Selecting text data using Beautiful Soup", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/37195294/
