
Python: deleting specific lines from a file

Reposted. Author: 行者123. Last updated: 2023-12-01 02:19:56

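I want to delete specific lines from this HTML file. I want to find the position of the string STARTDELETE, then delete everything from the line after it up to the line before the string ENDDELETE.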

To make this easier to follow, I have marked the lines to be deleted with "xxx". How can I do this in Python?

<!DOCTYPE html>
<html lang="en">
<head>
<title>Bootstrap Example</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js"></script>
</head>
<body>
<div class="container">
<h2>Image Gallery</h2>
<div class="row"> <!--STARTDELETE-->
xxx<div class="col-xs-3">
xxx<div class="thumbnail">
xxx<a href="/w3images/lights.jpg" target="_blank">
xxx<img style="padding: 20px" src="xxx" alt="bla" >
xxx<div class="caption">
xxx<p>Test</p>
xxx</div>
xxx</a>
xxx</div>
xxx</div>
</div> <!--ENDDELETE-->
</div>
</body>
</html>

Best answer

Install beautifulsoup4, an HTML parser and DOM manipulator (for example, with pip install beautifulsoup4).

Read the data, use Beautiful Soup to build a "DOM" (a kind of walkable tree structure), find the element you want to empty, and then remove its children.

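In your example, it looks like you want to empty the <div> element(s) whose class is row, correct? Suppose your HTML is stored in a file named data.html. (That probably won't be the case in your actual situation; it is more likely to be the body of a request or something similar.)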

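from bs4 import BeautifulSoup

# Parse the file into a navigable tree
with open('data.html', 'r') as page_f:
    soup = BeautifulSoup(page_f.read(), "html.parser")
# In `soup` we have our "DOM tree"

# find() returns only the first <div class="row">
divs_to_empty = soup.find("div", {'class': 'row'})
# Remove each direct child tag, along with everything nested inside it
for child in divs_to_empty.find_all(recursive=False):
    child.decompose()

print(soup.prettify())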

Output:

<!DOCTYPE html>
<html lang="en">
 <head>
  <title>
   Bootstrap Example
  </title>
  <meta charset="utf-8"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet"/>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js">
  </script>
  <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js">
  </script>
 </head>
 <body>
  <div class="container">
   <h2>
    Image Gallery
   </h2>
   <div class="row">
    <!--STARTDELETE-->
   </div>
   <!--ENDDELETE-->
  </div>
 </body>
</html>
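
The answer mentions that there may be more than one <div class="row">, and that the HTML may arrive as a string (a request body, say) rather than a file. A minimal sketch of that variation follows; the helper name empty_rows and the sample HTML are made up for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup


def empty_rows(html):
    """Remove every child tag of each <div class="row"> in an HTML string.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns every matching div, not just the first one
    for row in soup.find_all("div", class_="row"):
        # recursive=False yields only direct child tags; decompose()
        # removes each one together with everything nested inside it
        for child in row.find_all(recursive=False):
            child.decompose()
    return soup


# Made-up sample input standing in for a request body
html = (
    '<div class="row"><!--STARTDELETE--><p>one</p></div><!--ENDDELETE-->'
    '<div class="row"><p>two</p></div>'
)
soup = empty_rows(html)
print(soup.prettify())
```

Only tags are removed here, so the marker comments stay in place, which matches the output shown above.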

If you are going to do DOM manipulation, I strongly recommend reading up on and using Beautiful Soup (it is very powerful).
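
If the comment markers, rather than the DOM structure, should decide what gets deleted (which is what the question literally asks for), a plain line-based pass is one possible alternative. This is only a sketch under the assumption that each marker sits on its own line, as in the question's file; the function name delete_between_markers and the sample lines are hypothetical.

```python
def delete_between_markers(lines, start="STARTDELETE", end="ENDDELETE"):
    """Return the lines with everything strictly between the marker lines removed.

    The lines that contain the markers themselves are kept.
    """
    kept = []
    deleting = False
    for line in lines:
        if deleting and end in line:
            deleting = False      # the end-marker line itself is kept
        if not deleting:
            kept.append(line)
        if start in line and end not in line:
            deleting = True       # start dropping from the next line on
    return kept


# Made-up sample shaped like the question's file
sample = [
    '<div class="row"> <!--STARTDELETE-->\n',
    '<div class="col-xs-3">\n',
    '<p>Test</p>\n',
    '</div>\n',
    '</div> <!--ENDDELETE-->\n',
]
cleaned = delete_between_markers(sample)
print("".join(cleaned), end="")
```

For a real file, the list passed in could come from readlines() and the result could be written back with writelines(). Unlike the Beautiful Soup approach, this treats the file as text and does not check that the remaining HTML is well-formed.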

Regarding deleting specific lines from a file in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48054822/
