
regex - Reading a file with start and stop conditions in Python

Reposted. Author: 太空宇宙. Updated: 2023-11-04 05:41:11

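Hi, I have the file data below and want to process it into the expected output. As a Python learner, I am wondering whether there is a way to do this based on start and stop boolean indexing.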

Records in the file begin with the string SRV:. In some cases a record starts and ends on the same line, while in others it continues across the following lines.

File text data:

SRV: this is for bryan

SRV: this is for terry

SRV: this is for torain
sec01: This is reserved
sec02: This is open for all
sec03: Closed!

SRV: this is for Jun

Expected output:

SRV: this is for bryan

SRV: this is for terry

SRV: this is for torain sec01: This is reserved sec02: This is open for all sec03: Closed!

SRV: this is for Jun

Is there a more Pythonic way to achieve this? I can also use pandas.

Best answer

Use Series.str.startswith with Series.cumsum to build the groups, then aggregate with GroupBy.agg and join:

df1 = (df['col'].groupby(df['col'].str.startswith('SRV').cumsum())
                .agg(' '.join)
                .reset_index(drop=True)
                .to_frame(name='new'))
print(df1)
                                                 new
0                             SRV: this is for bryan
1                             SRV: this is for terry
2  SRV: this is for torain sec01: This is reserve...
3                               SRV: this is for Jun

Details:

print(df['col'].str.startswith('SRV').cumsum())
0    1
1    2
2    3
3    3
4    3
5    3
6    4
Name: col, dtype: int32
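
The running sum above is what turns a "starts a new record" flag into a group label. As a rough sketch of the same idea without pandas, assuming the lines are already in a list with blank lines removed (the names `flags`, `group_ids`, `groups` and `merged` are illustrative, not from the original answer):

```python
from itertools import accumulate

lines = [
    "SRV: this is for bryan",
    "SRV: this is for terry",
    "SRV: this is for torain",
    "sec01: This is reserved",
    "sec02: This is open for all",
    "sec03: Closed!",
    "SRV: this is for Jun",
]

# 1 where a line starts a new record, 0 where it continues the previous one
flags = [int(line.startswith("SRV")) for line in lines]

# Running sum of the flags: every line of one record gets the same id
group_ids = list(accumulate(flags))

# Collect the lines of each record under its id, then join them
groups = {}
for gid, line in zip(group_ids, lines):
    groups.setdefault(gid, []).append(line)
merged = [" ".join(parts) for parts in groups.values()]

print(group_ids)
for record in merged:
    print(record)
```

Each SRV line bumps the counter by one, so continuation lines inherit the id of the SRV line before them; that is the property the groupby in the answer relies on.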

To build the DataFrame, use:

import pandas as pd
from io import StringIO  # pd.compat.StringIO was removed in newer pandas versions

temp = u"""col
SRV: this is for bryan

SRV: this is for terry

SRV: this is for torain
sec01: This is reserved
sec02: This is open for all
sec03: Closed!

SRV: this is for Jun"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
# blank lines are skipped by default (skip_blank_lines=True)
df = pd.read_csv(StringIO(temp), sep="|")

print(df)
                           col
0       SRV: this is for bryan
1       SRV: this is for terry
2      SRV: this is for torain
3      sec01: This is reserved
4  sec02: This is open for all
5               sec03: Closed!
6         SRV: this is for Jun

Pure Python solution:

from itertools import groupby
from operator import itemgetter

out = []
with open("file.csv") as f1:
    last = 0
    for i, line in enumerate(f1):
        line = line.strip()
        if not line:
            continue  # skip blank separator lines so they add no stray spaces
        if line.startswith('SRV'):
            last = i  # index of the most recent SRV line is the group key
        out.append([line, last])

with open("out_file.csv", "w") as f2:
    # consecutive entries sharing a key belong to one record
    for _, g in groupby(out, key=itemgetter(1)):
        f2.write(' '.join(map(itemgetter(0), g)) + '\n')
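
The pure Python solution above tags each line with a group key and then groups in a second step. A possible single-pass variant is sketched below; it is not from the original answer, and the helper name `merge_records` and its `start` parameter are made up for illustration. It assumes every record begins with a line starting with "SRV" and that blank lines are only separators:

```python
def merge_records(lines, start="SRV"):
    """Join continuation lines onto the most recent line that begins with `start`."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines only separate records
        if line.startswith(start) or not records:
            records.append(line)          # start condition: open a new record
        else:
            records[-1] += " " + line     # continuation: extend the open record
    return records


sample = """SRV: this is for bryan

SRV: this is for terry

SRV: this is for torain
sec01: This is reserved
sec02: This is open for all
sec03: Closed!

SRV: this is for Jun"""

merged = merge_records(sample.splitlines())
print("\n\n".join(merged))
```

Because the function accepts any iterable of lines, an open file object could be passed in place of `sample.splitlines()`. The next SRV line acts as the stop condition for the previous record, so no explicit stop flag is needed.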

About regex - Reading a file with start and stop conditions in Python: we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57344763/
