
python - How to convert a list of 3-tuples into a list of tuples using groupby

Reposted. Author: 行者123. Updated: 2023-12-03 21:27:50

I have the following string:

test = '''AWS-1 - opened at Jan 23 2010 10:30:08AM 
AWS-2 - opened at Jan 23 2010 11:04:56AM
AWS-2 - closed at Jan 23 2010 1:18:32PM
AWS-1 - closed at Jan 23 2010 9:43:44PM
AWS-1 - opened at Feb 1 2010 12:40:28AM
AWS-1 - closed at Jan 23 2010 9:43:44PM
'''


My code:

import re
from itertools import groupby
y = re.findall(r'\b(\w+-\d+)\s+-\s+(\w+[-.\w]+)\s+at\s+(\w+[\s:.\w]+)\n', test)
print (y)
for key, time in groupby(y,lambda z: y[2]):
    for thing in y:
        print( (y[1], key))
    print (" ")


My output:

(('AWS-2', 'opened', 'Jan 23 2010 11:04:56AM '), ('AWS-2', 'closed', 'Jan 23 2010 1:18:32PM '))
(('AWS-2', 'opened', 'Jan 23 2010 11:04:56AM '), ('AWS-2', 'closed', 'Jan 23 2010 1:18:32PM '))
(('AWS-2', 'opened', 'Jan 23 2010 11:04:56AM '), ('AWS-2', 'closed', 'Jan 23 2010 1:18:32PM '))
(('AWS-2', 'opened', 'Jan 23 2010 11:04:56AM '), ('AWS-2', 'closed', 'Jan 23 2010 1:18:32PM '))
(('AWS-2', 'opened', 'Jan 23 2010 11:04:56AM '), ('AWS-2', 'closed', 'Jan 23 2010 1:18:32PM '))


AWS-1 never appears; I get AWS-2 everywhere. Expected output:

(('AWS-1', 'opened', 'Jan 23 2010 10:30:08AM '), ('AWS-1', 'closed', 'Jan 23 2010 9:43:44PM '))
(('AWS-1', 'opened', 'Feb 1 2010 12:40:28AM'), ('AWS-1', 'closed', 'Feb 23 2010 9:43:44PM'))
(('AWS-2', 'opened', 'Jan 23 2010 11:04:56AM '), ('AWS-2', 'closed', 'Jan 23 2010 1:18:32PM '))

Best answer

Your request is unclear, but it appears you want opened/closed pairs built for each id.

Given

import re

import dateutil.parser  # the parser submodule must be imported explicitly


records = """\
AWS-1 - opened at Jan 23 2010 10:30:08AM
AWS-2 - opened at Jan 23 2010 11:03:56AM
AWS-2 - closed at Jan 23 2010 1:18:32PM
AWS-1 - closed at Feb 27 2010 9:32:50PM
AWS-1 - opened at Feb 1 2010 12:50:28AM
AWS-1 - closed at Jan 23 2010 9:32:50PM
"""


def splitlines(s: str) -> tuple:
    """Return tuples of parsed lines: id, status, time."""
    res = []

    for line in s.split("\n"):

        if not line:
            continue
        # Capturing groups keep the separators and add None placeholders;
        # filter(None, ...) drops the placeholders, unpacking discards the rest.
        parsed = tuple(map(str.strip, filter(None, re.split(r"(\s-\s)|(at)", line))))
        id_, _, status, _, time = parsed
        data = id_, status, dateutil.parser.parse(time)
        res.append(data)

    return tuple(res)


def pairwise_records(s: str) -> list:
    """Return paired records, sorted by id, then time, then status."""
    key = lambda x: (x[0], x[2], x[1])

    # A generator is a single iterator, so zip below consumes two records per pair.
    sorted_recs = ((i, st, str(t)) for i, st, t in sorted(splitlines(s), key=key))

    return list(zip(sorted_recs, sorted_recs))


Demo

pairwise_records(records)


Output

[(('AWS-1', 'opened', '2010-01-23 10:30:08'),
  ('AWS-1', 'closed', '2010-01-23 21:32:50')),
 (('AWS-1', 'opened', '2010-02-01 00:50:28'),
  ('AWS-1', 'closed', '2010-02-27 21:32:50')),
 (('AWS-2', 'opened', '2010-01-23 11:03:56'),
  ('AWS-2', 'closed', '2010-01-23 13:18:32'))]




Details

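Credit to the OP for getting a partial answer with an elaborate regular expression. It turns out you can do this more explicitly with a simpler regex and without groupby.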
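For comparison, the groupby route from the question can also be made to work. In the question's code, the key function `lambda z: y[2]` ignores its argument `z` and returns the same element of `y` for every record, so groupby sees one constant key and the loop prints the same pair each time. The sketch below assumes the records are already parsed into (id, status, time) tuples; the sample data and the names `parsed`, `ordered` and `grouped` are illustrative and not part of the original answer.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Hypothetical, already-parsed records: (id, status, time).
parsed = [
    ("AWS-1", "opened", datetime(2010, 1, 23, 10, 30, 8)),
    ("AWS-2", "opened", datetime(2010, 1, 23, 11, 3, 56)),
    ("AWS-2", "closed", datetime(2010, 1, 23, 13, 18, 32)),
    ("AWS-1", "closed", datetime(2010, 1, 23, 21, 32, 50)),
]

# groupby only merges adjacent items, so sort by id (then time) first.
ordered = sorted(parsed, key=lambda rec: (rec[0], rec[2]))

# The key function must use its own argument; itemgetter(0) picks the id.
grouped = {key: list(group) for key, group in groupby(ordered, key=itemgetter(0))}

for key, recs in grouped.items():
    print(key, [status for _, status, _ in recs])
```

Each group then holds the records of one id in time order, which can be paired in the same way as in pairwise_records.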

splitlines

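We parse the input string into tuples. We use re.split for this, which leaves unwanted extra elements. Those extras are cleaned up with filter and discarded while unpacking into (id_, status, time), where time is parsed into a datetime object. The result is a tuple of parsed lines. Example: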

splitlines("AWS-1 - opened at Jan 23 2010 10:30:08AM")
# (('AWS-1', 'opened', datetime.datetime(2010, 1, 23, 10, 30, 8)),)
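To see where the "extra elements" come from, here is a small sketch of what re.split returns for one line when the pattern contains capturing groups. The variable names `raw`, `cleaned` and `direct` are illustrative; the last step shows a non-capturing variant, which is an alternative and not what the answer itself uses.

```python
import re

line = "AWS-1 - opened at Jan 23 2010 10:30:08AM"

# Capturing groups make re.split keep the separators; the group that did
# not take part in a given match shows up as None.
raw = re.split(r"(\s-\s)|(at)", line)
print(raw)

# filter(None, ...) drops the None entries (and any empty strings);
# str.strip removes the whitespace left around each piece.
cleaned = [part.strip() for part in filter(None, raw)]
print(cleaned)

# Unpacking throws away the two separator pieces.
id_, _, status, _, time = cleaned

# Alternative: without capturing groups nothing extra is returned.
direct = re.split(r"\s-\s|\sat\s", line)
print(direct)
```

The non-capturing pattern also requires whitespace around "at", so it would not split inside a word that happens to contain those two letters.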


pairwise_records

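We sort the tuples by id and datetime. Sorting by time naturally puts each pair in order: if something is opened at 9 AM, it must be closed at some later time. Finally, we use the iterator "trick" of zipping one iterator with itself to pair up the results.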
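The pairing trick in isolation, as a minimal sketch: zip is handed the same iterator twice, so each output tuple pulls two consecutive items from it. The list `events` is made-up sample data.

```python
events = ["open-1", "close-1", "open-2", "close-2"]

# One iterator, passed twice: zip takes the first element of each pair from
# it, then the second element from the very same iterator.
it = iter(events)
pairs = list(zip(it, it))
print(pairs)

# Passing the list itself twice would pair every item with itself instead,
# because zip would create two independent iterators.
not_pairs = list(zip(events, events))
```

This only works with a true iterator (such as the generator in pairwise_records). If the number of items is odd, zip silently drops the last one, so an unmatched "opened" record would disappear from the result.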

Regarding "python - How to convert a list of 3-tuples into a list of tuples using groupby", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58959495/
