gpt4 book ai didi

python - Pandas 数据归约与合并

转载 作者:太空宇宙 更新时间:2023-11-04 10:09:00 24 4
gpt4 key购买 nike

我正在使用如下所示的 Pandas(0.17.1 版)DataFrame:

                         time   type   module     msg_type         content
36636 2016-08-25 17:59:50.051 INFO MOD_1_NAME STATUS Received Status Monitoring from MODULE_1 'Property A' = some_value_1
36637 2016-08-25 17:59:50.051 INFO MOD_1_NAME STATUS Received Status Monitoring from MODULE_1 'Property B' = some_value_2
36638 2016-08-25 17:59:50.051 INFO MOD_1_NAME STATUS Received Status Monitoring from MODULE_1 'Property C' = some_value_3
36639 2016-08-25 17:59:50.051 INFO MOD_1_NAME STATUS Received Status Monitoring from MODULE_1 'Property D' = some_value_4
36715 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 1' = some_value_a
36716 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 2' = some_value_b
36717 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 3' = some_value_c
36718 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 4' = some_value_d
36719 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 5' = some_value_e
36720 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 6' = some_value_f
36721 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 7' = some_value_g
36722 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 8' = some_value_h
36723 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 9' = some_value_i
36724 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 10' = some_value_j
36725 2016-08-25 17:59:50.964 ERROR MOD_2_NAME STATUS Didn't receive Status Monitoring 'Parameter 11' from MODULE_2!
36726 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 12' = some_value_k
36727 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 13' = some_value_l
36785 2016-08-25 18:59:50.051 INFO MOD_1_NAME STATUS Received Status Monitoring from MODULE_1 'Property A' = some_value_1
36786 2016-08-25 18:59:50.051 INFO MOD_1_NAME STATUS Received Status Monitoring from MODULE_1 'Property B' = some_value_2
36787 2016-08-25 18:59:50.051 INFO MOD_1_NAME STATUS Received Status Monitoring from MODULE_1 'Property C' = some_value_3
36788 2016-08-25 18:59:50.051 INFO MOD_1_NAME STATUS Received Status Monitoring from MODULE_1 'Property D' = some_value_4
36827 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 1' = some_value_a
36828 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 2' = some_value_b
36829 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 3' = some_value_c
36830 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 4' = some_value_d
36831 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 5' = some_value_e
36832 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 6' = some_value_f
36833 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 7' = some_value_g
36834 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 8' = some_value_h
36835 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 9' = some_value_i
36836 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 10' = some_value_j
36837 2016-08-25 19:01:50.964 ERROR MOD_2_NAME STATUS Didn't receive Status Monitoring 'Parameter 11' from MODULE_2!
36838 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 12' = some_value_k
36839 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS Received Status Monitoring from MODULE_2 'Parameter 13' = some_value_l

(框架已经被缩减以删除不感兴趣的行。这就是索引列缺少数字的原因)

如您所见,同时从设备读取了多个参数。每个读数都是单独的一行。我想做一些“减少”和“压缩”,以便每个读数只有一行。我还希望 content 列成为一个字典,这样我就可以轻松地查找感兴趣的特定项目。所以结果看起来像这样:

                         time   type   module     msg_type         content
36636 2016-08-25 17:59:50.051 INFO MOD_1_NAME STATUS {'Property A' = 'some_value_1', 'Property B' = 'some_value_2', 'Property C' = 'some_value_3', 'Property D' = 'some_value_4'}
36715 2016-08-25 17:59:50.964 INFO MOD_2_NAME STATUS {'Parameter 1' = 'some_value_a', 'Parameter 2' = 'some_value_b', 'Parameter 3' = 'some_value_c', 'Parameter 4' = 'some_value_d', 'Parameter 5' = 'some_value_e', 'Parameter 6' = 'some_value_f', 'Parameter 7' = 'some_value_g','Parameter 8' = some_value_h, 'Parameter 9' = 'some_value_i', 'Parameter 10' = 'some_value_j', 'Parameter 11' = '', 'Parameter 12' = 'some_value_k', 'Parameter 13' = 'some_value_l'}
36785 2016-08-25 18:59:50.051 INFO MOD_1_NAME STATUS {'Property A' = 'some_value_1', 'Property B' = 'some_value_2', 'Property C' = 'some_value_3', 'Property D' = 'some_value_4'}
36827 2016-08-25 19:01:50.964 INFO MOD_2_NAME STATUS {'Parameter 1' = 'some_value_a', 'Parameter 2' = 'some_value_b', 'Parameter 3' = 'some_value_c', 'Parameter 4' = 'some_value_d', 'Parameter 5' = 'some_value_e', 'Parameter 6' = 'some_value_f', 'Parameter 7' = 'some_value_g','Parameter 8' = some_value_h, 'Parameter 9' = 'some_value_i', 'Parameter 10' = 'some_value_j', 'Parameter 11' = '', 'Parameter 12' = 'some_value_k', 'Parameter 13' = 'some_value_l'}

所以基本上我希望所有行的 timemodule 列都具有相同的值,并且它们的 contents 列解析成字典。 (也可能有一些“缺失”或“空”的读数。)我不想过滤或删除数据,只是减少和总结它。

我猜想我需要给你一些 groupby()transform()apply() 的组合,但我不知道从哪里开始。

我的部分困难在于我无法检查 groupby() 的结果以查看它是否按照我的要求进行。

g1 = df.groupby(['module', 'time'])

g1 没有出现在 Spyder 变量浏览器中。 printing 不显示任何内容。我无法访问属性 index 或调用 g1 上的 info()。但我怀疑 groupby() 在这里是否值得......我不想消除任何东西。

一直在进行一些搜索以找到示例,但不断得到看似误报的结果。任何入门帮助都将不胜感激。

最佳答案

pv = df.set_index(['time', 'type', 'module', 'msg_type']) \
.content.str.extract(r"'(?P<prop>.+)' = (?P<val>.+)", expand=True)

pv.groupby(level=[0, 2]).apply(lambda df: df.set_index('prop').val.to_dict())

2016-08-25 17:59:50.051,MOD_1_NAME,"{'Property A': 'some_value_1', 'Property C': 'some_value_3', 'Property B': 'some_value_2', 'Property D': 'some_value_4'}"
2016-08-25 17:59:50.964,MOD_2_NAME,"{'Parameter 6': 'some_value_f', 'Parameter 7': 'some_value_g', 'Parameter 4': 'some_value_d', 'Parameter 5': 'some_value_e', 'Parameter 2': 'some_value_b', 'Parameter 3': 'some_value_c', 'Parameter 1': 'some_value_a', 'Parameter 8': 'some_value_h', 'Parameter 9': 'some_value_i', 'Parameter 10': 'some_value_j', 'Parameter 12': 'some_value_k', 'Parameter 13': 'some_value_l'}"
2016-08-25 18:59:50.051,MOD_1_NAME,"{'Property A': 'some_value_1', 'Property C': 'some_value_3', 'Property B': 'some_value_2', 'Property D': 'some_value_4'}"
2016-08-25 19:01:50.964,MOD_2_NAME,"{'Parameter 6': 'some_value_f', 'Parameter 7': 'some_value_g', 'Parameter 4': 'some_value_d', 'Parameter 5': 'some_value_e', 'Parameter 2': 'some_value_b', 'Parameter 3': 'some_value_c', 'Parameter 1': 'some_value_a', 'Parameter 8': 'some_value_h', 'Parameter 9': 'some_value_i', 'Parameter 10': 'some_value_j', 'Parameter 12': 'some_value_k', 'Parameter 13': 'some_value_l'}"

关于python - Pandas 数据归约与合并,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/39421350/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com