python - Efficiency: Dropping rows with the same timestamp while still having the median of second column for that timestamp

Reposted by: 行者123  Updated: 2023-12-01 06:51:19

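What I want to do: the "angle" column tracks about 20 angles per second (this may vary). My "Time" timestamps, however, only have 1-second precision, so there are always roughly 20 rows with the same timestamp. The dataframe has more than 1 million rows in total. My result should be a new dataframe with one row per distinct timestamp, where the angle for each timestamp is the median of the ~20 angles recorded in that interval.

My idea: I iterate over the rows and check whether the timestamp has changed. If it has, I select all rows up to the change, compute the median, and append it to a new dataframe. However, I have many large data files, and I wonder whether there is a faster way to achieve this.

My current code is below. It is not fast, and I think there must be a better way to do this with pandas/numpy (or something else?).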

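import pandas as pd

# The timestamp is read by position (column 1), as in the original code.
medians = []
a = 0  # first row of the current run of equal timestamps
for i in range(1, len(df1) + 1):
    # A run ends at the end of the frame or when the timestamp changes.
    if i == len(df1) or df1.iloc[i, 1] != df1.iloc[a, 1]:
        medians.append(df1.iloc[a:i].median())  # rows a .. i-1 inclusive
        a = i
# Build the result once; DataFrame.append was removed in pandas 2.0.
df_result = pd.DataFrame(medians)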

Best answer

You can use groupby here. Below, I build a simple dummy dataframe.

import pandas as pd
df1 = pd.DataFrame({'time': [1,1,1,1,1,1,2,2,2,2,2,2],
                   'angle' : [8,9,7,1,4,5,11,4,3,8,7,6]})

df1

    time  angle
0      1      8
1      1      9
2      1      7
3      1      1
4      1      4
5      1      5
6      2     11
7      2      4
8      2      3
9      2      8
10     2      7
11     2      6

Then we group by timestamp, take the median of the angle column within each group, and convert the result to a pandas DataFrame.

df2 =  pd.DataFrame(df1.groupby('time')['angle'].median())
df2 = df2.reset_index()
df2

   time  angle
0     1    6.0
1     2    6.5
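
As a hedged sketch of how this might look on data closer to the question, the example below uses real datetime timestamps with 20 readings per second. The sample data, the frame name `df`, and the variable names `per_second`, `run_id`, and `per_run` are assumptions made for illustration and do not come from the original post. It shows two variants: a plain groupby with `as_index=False` (which avoids the separate `reset_index()` call), and a grouping on consecutive runs of equal timestamps, which mirrors what the loop in the question does and only differs from the plain groupby when the same timestamp reappears later in the file.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 5 seconds of data, 20 angle readings per second.
rng = np.random.default_rng(0)
seconds = np.repeat(np.arange(5), 20)
times = pd.Timestamp("2023-01-01") + pd.to_timedelta(seconds, unit="s")
df = pd.DataFrame({"time": times, "angle": rng.uniform(0, 360, size=len(times))})

# Variant 1: one row per distinct timestamp.
# as_index=False keeps 'time' as a regular column in the result.
per_second = df.groupby("time", as_index=False)["angle"].median()

# Variant 2: one row per consecutive run of equal timestamps.
# The run id increases by one whenever the timestamp differs from the previous row.
run_id = (df["time"] != df["time"].shift()).cumsum().rename("run")
per_run = (
    df.groupby(run_id)
      .agg(time=("time", "first"), angle=("angle", "median"))
      .reset_index(drop=True)
)

print(per_second)
```

Both variants avoid the Python-level loop, so they should scale far better to frames with more than a million rows. On sorted data like this sample they give the same medians.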

Regarding python - Efficiency: Dropping rows with the same timestamp while still having the median of second column for that timestamp, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58990878/
