Python DataFrame: cumulative sum of column until condition is reached and return the index

Reposted by: 太空狗 | Updated: 2023-10-29 18:28:38

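I am new to Python and am currently facing a problem I cannot solve. I really hope you can help me. English is not my native language, so I apologize if I do not express myself correctly.

Suppose I have a simple DataFrame with two columns: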

index  Num_Albums  Num_authors
0      10          4
1      1           5
2      4           4
3      7           1000
4      1           44
5      3           8

Num_Albums_tot = sum(Num_Albums) = 30

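I need to accumulate the values in Num_Albums until a certain condition is reached, record the index at which the condition is met, and get the corresponding value from Num_authors.

Example: take the cumulative sum of Num_Albums until it equals 50% ± 1/15 of 30 (--> 15±2):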

10 = 15±2? No, then continue;
10+1 = 15±2? No, then continue;
10+1+4 = 15±2? Yes, stop.

The condition is reached at index 2. Then get Num_authors at that index: Num_authors(2) = 4

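Before I start thinking about how to implement this with a while/for loop, I would like to know whether pandas already has a function for it.

[I would like to be able to specify the column from which the value at that index is retrieved. This comes in handy when I have, for example, 4 columns: sum the elements of column 1 until the condition is met, then get the corresponding value from column 2; then do the same for columns 3 and 4.]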

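Best answer

Option 1:

You can compute the cumulative sum with cumsum, then use np.isclose with its built-in tolerance parameter (atol) to check whether the values in that series lie within the specified threshold of 15 +/- 2. This returns a boolean array.

np.flatnonzero then returns the positions at which the condition is True. We pick the first such position.

Finally, use .iloc to fetch the value of the column you need at the position computed above.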

import numpy as np

# position of the first cumulative sum within 15 +/- 2
val = np.flatnonzero(np.isclose(df.Num_Albums.cumsum().values, 15, atol=2))[0]
df['Num_authors'].iloc[val]      # for faster access, use .iat
4

The result of applying np.isclose to the series after converting it to an array:

np.isclose(df.Num_Albums.cumsum().values, 15, atol=2)
array([False, False,  True, False, False, False], dtype=bool)
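For a self-contained version of Option 1, the sketch below rebuilds the example frame from the question and runs the same three steps one at a time. The intermediate names (cum, mask, hits, pos) are illustrative and not from the original answer, and rtol=0 is passed on the assumption that only the absolute tolerance should count.

```python
import numpy as np
import pandas as pd

# Example data from the question
df = pd.DataFrame({
    "Num_Albums": [10, 1, 4, 7, 1, 3],
    "Num_authors": [4, 5, 4, 1000, 44, 8],
})

# Step 1: running total of the column
cum = df["Num_Albums"].cumsum().to_numpy()

# Step 2: boolean mask of totals within 15 +/- 2
# (rtol=0 so that only the absolute tolerance applies)
mask = np.isclose(cum, 15, rtol=0, atol=2)

# Step 3: positions where the mask is True; take the first one
hits = np.flatnonzero(mask)
pos = int(hits[0])

# Fetch the value from the other column by position
result = df["Num_authors"].iat[pos]
print(pos, result)
```

Note that hits[0] raises IndexError when no cumulative sum falls inside the tolerance, so real data may need a check on hits.size first.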

Option 2:

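Build a pd.Index from the cumsum series and look up the target with the nearest method, which also supports a tolerance parameter. On pandas 2.0 and later, pd.Index.get_loc no longer accepts method and tolerance, so pd.Index.get_indexer is used for the lookup, and .iat replaces df.get_value, which was removed in pandas 1.0.

# get_indexer returns positions, so the value is read with .iat
val = pd.Index(df.Num_Albums.cumsum()).get_indexer([15], method='nearest', tolerance=2)[0]
df['Num_authors'].iat[val]
4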
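One detail worth checking with the nearest-match lookup is what happens when no cumulative sum falls inside the tolerance. The sketch below uses pd.Index.get_indexer, which reports a miss as -1 rather than raising; the index values are the cumulative sums of the example column, and the target 19 is an assumed value chosen only to produce a miss.

```python
import pandas as pd

# Cumulative sums of Num_Albums from the example: 10, 11, 15, 22, 23, 26
cum_index = pd.Index([10, 11, 15, 22, 23, 26])

# 15 is present, so the nearest match is at position 2
hit = cum_index.get_indexer([15], method="nearest", tolerance=2)[0]

# 19 is 3 away from 22 and 4 away from 15, both outside the tolerance
miss = cum_index.get_indexer([19], method="nearest", tolerance=2)[0]

print(hit, miss)
```

Because -1 is also a valid position for .iat (the last row), it is safer to test for it explicitly before indexing. The nearest method also expects the index to be monotonic and unique, which holds for a cumulative sum of strictly positive values.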

Option 3:

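Use idxmax to find the first index label at which the boolean mask, built by applying sub, abs and le to the cumsum series, is True:

# df.at replaces df.get_value, which was removed in pandas 1.0
df.at[df.Num_Albums.cumsum().sub(15).abs().le(2).idxmax(), 'Num_authors']
4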
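The question also asks for a way to choose which column is summed and which column the value is read from. A minimal sketch of that, built on the Option 3 mask, could look like the following. The helper name value_where_cumsum_hits and its parameters are hypothetical, and returning None on a miss is an assumption made here because idxmax on an all-False mask would otherwise silently return the first label.

```python
import pandas as pd

def value_where_cumsum_hits(df, sum_col, value_col, target, tol):
    """Hypothetical helper: return df[value_col] at the first row where
    the cumulative sum of df[sum_col] is within target +/- tol,
    or None if no row qualifies."""
    mask = df[sum_col].cumsum().sub(target).abs().le(tol)
    if not mask.any():
        # idxmax would return the first label here, which would be misleading
        return None
    return df.at[mask.idxmax(), value_col]

# Example data from the question
df = pd.DataFrame({
    "Num_Albums": [10, 1, 4, 7, 1, 3],
    "Num_authors": [4, 5, 4, 1000, 44, 8],
})

found = value_where_cumsum_hits(df, "Num_Albums", "Num_authors", 15, 2)
not_found = value_where_cumsum_hits(df, "Num_Albums", "Num_authors", 100, 1)
print(found, not_found)
```

With four columns, the same helper can be called once per (sum column, value column) pair.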

Regarding "Python DataFrame: cumulative sum of column until condition is reached and return the index", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41488676/
