I'm working through Python for Data Analysis, and I'm having problems with part of the Ch. 9 (Data Aggregation and Group Operations) section on "Grouping with Functions."
Specifically, if I use the GroupBy object methods or, e.g., NumPy-defined functions, everything works fine. In particular, it ignores columns containing strings and operates only on the appropriate numeric columns. However, if I try to define my own function to compute some numeric output, it does not ignore the string columns, and it raises a TypeError.
Here's the example I'm having trouble with:
import numpy as np
from pandas import DataFrame

df = DataFrame({'data1': np.random.randn(5),
                'data2': np.random.randn(5),
                'key1': ['a', 'a', 'b', 'b', 'a'],
                'key2': ['one', 'two', 'one', 'two', 'one']})
It works fine if I type either of these (I have numpy imported as np):
df.groupby('key1').mean()
or
grouped = df.groupby('key1')
grouped.agg(np.mean)
But if I try these, I get errors ('peak_to_peak' is from the book):
def peak_to_peak(arr):
    return arr.max() - arr.min()

grouped.agg(peak_to_peak)
grouped.agg(lambda x: np.mean(x))
Trying 'peak_to_peak' gives me a long traceback that ends with:
TypeError: unsupported operand type(s) for -: 'str' and 'str'
Trying the lambda function with np.mean() gives me a long traceback that ends with:
TypeError: Could not convert onetwoone to numeric
Trying other user-defined functions produces similar errors. In all these cases, it's pretty clearly trying to apply peak_to_peak() or np.mean() (or whatever) to the (subsets of the) 'key2' column from df, whereas for the built-in methods and predefined functions, it (correctly) ignores the 'key2' column subsets.
Any insights would be appreciated.
Update: It turns out if I pass 'peak_to_peak' or the lambda function as lists (e.g., grouped.agg([peak_to_peak])), it works fine. Note that this is not how it's presented in the book, nor are lists required for predefined functions. So, it's still confusing, but at least it's functional, I guess.
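Another workaround (not from the book, just a sketch that sidesteps the problem): explicitly select the numeric columns from the GroupBy object before aggregating, so the custom function never sees the string column at all. The `peak_to_peak` function and the DataFrame below mirror the example above:

```python
import numpy as np
import pandas as pd

def peak_to_peak(arr):
    # Range (max minus min) of each group's values.
    return arr.max() - arr.min()

df = pd.DataFrame({'data1': np.random.randn(5),
                   'data2': np.random.randn(5),
                   'key1': ['a', 'a', 'b', 'b', 'a'],
                   'key2': ['one', 'two', 'one', 'two', 'one']})

# Selecting only the numeric columns keeps 'key2' away from the
# custom function entirely, so the plain (non-list) form works.
result = df.groupby('key1')[['data1', 'data2']].agg(peak_to_peak)
```

This also makes the intent explicit in the code, rather than relying on pandas to silently drop the string column.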
What version of pandas are you using? On the latest master, for .agg(lambda x: np.mean(x)) I get NaNs back in the key2 column. The documentation on agg doesn't mention this at all, and it should. Care to open an issue on GitHub about this?
I've got pandas 0.13.1 (and NumPy 1.7.1 and Python 2.7.6, for what those are worth). I didn't see any NaNs in mine... I'll look into opening an issue on GitHub. Thanks for the response.
This was a regression from prior to 0.13, not sure exactly when (the book is based on about 0.10, IIRC); fixed here: github.com/pydata/pandas/pull/6338. It should essentially ignore that column (and was just not catching the error).
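For anyone who wants a version-independent way to avoid the issue rather than depending on pandas silently ignoring the column, one hedged sketch is to filter to numeric dtypes first with `select_dtypes` and group by the original key Series:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'data1': np.random.randn(5),
                   'data2': np.random.randn(5),
                   'key1': ['a', 'a', 'b', 'b', 'a'],
                   'key2': ['one', 'two', 'one', 'two', 'one']})

# select_dtypes keeps only the numeric columns; grouping by the
# original df['key1'] Series still lines up row-for-row, even
# though 'key1' itself was dropped from the filtered frame.
numeric = df.select_dtypes(include=[np.number])
result = numeric.groupby(df['key1']).agg(lambda x: np.mean(x))
```

This way the custom function is only ever handed numeric data, so no error-swallowing behavior in .agg() is needed.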