python - ValueError: invalid fill value with a <class 'pandas.core.frame.DataFrame'>

Reposted by: 行者123  Updated: 2023-11-30 08:54:06

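I am working through the Loan Prediction practice problem and trying to fill in the missing values in the data. I got the data from here. To work through the problem, I am following this tutorial.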

You can find the complete code I am using (file name model.py) and the data here on GitHub.

The DataFrame looks like this:

df[['Loan_ID', 'Self_Employed', 'Education', 'LoanAmount']].head(10)
Out: 
    Loan_ID Self_Employed     Education  LoanAmount
0  LP001002            No      Graduate         NaN
1  LP001003            No      Graduate       128.0
2  LP001005           Yes      Graduate        66.0
3  LP001006            No  Not Graduate       120.0
4  LP001008            No      Graduate       141.0
5  LP001011           Yes      Graduate       267.0
6  LP001013            No  Not Graduate        95.0
7  LP001014            No      Graduate       158.0
8  LP001018            No      Graduate       168.0
9  LP001020            No      Graduate       349.0

After running the last line below (it corresponds to line 60 of model.py):

url = 'https://raw.githubusercontent.com/Aniruddh-SK/Loan-Prediction-Problem/master/train.csv'
df = pd.read_csv(url) 
df['LoanAmount'].fillna(df['LoanAmount'].mean(), inplace=True)
df['Self_Employed'].fillna('No',inplace=True)

table = df.pivot_table(values='LoanAmount', index='Self_Employed' ,columns='Education', aggfunc=np.median)
# Define function to return value of this pivot_table
def fage(x):
 return table.loc[x['Self_Employed'],x['Education']]
# Replace missing values
df['LoanAmount'].fillna(df[df['LoanAmount'].isnull()].apply(fage, axis=1), inplace=True)

I get this error:

ValueError                                Traceback (most recent call last)
<ipython-input-40-5146e49c2460> in <module>()
----> 1 df['LoanAmount'].fillna(df[df['LoanAmount'].isnull()].apply(fage, axis=1), inplace=True)

/usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in fillna(self, value, method, axis, inplace, limit, downcast, **kwargs)
   2368                                           axis=axis, inplace=inplace,
   2369                                           limit=limit, downcast=downcast,
-> 2370                                           **kwargs)
   2371 
   2372     @Appender(generic._shared_docs['shift'] % _shared_doc_kwargs)

/usr/local/lib/python2.7/dist-packages/pandas/core/generic.pyc in fillna(self, value, method, axis, inplace, limit, downcast)
   3264                 else:
   3265                     raise ValueError("invalid fill value with a %s" %
-> 3266                                      type(value))
   3267 
   3268                 new_data = self._data.fillna(value=value, limit=limit,

ValueError: invalid fill value with a <class 'pandas.core.frame.DataFrame'>

How can I fill the missing values without getting this error?

Best answer

It looks like the tutorial's author wanted to replace NaN with the values from table.

But you first need to turn table into a Series with unstack, and use set_index on df so that the data can be aligned.

First, remove the line that replaces NaN with the mean:

import numpy as np
import pandas as pd

url = 'https://raw.githubusercontent.com/Aniruddh-SK/Loan-Prediction-Problem/master/train.csv'

df = pd.read_csv(url)  # read the dataset into a DataFrame using pandas

# df['LoanAmount'].fillna(df['LoanAmount'].mean(), inplace=True)

# assign the result back instead of using inplace=True on a selected column
df['Self_Employed'] = df['Self_Employed'].fillna('No')
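To see why the mean fill has to go, here is a minimal sketch on a hypothetical four-row frame (the toy data and the names `missing` and `result` are illustrative assumptions, not part of the original code). Once the mean fill has run, no NaN is left, so the filtered frame is empty, and `apply` on an empty frame is expected to hand back a DataFrame rather than a Series of fill values, which matches the type named in the error message:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the loan dataset
df = pd.DataFrame({
    'Self_Employed': ['No', 'Yes', 'No', 'Yes'],
    'Education': ['Graduate', 'Graduate', 'Not Graduate', 'Graduate'],
    'LoanAmount': [100.0, np.nan, 120.0, 140.0],
})

# The mean fill from the question removes every NaN up front
df['LoanAmount'] = df['LoanAmount'].fillna(df['LoanAmount'].mean())

table = df.pivot_table(values='LoanAmount', index='Self_Employed',
                       columns='Education', aggfunc='median')

def fage(x):
    # look up the group median for one row
    return table.loc[x['Self_Employed'], x['Education']]

# Nothing is missing any more, so this selects zero rows
missing = df[df['LoanAmount'].isnull()]
result = missing.apply(fage, axis=1)

print(len(missing))            # number of rows left to fill
print(type(result).__name__)   # type that would be passed to fillna
```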

table = df.pivot_table(values='LoanAmount', 
                       index='Self_Employed', 
                       columns='Education', 
                       aggfunc=np.median)

print (table.unstack())
Education     Self_Employed
Graduate      No               130.0
              Yes              157.5
Not Graduate  No               113.0
              Yes              130.0
dtype: float64

#check all values with NaN in LoanAmount column
print (df.loc[df['LoanAmount'].isnull(), ['Self_Employed','Education', 'LoanAmount']])
    Self_Employed     Education  LoanAmount
0              No      Graduate         NaN
35             No      Graduate         NaN
63             No      Graduate         NaN
81            Yes      Graduate         NaN
95             No      Graduate         NaN
102            No      Graduate         NaN
103            No      Graduate         NaN
113           Yes      Graduate         NaN
127            No      Graduate         NaN
202            No  Not Graduate         NaN
284            No      Graduate         NaN
305            No  Not Graduate         NaN
322            No  Not Graduate         NaN
338            No  Not Graduate         NaN
387            No  Not Graduate         NaN
435            No      Graduate         NaN
437            No      Graduate         NaN
479            No      Graduate         NaN
524            No      Graduate         NaN
550           Yes      Graduate         NaN
551            No  Not Graduate         NaN
605            No  Not Graduate         NaN

# for checking later, keep the index labels of the rows where LoanAmount is NaN
idx = df.loc[df['LoanAmount'].isnull(), ['Self_Employed','Education', 'LoanAmount']].index
print (idx)
Int64Index([  0,  35,  63,  81,  95, 102, 103, 113, 127, 202, 284, 305, 322,
            338, 387, 435, 437, 479, 524, 550, 551, 605],
           dtype='int64')

# Replace missing values
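df = df.set_index(['Education','Self_Employed'])
# fillna aligns table.unstack() on the (Education, Self_Employed) index;
# assign the result back instead of using inplace=True on a selected column
df['LoanAmount'] = df['LoanAmount'].fillna(table.unstack())
df = df.reset_index()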

# check the output: show only the rows that were NaN before
print (df.loc[df.index.isin(idx), ['Self_Employed','Education', 'LoanAmount']])
    Self_Employed     Education  LoanAmount
0              No      Graduate       130.0
35             No      Graduate       130.0
63             No      Graduate       130.0
81            Yes      Graduate       157.5
95             No      Graduate       130.0
102            No      Graduate       130.0
103            No      Graduate       130.0
113           Yes      Graduate       157.5
127            No      Graduate       130.0
202            No  Not Graduate       113.0
284            No      Graduate       130.0
305            No  Not Graduate       113.0
322            No  Not Graduate       113.0
338            No  Not Graduate       113.0
387            No  Not Graduate       113.0
435            No      Graduate       130.0
437            No      Graduate       130.0
479            No      Graduate       130.0
524            No      Graduate       130.0
550           Yes      Graduate       157.5
551            No  Not Graduate       113.0
605            No  Not Graduate       113.0
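The alignment idea can also be tried without downloading the dataset. The following is a minimal sketch on hypothetical toy data (the values and the names `lookup`, `indexed` and `filled` are assumptions for illustration). It relies on the same three steps: pivot to group medians, `unstack` to a Series keyed by (Education, Self_Employed), and `set_index` so that `fillna` can align on those keys:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the loan dataset
df = pd.DataFrame({
    'Self_Employed': ['No', 'No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'No'],
    'Education': ['Graduate'] * 6 + ['Not Graduate'] * 2,
    'LoanAmount': [100.0, 140.0, np.nan, 200.0, 300.0, np.nan, 90.0, np.nan],
})

# Median LoanAmount for each Self_Employed / Education combination
table = df.pivot_table(values='LoanAmount', index='Self_Employed',
                       columns='Education', aggfunc='median')

# Series with a MultiIndex of (Education, Self_Employed)
lookup = table.unstack()

# Give the frame the same two-level index so fillna can align on it
indexed = df.set_index(['Education', 'Self_Employed'])
indexed['LoanAmount'] = indexed['LoanAmount'].fillna(lookup)
filled = indexed.reset_index()

print(filled)
```

Note that `reset_index` moves Education and Self_Employed to the front of the column order; the row order is unchanged.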

Edit:

A better solution is groupby with apply, where NaN is replaced with the median of each group:

url='https://raw.githubusercontent.com/Aniruddh-SK/Loan-Prediction-Problem/master/train.csv'

df = pd.read_csv(url) #Reading the dataset in a dataframe using Pandas

#df['LoanAmount'].fillna(df['LoanAmount'].mean(), inplace=True)

df['Self_Employed'] = df['Self_Employed'].fillna('No')


print (df.loc[df['LoanAmount'].isnull(), ['Self_Employed','Education', 'LoanAmount']])
    Self_Employed     Education  LoanAmount
0              No      Graduate         NaN
35             No      Graduate         NaN
63             No      Graduate         NaN
81            Yes      Graduate         NaN
95             No      Graduate         NaN
102            No      Graduate         NaN
103            No      Graduate         NaN
113           Yes      Graduate         NaN
127            No      Graduate         NaN
202            No  Not Graduate         NaN
284            No      Graduate         NaN
305            No  Not Graduate         NaN
322            No  Not Graduate         NaN
338            No  Not Graduate         NaN
387            No  Not Graduate         NaN
435            No      Graduate         NaN
437            No      Graduate         NaN
479            No      Graduate         NaN
524            No      Graduate         NaN
550           Yes      Graduate         NaN
551            No  Not Graduate         NaN
605            No  Not Graduate         NaN

idx = df.loc[df['LoanAmount'].isnull(), ['Self_Employed','Education', 'LoanAmount']].index
print (idx)
Int64Index([  0,  35,  63,  81,  95, 102, 103, 113, 127, 202, 284, 305, 322,
            338, 387, 435, 437, 479, 524, 550, 551, 605],
           dtype='int64')

# Replace missing values
# parentheses let the method chain continue onto the next line
df['LoanAmount'] = (df.groupby(['Education','Self_Employed'])['LoanAmount']
                      .apply(lambda x: x.fillna(x.median())))

print (df.loc[df.index.isin(idx), ['Self_Employed','Education', 'LoanAmount']])
    Self_Employed     Education  LoanAmount
0              No      Graduate       130.0
35             No      Graduate       130.0
63             No      Graduate       130.0
81            Yes      Graduate       157.5
95             No      Graduate       130.0
102            No      Graduate       130.0
103            No      Graduate       130.0
113           Yes      Graduate       157.5
127            No      Graduate       130.0
202            No  Not Graduate       113.0
284            No      Graduate       130.0
305            No  Not Graduate       113.0
322            No  Not Graduate       113.0
338            No  Not Graduate       113.0
387            No  Not Graduate       113.0
435            No      Graduate       130.0
437            No      Graduate       130.0
479            No      Graduate       130.0
524            No      Graduate       130.0
550           Yes      Graduate       157.5
551            No  Not Graduate       113.0
605            No  Not Graduate       113.0
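Depending on the pandas version, `apply` on a grouped column can prepend the group keys to the index of its result, which may stop the assignment from lining up with df. A variant that sidesteps this, swapped in here as an alternative rather than taken from the original answer, is `transform`, which returns a result indexed like the input. A minimal sketch on hypothetical toy data:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the loan dataset
df = pd.DataFrame({
    'Self_Employed': ['No', 'No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'No'],
    'Education': ['Graduate'] * 6 + ['Not Graduate'] * 2,
    'LoanAmount': [100.0, 140.0, np.nan, 200.0, 300.0, np.nan, 90.0, np.nan],
})

# Fill each group's NaN with that group's median; median() skips NaN
df['LoanAmount'] = (
    df.groupby(['Education', 'Self_Employed'])['LoanAmount']
      .transform(lambda s: s.fillna(s.median()))
)

print(df)
```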

Edit:

There is one more problem:

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

The solution is to replace the NaNs:

df['Loan_Status'] = df['Loan_Status'].fillna('No')
df['Credit_History'] = df['Credit_History'].fillna(0)

outcome_var = 'Loan_Status'
model = LogisticRegression()
predictor_var = ['Credit_History']

classification_model(model, df, predictor_var,outcome_var)
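Before fitting, it can help to check which of the columns passed to the model still contain NaN. A minimal sketch, assuming a hypothetical toy frame with the same two column names (the values and the names `cols` and `missing_counts` are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data using the column names from the answer
df = pd.DataFrame({
    'Credit_History': [1.0, np.nan, 0.0, 1.0],
    'Loan_Status': ['Y', 'N', None, 'Y'],
})

cols = ['Credit_History', 'Loan_Status']

# Count missing values per column and show only the columns that have any
missing_counts = df[cols].isnull().sum()
print(missing_counts[missing_counts > 0])
```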

Regarding python - ValueError: invalid fill value with a <class 'pandas.core.frame.DataFrame'>, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44450725/
