
python - sklearn Imputer() returns features that do not fit the fit function

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:55:32

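I have a feature matrix that contains missing values (NaN), so I need to impute those values first. However, the last line of the code below fails with this error: Expected sequence or array-like, got Imputer(axis=0, copy=True, missing_values='NaN', strategy='mean', verbose=0). After checking, the cause seems to be that train_fea_imputed is not an np.array but a sklearn.preprocessing.imputation.Imputer object. How can I fix this?
By the way, if I use train_fea_imputed = imp.fit_transform(train_fea), the code runs fine, but train_fea_imputed comes back as an array with one dimension fewer than train_fea.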

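    import pandas as pd
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.preprocessing import Imputer

    # train_fea (features containing NaN) and train_label are loaded elsewhere
    imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
    # fit() returns the Imputer object itself, not the imputed array
    train_fea_imputed = imp.fit(train_fea)

    # train_fea_imputed = imp.fit_transform(train_fea)
    rf = RandomForestClassifier(n_estimators=5000, n_jobs=1, min_samples_leaf=3)
    # this call raises the error, because train_fea_imputed is an Imputer
    rf.fit(train_fea_imputed, train_label)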

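Update: I changed it to

    imp = Imputer(missing_values='NaN', strategy='mean', axis=1)

and the dimension problem no longer occurs. I think the imputation function has some inherent problem. I will come back to this after I finish the project.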
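
One possible explanation for the shrinking array, offered as an assumption because the original data is not shown: with axis=0, the Imputer documentation states that columns containing only missing values at fit time are discarded on transform, so the output has fewer columns than the input. The sketch below reproduces that behavior with SimpleImputer from sklearn.impute, which replaced Imputer in later scikit-learn releases. The matrix X is made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical data: the middle column has no observed values at all.
X = np.array([[1.0, np.nan, 0.0],
              [0.0, np.nan, 1.0],
              [1.0, np.nan, 0.0]])

# Find columns that are entirely NaN before imputing.
empty_cols = np.isnan(X).all(axis=0)

# The mean of an all-NaN column is undefined, so by default the imputer
# drops that column instead of filling it.
imp = SimpleImputer(missing_values=np.nan, strategy="mean")
X_imputed = imp.fit_transform(X)

print("input shape:", X.shape)
print("output shape:", X_imputed.shape)
print("columns with no observed values:", np.flatnonzero(empty_cols))
```

If this is what happens in your data, checking np.isnan(train_fea).all(axis=0) before imputing shows which columns are affected, and those columns can be dropped or filled with a constant explicitly.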

Best answer

With scikit-learn, initializing the model, training the model, and getting the predictions are separate steps. In your case you have:

    train_fea = np.array([[1,1,0],[0,0,1],[1,np.nan,0]])
    train_fea
    array([[  1.,   1.,   0.],
           [  0.,   0.,   1.],
           [  1.,  nan,   0.]])

    # initialise the model
    imp = Imputer(missing_values='NaN', strategy='mean', axis=0)

    # train the model
    imp.fit(train_fea)

    # get the predictions
    train_fea_imputed = imp.transform(train_fea)
    train_fea_imputed
    array([[ 1. ,  1. ,  0. ],
           [ 0. ,  0. ,  1. ],
           [ 1. ,  0.5,  0. ]])
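
The Imputer class used above was deprecated in scikit-learn 0.20 and removed in 0.22. On a recent version, the same three steps can be sketched with SimpleImputer from sklearn.impute, which always imputes column-wise and takes no axis argument. The labels in train_label and the small n_estimators value are assumptions made only to keep the example self-contained and fast.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer

train_fea = np.array([[1, 1, 0], [0, 0, 1], [1, np.nan, 0]])
train_label = np.array([0, 1, 0])  # hypothetical labels, not from the question

# initialise the model
imp = SimpleImputer(missing_values=np.nan, strategy="mean")

# train the model: fit() learns the column means and returns the imputer itself
fitted = imp.fit(train_fea)

# get the imputed array: transform() is what returns a NumPy array
train_fea_imputed = imp.transform(train_fea)

# the imputed array, not the imputer object, is what the classifier expects
rf = RandomForestClassifier(n_estimators=10, random_state=0)
rf.fit(train_fea_imputed, train_label)
```

Calling imp.fit_transform(train_fea) combines the fit and transform steps and also returns the imputed array.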

Regarding "python - sklearn Imputer() returns features that do not fit the fit function", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30584543/
