
Python Sklearn - RandomForest and missing values

Reposted. Author: 太空狗. Updated: 2023-10-30 01:22:10

I am trying to run a RandomForest on a dataset that contains missing values.

My dataset looks like this:

train_data = [['1' 'NaN' 'NaN' '0.0127034' '0.0435092']
['1' 'NaN' 'NaN' '0.0113187' '0.228205']
['1' '0.648' '0.248' '0.0142176' '0.202707']
...,
['1' '0.357' '0.470' '0.0328121' '0.255039']
['1' 'NaN' 'NaN' '0.00311825' '0.0381745']
['1' 'NaN' 'NaN' '0.0332604' '0.2857']]

To impute the 'NaN' values, I use:

from sklearn.preprocessing import Imputer

imp=Imputer(missing_values='NaN',strategy='mean',axis=0)
imp.fit(train_data[0::,1::])
new_train_data=imp.transform(train_data)

But I get the following error:

Traceback (most recent call last):
  File "./RandomForest.py", line 72, in <module>
    new_train_data=imp.transform(train_data)
  File "/home/aurore/.local/lib/python2.7/site-packages/sklearn/preprocessing/imputation.py", line 388, in transform
    values = np.repeat(valid_statistics, n_missing)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 343, in repeat
    return repeat(repeats, axis)
ValueError: a.shape[axis] != len(repeats)

I then tried:

new_train_data = imp.fit_transform(train_data)

and then I get this error:

Traceback (most recent call last):
  File "./RandomForest.py", line 82, in <module>
    forest = forest.fit(train_data[0::,1::],train_data[0::,0])
  File "/home/aurore/.local/lib/python2.7/site-packages/sklearn/ensemble/forest.py", line 224, in fit
    X, = check_arrays(X, dtype=DTYPE, sparse_format="dense")
  File "/home/aurore/.local/lib/python2.7/site-packages/sklearn/utils/validation.py", line 283, in check_arrays
    _assert_all_finite(array)
  File "/home/aurore/.local/lib/python2.7/site-packages/sklearn/utils/validation.py", line 43, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

Is there a problem with the package? What does this mean? Can someone help me?

Best answer

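You fit the imputer on the columns 1:: and then try to apply it to all columns. That cannot work, because the number of columns seen in fit and in transform differs. Do this instead: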

new_train_data = imp.fit_transform(train_data)
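The following is a minimal sketch of the same idea end to end, not the asker's actual script. It assumes a recent scikit-learn, where the old sklearn.preprocessing.Imputer no longer exists (it was removed in version 0.22) and sklearn.impute.SimpleImputer takes its place. The toy array, the 0/1 labels and the names X, y and X_imputed are hypothetical. Two points are worth checking in the original script: the imputer should be fitted and applied on the same set of columns, and the forest should be trained on the imputed array. The second traceback shows forest.fit still being called on train_data rather than on new_train_data, which would explain the remaining NaN error.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer

# Hypothetical toy data in the same layout as the question:
# column 0 is the label, the remaining columns are features stored as strings.
train_data = np.array([
    ['1', 'NaN', 'NaN', '0.0127034', '0.0435092'],
    ['0', '0.648', '0.248', '0.0142176', '0.202707'],
    ['1', '0.357', '0.470', '0.0328121', '0.255039'],
    ['0', 'NaN', 'NaN', '0.0332604', '0.2857'],
])

# Convert the strings to floats; the string 'NaN' becomes np.nan.
data = train_data.astype(float)
y = data[:, 0].astype(int)   # labels
X = data[:, 1:]              # features only

# Fit and transform on the SAME columns, so the shapes always match.
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
X_imputed = imputer.fit_transform(X)

# Train on the imputed features, not on the original array.
forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X_imputed, y)
```

When test data is imputed later, reusing the fitted imputer with imputer.transform on the same feature columns keeps the training means, rather than refitting on the test set.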

Regarding Python Sklearn - RandomForest and missing values, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25481066/
