
python - How to convert the features selected by a random forest into a new list

Reposted. Author: 行者123. Updated: 2023-11-30 08:56:37

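I am working on a regression problem. For my model, I use a random forest classifier for dimensionality reduction. The output is a space-separated string of boolean values that marks the good features as "True". It looks like this: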

[ True  True  True  True  True  True  True  True  True  True  True  True
True True True False True True False True True True False True
True True True True True True True False True False False True
True False False False False False False False False False False True
False False True False False False False False False True False False
False True False False False False False False False False False False
False False False False False False False False False False False False
False False False False False False False False False False False False
False False False False False False False False False False False False
False False False False False False True False False True False False
False True False True False False False False False False False False
False False False False False False False False False False False False
False True False False False False False False False False True False
False False False False False True False False False True True False
False False False False False False False False False False False False
False False False False False False True False False False False False
False False True False False True False True False True False False
True False False False False False False False False False False False
False False False True False True False True False False False False
False False False False False True True False False False False False
False False False False True False True True False True False False
False False False True True True False False False False False False
False False False False False False False False False False False False
False False False False False False True False False False False False
False False False False False False False False True False False False
False True False]

So I converted it into a comma-separated list, like this:

[True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, False, True, True, True, False, True, True, True, True, True, True, True, True, False, True, False, False, True, True, False, False, False, False, False, False, False, False, False, False, True, False, False, True, False, False, False, False, False, False, True, False, False, False, True, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, True, False, False, False, True, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, True, False, False, False, True, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, True, False, False, True, False, True, False, True, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, True, False, True, False, False, False, False, False, False, False, False, False, True, True, False, False, False, False, False, False, False, False, False, True, False, True, True, False, True, False, False, False, False, False, True, True, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, 
False, False, False, False, False, False, False, True, False]

Then I loop over each element and retrieve the corresponding column of test. Here is the complete code for this process:

sel = SelectFromModel(RandomForestClassifier(n_estimators = 100), threshold = '1.25*mean')
sel.fit(x_train, y_train)

selected = sel.get_support()
selected_list = list(selected)
columns_list = []

for i in range(len(selected_list)):
    if(selected_list[i] == 'True'):
        columns_list.append(test[i])

print(columns_list)

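But now I get an empty list, even though I try to append to columns_list. Basically, my goal is to use the reduced set of columns in my predictions. I am using linear regression for this problem.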

Update

When I change the code as suggested below, I get the following error:

Traceback (most recent call last):
  File "/opt/anaconda/envs/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2890, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/onur/Documents/Boston-Kaggle/Model.py", line 100, in <module>
    columns_list.append(test[i])
  File "/opt/anaconda/envs/lib/python3.7/site-packages/pandas/core/frame.py", line 2975, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/opt/anaconda/envs/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2892, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0

Best answer

Your problem is here:

if(selected_list[i] == 'True'):
    columns_list.append(test[i])

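You are comparing a boolean value with the string 'True' instead of the boolean True. A boolean is never equal to a string, so the condition is always false and nothing is ever appended, which is why the list stays empty.

A compact and Pythonic solution is: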

if selected_list[i]:
    columns_list.append(test[i])
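The difference between the two conditions can be seen in a small, self-contained sketch. The mask and the column names below are made up for illustration and are not taken from the question's data.

```python
# Hypothetical mask and column names, for illustration only.
selected_list = [True, False, True, True]
column_names = ["a", "b", "c", "d"]

# Buggy condition: a bool never equals the string 'True', so nothing is kept.
buggy = [name for name, keep in zip(column_names, selected_list) if keep == 'True']

# Fixed condition: test the truth value of the flag directly.
fixed = [name for name, keep in zip(column_names, selected_list) if keep]

print(buggy)  # []
print(fixed)  # ['a', 'c', 'd']
```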

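As for the second error, that happens because you are accessing the data frame test with []. On a DataFrame, test[i] looks up a column by its label, so test[0] raises KeyError: 0 when no column is labeled 0. To select by position, you need to use the .iloc method.

As for usage, it depends on what test contains: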

test.iloc[0] # first row of data frame; note that the output is a Series
test.iloc[1] # second row of data frame
test.iloc[-1] # last row of data frame
# Columns:
test.iloc[:,0] # first column of data frame
test.iloc[:,1] # second column of data frame
test.iloc[:,-1] # last column of data frame
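A minimal sketch of why [] fails here while .iloc works, assuming the columns of test carry string labels (the frame and its column names below are hypothetical, since the question does not show the real data):

```python
import pandas as pd

# Hypothetical frame with string column labels, made up for illustration.
test = pd.DataFrame({"rooms": [3, 4], "area": [70.0, 95.5], "age": [12, 5]})

try:
    test[0]  # label lookup: there is no column labeled 0
except KeyError as err:
    print("KeyError:", err)

# Positional lookup: the first column, whatever its label is.
first_col = test.iloc[:, 0]
print(first_col.name)  # rooms
```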

Edit, a more explicit solution:

columns_selected = test.iloc[:, [i for i in range(len(selected_list)) if selected_list[i]]]
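As an alternative sketch, the boolean array returned by get_support() can be passed to .loc directly, without building a list of positions first. Everything about the data below is an assumption: x_train, y_train and test are synthetic stand-ins generated with make_regression, the column names f0 to f9 are made up, and RandomForestRegressor is swapped in for the question's classifier because the synthetic target is continuous.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data; the question's real frames are not shown.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3, random_state=0)
cols = [f"f{i}" for i in range(X.shape[1])]  # hypothetical column names
x_train = pd.DataFrame(X[:150], columns=cols)
y_train = y[:150]
test = pd.DataFrame(X[150:], columns=cols)

# Assumption of this sketch: a regressor, since the target is continuous.
sel = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold="1.25*mean",
)
sel.fit(x_train, y_train)

selected = sel.get_support()              # boolean NumPy array, one flag per column
columns_selected = test.loc[:, selected]  # the mask keeps only the selected columns
print(list(columns_selected.columns))

# Fit the linear model on the same columns of the training data, then predict.
model = LinearRegression().fit(x_train.loc[:, selected], y_train)
predictions = model.predict(columns_selected)
```

sel.transform(test) returns the same selected columns as a plain NumPy array, which may be enough when the column labels are not needed afterwards.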

Regarding "python - How to convert the features selected by a random forest into a new list", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57931454/
