python - Python : ValueError: could not convert string to float: 'Isolated' when reading input file for applying random forest

Reposted by: 行者123. Last updated: 2023-12-02 10:56:26

I am trying to apply a random forest to the following input file:

gold,Program,Requirement,MethodType,Top,Side,CallersT,CallersN,CallersU,CallersCallersT,CallersCallersN,CallersCallersU,CalleesT,CalleesN,CalleesU,CalleesCalleesT,CalleesCalleesN,CalleesCalleesU
T,chess,1,Inner,T,T,Low,-1,-1,Low,-1,-1,High,-1,-1,-1,-1,Low,
N,chess,2,Inner,N,N,-1,Low,-1,-1,Low,-1,-1,High,-1,-1,-1,Low,
N,chess,3,Inner,N,N,-1,Low,-1,-1,Low,-1,-1,High,-1,-1,-1,Low,
N,chess,4,Root,N,N,-1,Low,-1,-1,Low,-1,-1,High,-1,-1,-1,Low,
N,chess,5,Inner,N,N,-1,Low,-1,-1,Low,-1,-1,High,-1,-1,-1,Low,
N,chess,6,Root,N,N,-1,Low,-1,-1,Low,-1,-1,High,-1,-1,-1,Low,
N,chess,7,Inner,N,N,-1,Low,-1,-1,Low,-1,-1,High,-1,-1,-1,Low,
N,chess,8,Inner,N,N,-1,Low,-1,-1,Low,-1,-1,High,-1,-1,-1,Low,
N,chess,1,Leaf,NU,N,-1,Low,-1,-1,Low,-1,-1,Medium,Medium,-1,High,High,
N,chess,2,Leaf,NU,N,-1,Low,-1,-1,Low,-1,-1,Medium,Medium,-1,High,High,
N,chess,3,Leaf,NU,N,-1,Low,-1,-1,Low,-1,-1,Medium,Medium,-1,High,High,
N,chess,4,Root,NU,N,-1,Low,-1,-1,Low,-1,-1,Medium,Medium,-1,High,High,
N,chess,5,Isolated,NU,N,-1,Low,-1,-1,Low,-1,-1,Medium,Medium,-1,High,High,
T,chess,6,Inner,TU,T,Low,-1,-1,Low,-1,-1,Medium,-1,Medium,High,-1,High,
T,chess,7,Isolated,TU,T,Low,-1,-1,Low,-1,-1,Medium,-1,Medium,High,-1,High,
N,chess,8,Inner,NU,N,-1,Low,-1,-1,Low,-1,-1,Medium,Medium,-1,High,High,
N,chess,1,Inner,TNU,N,-1,Low,-1,-1,-1,-1,Low,Low,High,Medium,-1,Medium,
N,chess,2,Inner,NU,N,-1,Low,-1,-1,-1,-1,-1,Medium,High,Low,Low,Medium,
N,chess,3,Inner,NU,N,-1,Low,-1,-1,-1,-1,-1,Medium,High,-1,Medium,Medium,
T,chess,4,Inner,NU,N,-1,Low,-1,-1,-1,-1,-1,Medium,High,Low,Low,Medium,
N,chess,5,Leaf,NU,N,-1,Low,-1,-1,-1,-1,-1,Medium,High,-1,Medium,Medium,

This is the code I use to apply the random forest:

import pandas as pd
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
# Feature Scaling
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

X_train={}
X_test={}
y_train={}
y_test={}
dataset = pd.read_csv('dataExtended2.txt', sep=',')
# Encode each categorical column as integer codes (for gold: N -> 0, T -> 1)
dataset['gold'] = dataset['gold'].astype('category').cat.codes
dataset['Program'] = dataset['Program'].astype('category').cat.codes
dataset['MethodType'] = dataset['MethodType'].astype('category').cat.codes
dataset['Top'] = dataset['Top'].astype('category').cat.codes
dataset['Side'] = dataset['Side'].astype('category').cat.codes
dataset['CallersT'] = dataset['CallersT'].astype('category').cat.codes
dataset['CallersN'] = dataset['CallersN'].astype('category').cat.codes
dataset['CallersU'] = dataset['CallersU'].astype('category').cat.codes
dataset['CallersCallersT'] = dataset['CallersCallersT'].astype('category').cat.codes
dataset['CallersCallersN'] = dataset['CallersCallersN'].astype('category').cat.codes
dataset['CallersCallersU'] = dataset['CallersCallersU'].astype('category').cat.codes
dataset['CalleesT'] = dataset['CalleesT'].astype('category').cat.codes
dataset['CalleesN'] = dataset['CalleesN'].astype('category').cat.codes
dataset['CalleesU'] = dataset['CalleesU'].astype('category').cat.codes
dataset['CalleesCalleesT'] = dataset['CalleesCalleesT'].astype('category').cat.codes
dataset['CalleesCalleesN'] = dataset['CalleesCalleesN'].astype('category').cat.codes
dataset['CalleesCalleesU'] = dataset['CalleesCalleesU'].astype('category').cat.codes
pd.set_option('display.max_columns', None)

print(dataset.head())
row_count, column_count = dataset.shape
   
X = dataset.iloc[:, 1:column_count].values
y = dataset.iloc[:, 0].values
Xcol = dataset.iloc[:, 1:column_count]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
   
sc = StandardScaler()
X_train = sc.fit_transform(X_train)

I get an error (ValueError: could not convert string to float: 'Isolated') when I execute the last line of the code, X_train = sc.fit_transform(X_train), even though I use the line dataset['MethodType'] = dataset['MethodType'].astype('category').cat.codes to convert MethodType from strings to numbers. How can I fix this?
This is the error traceback:

Traceback (most recent call last):

  File "<ipython-input-38-d7fe5c294c10>", line 1, in <module>
    runfile('C:/Users/mouna/ownCloud/Mouna Hammoudi/dumps/Python/RandomForestSimplified.py', wdir='C:/Users/mouna/ownCloud/Mouna Hammoudi/dumps/Python')

  File "C:\Users\mouna\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 668, in runfile
    execfile(filename, namespace)

  File "C:\Users\mouna\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/mouna/ownCloud/Mouna Hammoudi/dumps/Python/RandomForestSimplified.py", line 43, in <module>
    X_train = sc.fit_transform(X_train)

  File "C:\Users\mouna\Anaconda3\lib\site-packages\sklearn\base.py", line 517, in fit_transform
    return self.fit(X, **fit_params).transform(X)

  File "C:\Users\mouna\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py", line 590, in fit
    return self.partial_fit(X, y)

  File "C:\Users\mouna\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py", line 612, in partial_fit
    warn_on_dtype=True, estimator=self, dtype=FLOAT_DTYPES)

  File "C:\Users\mouna\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 433, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)

ValueError: could not convert string to float: 'Isolated'

Best answer

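If you look at the output of your code (print(dataset.head())), you will see that "gold" is no longer a real data column: its values (T, N) have become the row labels, and every header name sits one column to the left of its data. This happens because each data row ends with a trailing comma and therefore has one more field than the header, which makes pandas use the first column as the index. As a result, the MethodType strings land under Requirement, which the code never encodes, so they are still strings when StandardScaler tries to convert them to float.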

     gold  Program Requirement  MethodType  Top  Side  CallersT  CallersN  \
T     0        0       Inner           2    1     1         0         0
N     0        1       Inner           0    0     0         1         0
N     0        2       Inner           0    0     0         1         0
N     0        3        Root           0    0     0         1         0
N     0        4       Inner           0    0     0         1         0

   CallersU  CallersCallersT  CallersCallersN  CallersCallersU  CalleesT  \
T         1                0                0                1         0
N         0                1                0                0         1
N         0                1                0                0         1
N         0                1                0                0         1
N         0                1                0                0         1

   CalleesN  CalleesU  CalleesCalleesT  CalleesCalleesN  CalleesCalleesU
T         0         0                0                1               -1
N         0         0                0                1               -1
N         0         0                0                1               -1
N         0         0                0                1               -1
N         0         0                0                1               -1
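The shift can be reproduced on a tiny inline sample. The sketch below is only an illustration: the three-column layout and its values are made up, and the only thing it shares with the question's file is the trailing comma at the end of each data row.

```python
import io
import pandas as pd

# Hypothetical sample: each data row ends with a trailing comma, so it has
# one more field than the header (same layout problem as in the question).
sample = (
    "gold,Requirement,MethodType\n"
    "T,1,Inner,\n"
    "N,5,Isolated,\n"
)

shifted = pd.read_csv(io.StringIO(sample))

# The gold labels end up in the index, and every header name sits one
# column to the left of its data.
print(shifted.index.tolist())
print(shifted["Requirement"].tolist())

# Columns that are still non-numeric and would make the scaler fail.
non_numeric = shifted.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)
```

Listing the non-numeric columns like this before scaling is a quick way to see which column still holds strings, without reading the whole head() output.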

Solution:

dataset = pd.read_csv( 'dataExtended2.txt', sep= ',', index_col=False)

The output will then be:

  gold  Program  Requirement  MethodType  Top  Side  CallersT  CallersN  \
0     1        0            1           0    2     1         1         0
1     0        0            2           0    0     0         0         1
2     0        0            3           0    0     0         0         1
3     0        0            4           3    0     0         0         1
4     0        0            5           0    0     0         0         1

   CallersU  CallersCallersT  CallersCallersN  CallersCallersU  CalleesT  \
0         0                1                0                0         1
1         0                0                1                0         0
2         0                0                1                0         0
3         0                0                1                0         0
4         0                0                1                0         0

   CalleesN  CalleesU  CalleesCalleesT  CalleesCalleesN  CalleesCalleesU
0         0         0                0                0                1
1         1         0                0                0                1
2         1         0                0                0                1
3         1         0                0                0                1
4         1         0                0                0                1

More details are in the pandas read_csv documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
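As a rough end-to-end check, the sketch below reads a made-up sample with index_col=False, encodes every non-numeric column in a loop instead of listing the columns one by one, and then performs the same float conversion that failed in the traceback. The sample data and the loop are assumptions for illustration, not part of the original answer.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical sample with the same trailing-comma layout as the question's file.
sample = (
    "gold,Requirement,MethodType,CallersT\n"
    "T,1,Inner,Low,\n"
    "N,5,Isolated,-1,\n"
    "N,4,Root,-1,\n"
)

# index_col=False keeps 'gold' as a regular column and drops the empty
# trailing field.
dataset = pd.read_csv(io.StringIO(sample), index_col=False)

# Encode every column that is not numeric yet as integer category codes.
for col in dataset.select_dtypes(exclude="number").columns:
    dataset[col] = dataset[col].astype("category").cat.codes

X = dataset.iloc[:, 1:].to_numpy()
y = dataset.iloc[:, 0].to_numpy()

# The kind of conversion that raised the ValueError now succeeds.
X_float = np.asarray(X, dtype=float)
print(dataset.dtypes)
print(X_float.shape)
```

Note that category codes impose an arbitrary ordering on the labels (alphabetical here), which is the same behavior as the per-column calls in the question.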

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/62559766/
