Python ValueError: ColumnTransformer, column ordering is not equal

Reposted. Author: 行者123. Updated: 2023-12-05 02:55:47

I put together the following function, which reads a CSV, trains a model, and predicts on request data.

I get the following ValueError: Column ordering must be equal for fit and for transform when using the remainder keyword

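The training data and the data used for prediction have exactly the same number of columns, e.g. 15. I am not sure how the "ordering" of the columns could have changed.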

~/.local/lib/python3.5/site-packages/sklearn/pipeline.py in predict(self, X, **predict_params)
417 Xt = X
418 for _, name, transform in self._iter(with_final=False):
--> 419 Xt = transform.transform(Xt)
420 return self.steps[-1][-1].predict(Xt, **predict_params)
421

~/.local/lib/python3.5/site-packages/sklearn/compose/_column_transformer.py in transform(self, X)
581 if (n_cols_transform >= n_cols_fit and
582 any(X.columns[:n_cols_fit] != self._df_columns)):
--> 583 raise ValueError('Column ordering must be equal for fit '
584 'and for transform when using the '
585 'remainder keyword')

ValueError: Column ordering must be equal for fit and for transform when using the remainder keyword

The function:

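import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Numeric columns: fill missing values with the median, then standardize
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])

# Categorical columns: fill missing values with a constant, then one-hot encode
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

# Putting data transformation and the model in a pipeline
rf = Pipeline(steps=[('preprocessor', preprocessor),
                     ('classifier', RandomForestClassifier(
                         n_estimators=500,
                         criterion="gini",
                         max_features="sqrt",
                         min_samples_leaf=4))])

rf.fit(X_train, y_train)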

# The lines below run inside the request-handling function (its definition is not shown)
request_data = {'A': [request.A],
                'B': [request.B],
                'C': [request.C],
                'D': [request.D],
                'E': [request.E],
                'F': [request.F],
                'G': [request.G],
                'H': [request.H],
                'I': [request.I],
                'J': [request.J],
                'K': [request.K],
                'L': [request.L],
                'M': [request.M],
                'N': [request.N],
                'O': [request.O]}

df_resp = pd.DataFrame(data=request_data)
response = rf.predict(df_resp)

output = {"Safety Rating": response[0]}

return output

Best answer

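From the error message, I understand that X_train.columns != df_resp.columns, but .predict() requires them to be equal. Having the same number of columns is not enough: as the traceback shows, the check compares the column names position by position. One possible cause is that the traceback comes from Python 3.5, where a plain dict does not guarantee insertion order, so a DataFrame built from request_data may not keep the order in which the keys were written.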
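A quick way to confirm where the two frames differ is to compare their column lists directly. The sketch below uses small made-up stand-ins for X_train and df_resp (the real ones have 15 columns), and describe_column_mismatch is a hypothetical helper name, not part of pandas or scikit-learn.

```python
import pandas as pd

# Hypothetical stand-ins: same columns, different order
X_train = pd.DataFrame([[1.0, "x", 0.5], [2.0, "y", 0.7]], columns=["A", "B", "C"])
df_resp = pd.DataFrame([[0.9, 3.0, "x"]], columns=["C", "A", "B"])


def describe_column_mismatch(fit_columns, new_columns):
    """Report how new_columns differs from the columns seen at fit time."""
    fit_columns = list(fit_columns)
    new_columns = list(new_columns)
    return {
        # Columns the model was fitted on but the new frame lacks
        "missing": [c for c in fit_columns if c not in new_columns],
        # Columns in the new frame that were not present at fit time
        "unexpected": [c for c in new_columns if c not in fit_columns],
        # True only if names match position by position
        "same_order": fit_columns == new_columns,
    }


report = describe_column_mismatch(X_train.columns, df_resp.columns)
print(report)
```

If "missing" and "unexpected" are both empty but "same_order" is False, the frames hold the same columns and only the ordering needs fixing.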

To enforce this equality, you can pass X_train's column list as an argument when creating the DataFrame:

pd.DataFrame(data=request_data, columns=X_train.columns)
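One caveat with this approach: when `columns` names a key that is absent from the dict, pandas fills that column with NaN rather than raising. A minimal sketch of a stricter variant follows, assuming you would rather fail early. build_request_frame is a hypothetical helper, and the three-column data is made up for illustration.

```python
import pandas as pd


def build_request_frame(request_data, fit_columns):
    """Build a DataFrame whose columns follow fit_columns exactly.

    Raises KeyError if the request lacks a column used at fit time,
    instead of letting pandas silently create an all-NaN column.
    """
    fit_columns = list(fit_columns)
    missing = [c for c in fit_columns if c not in request_data]
    if missing:
        raise KeyError("request is missing columns: {}".format(missing))
    # `columns=` both selects and orders the dict keys
    return pd.DataFrame(data=request_data, columns=fit_columns)


# Hypothetical example: keys arrive in a different order than at fit time
fit_columns = ["A", "B", "C"]  # stands in for X_train.columns
request_data = {"C": [0.9], "A": [3.0], "B": ["x"]}

df_resp = build_request_frame(request_data, fit_columns)
print(list(df_resp.columns))
```

In the question's code, the call would presumably become build_request_frame(request_data, X_train.columns) before rf.predict(df_resp).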

Regarding "Python ValueError: ColumnTransformer, column ordering is not equal", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61001934/
