
python - VotingClassifier with pipelines as estimators

Reposted. Author: 行者123. Updated: 2023-12-04 11:23:55

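I want to build an sklearn VotingClassifier ensemble out of several different models (a decision tree, an SVC and a Keras network). Each of them needs a different kind of data preprocessing, which is why I made a pipeline for each of them.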

# Define pipelines

# DTC pipeline
featuriser = Featuriser()
dtc = DecisionTreeClassifier()
dtc_pipe = Pipeline([('featuriser',featuriser),('dtc',dtc)])

# SVC pipeline
scaler = TimeSeriesScalerMeanVariance(kind='constant')
flattener = Flattener()
svc = SVC(C = 100, gamma = 0.001, kernel='rbf')
svc_pipe = Pipeline([('scaler', scaler),('flattener', flattener), ('svc', svc)])

# Keras pipeline
cnn = KerasClassifier(build_fn=get_model())
cnn_pipe = Pipeline([('scaler',scaler),('cnn',cnn)])

# Make an ensemble
ensemble = VotingClassifier(estimators=[('dtc', dtc_pipe),
                                        ('svc', svc_pipe),
                                        ('cnn', cnn_pipe)],
                            voting='hard')

The Featuriser, TimeSeriesScalerMeanVariance and Flattener classes are custom-made transformers that all implement the fit, transform and fit_transform methods.

When I call ensemble.fit(X, y) to fit the whole ensemble, I get this error message:

ValueError: The estimator list should be a classifier.



I can understand that, because the individual estimators are not classifiers as such but pipelines. Is there a way to make it work anyway?

Best answer

The problem is with KerasClassifier. It does not provide _estimator_type, which is checked in _validate_estimator.

This is not a problem with using pipelines. Pipeline exposes this information as a property.

So the quick fix is to set _estimator_type='classifier' on the KerasClassifier instance.
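To make the mechanism concrete, here is a minimal pure-Python sketch of the kind of check involved. It is not scikit-learn's actual code: MiniPipeline, TaggedClassifier, UntaggedWrapper and validate_classifiers are hypothetical stand-ins, and the only assumption carried over from the answer is that the check looks at an `_estimator_type` attribute and that a pipeline reports the type of its final step.

```python
class MiniPipeline:
    """Hypothetical stand-in for a pipeline: reports the type of its last step."""

    def __init__(self, steps):
        self.steps = steps

    @property
    def _estimator_type(self):
        # Delegate to the final step, as a pipeline ending in a classifier would.
        return self.steps[-1][1]._estimator_type


class TaggedClassifier:
    """Stand-in for an estimator that declares itself a classifier."""
    _estimator_type = "classifier"


class UntaggedWrapper:
    """Stand-in for a wrapper that never sets _estimator_type."""


def validate_classifiers(named_estimators):
    """Simplified check: every estimator must report the type 'classifier'."""
    for name, est in named_estimators:
        if getattr(est, "_estimator_type", None) != "classifier":
            raise ValueError(f"The estimator {name} should be a classifier.")


good = MiniPipeline([("step", TaggedClassifier())])
wrapper = UntaggedWrapper()
bad = MiniPipeline([("step", wrapper)])

# The pipeline wrapping the untagged estimator fails the check.
try:
    validate_classifiers([("good", good), ("bad", bad)])
    failed = False
except ValueError:
    failed = True

# Setting the attribute on the wrapped estimator is enough to pass.
wrapper._estimator_type = "classifier"
validate_classifiers([("good", good), ("bad", bad)])
```

In this sketch the pipeline itself never needs patching: once the final step carries the attribute, the pipeline reports it automatically.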

A reproducible example:

# Define pipelines
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.preprocessing import MinMaxScaler, Normalizer
from sklearn.ensemble import VotingClassifier
# Note: keras.wrappers.scikit_learn is only available in older Keras releases.
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.datasets import make_classification
from keras.layers import Dense
from keras.models import Sequential

X, y = make_classification()

# DTC pipeline
featuriser = MinMaxScaler()
dtc = DecisionTreeClassifier()
dtc_pipe = Pipeline([('featuriser', featuriser), ('dtc', dtc)])

# SVC pipeline
scaler = Normalizer()
svc = SVC(C=100, gamma=0.001, kernel='rbf')
svc_pipe = Pipeline([('scaler', scaler), ('svc', svc)])

# Keras pipeline
def get_model():
    # create model
    model = Sequential()
    model.add(Dense(10, input_dim=20, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model


cnn = KerasClassifier(build_fn=get_model)
cnn._estimator_type = "classifier"
cnn_pipe = Pipeline([('scaler', scaler), ('cnn', cnn)])


# Make an ensemble
ensemble = VotingClassifier(estimators=[('dtc', dtc_pipe),
                                        ('svc', svc_pipe),
                                        ('cnn', cnn_pipe)],
                            voting='hard')

ensemble.fit(X, y)
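Before fitting, it can help to check which estimators scikit-learn recognizes as classifiers. The sketch below uses sklearn.base.is_classifier for that and leaves Keras out, so it should run with scikit-learn alone. The scaler_only pipeline is a hypothetical negative case added for illustration, and the scalers are placeholders rather than the preprocessing from the question.

```python
from sklearn.base import is_classifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Two pipelines ending in a classifier, and one ending in a transformer.
candidates = {
    "dtc": Pipeline([("featuriser", MinMaxScaler()),
                     ("dtc", DecisionTreeClassifier())]),
    "svc": Pipeline([("scaler", MinMaxScaler()), ("svc", SVC())]),
    "scaler_only": Pipeline([("scaler", MinMaxScaler())]),  # hypothetical
}

# is_classifier reports whether scikit-learn treats the estimator as a classifier.
report = {name: is_classifier(est) for name, est in candidates.items()}

for name, ok in report.items():
    print(f"{name}: {'classifier' if ok else 'NOT a classifier'}")
```

Running the same check on cnn_pipe before and after setting cnn._estimator_type should show whether the quick fix took effect in a given environment.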

Regarding python - VotingClassifier with pipelines as estimators, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59897096/
